All of lore.kernel.org
 help / color / mirror / Atom feed
From: Richard Guy Briggs <rgb@redhat.com>
To: Paul Moore <paul@paul-moore.com>
Cc: linux-audit@redhat.com
Subject: Re: [RFC PATCH] audit-testsuite: improve our chances of losing records in lost_reset
Date: Fri, 14 Dec 2018 16:00:23 -0500	[thread overview]
Message-ID: <20181214210023.q2xmnt6ncz6dlhnn@madcap2.tricolour.ca> (raw)
In-Reply-To: <CAHC9VhQCxznE8PuwJRdLVEVTUsVaNuWwN3hQ83qxKbq990OkmQ@mail.gmail.com>

On 2018-12-14 15:35, Paul Moore wrote:
> On Fri, Dec 14, 2018 at 11:12 AM Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2018-12-14 10:53, Paul Moore wrote:
> > > On Thu, Dec 13, 2018 at 8:59 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> > > > On 2018-12-13 18:23, Paul Moore wrote:
> > > > > On Thu, Dec 13, 2018 at 6:17 PM Paul Moore <paul@paul-moore.com> wrote:
> 
> ...
> 
> > > As an aside, have you spent any time trying to debug that wrong PID
> > > problem?  While not strictly audit related, that seems like a pretty
> > > serious Bash bug.  Or maybe it's a problem with the test?  I vaguely
> > > remember a discussion between you and Ondrej and some confusion around
> > > which Bash variable to use to fetch PIDs, but I may be mistaken.
> >
> > I haven't spent much time trying to debug that bash PID increment issue,
> > but it perplexed me since it was the identical technique used in
> > multiple tests in the audit-testsuite that has never caused an issue
> > previously on any of the same set of test machines.  This was for the
> > missing mount umount2 hang bug test
> > https://github.com/linux-audit/audit-testsuite/pull/76
> 
> Ah, I think I see the problem.  Unless I'm mistaken, you are talking
> about the shell/Bash command where the tests print the PID using "echo
> $$" or similar, yes?  As you likely already know, in Bash (and likely
> other shells as well), $$ is the PID of the running Bash process; in
> the two places where I saw it used in the existing audit-testsuite $$
> is being used to reference the Bash instance itself or something it
> exec's (which ends up inheriting the PID).  It looks like you are
> using $$ as a way to capture the PID of a child process being spawned
> by the shell, yes?  This may explain why you sometimes get lucky and
> $$+1 works for the PID.

I was always getting lucky with $$+1, but understandably uncomfortable
with it since others weren't so fortunate.

In the code that's there, that process is backgrounded, but I'm fairly
certain I tested without that and carefully checked the options (-f and
-s) to ensure it wasn't daemonizing or multithreading.  I was pretty
careful to set up exactly the same conditions for running that process
as other tests use, but that looks like the first thing to check next
time I try it.  It wouldn't be the first time I've missed something
obvious.

> > > This brings up to the next step: how do we want to address this?
> > >
> > > Prior to the queue rework that started in v4.10 things were a bit
> > > simpler and it looks like we always registered a lost record
> > > independent of the "audit=?" setting on the kernel command line and
> > > the current queue backlog.  While this would have made this test
> > > easier, it could result in some over counting problems in the cases
> > > where an auditd instance came along and read the "lost" records from
> > > the queue.  I don't think reverting to this behavior is ideal.
> > >
> > > I'm also not certain that recording lost records in the *not*
> > > "audit=1" case is a good solution either.  In the case where the
> > > system is not running an audit daemon they are almost guaranteed to
> > > hit the backlog at some point and then their system will start spewing
> > > scary looking audit log lost messages; we would surely win lots of
> > > friends this way.
> > >
> > > We could move the "audit=1" check (really it's an audit_default check)
> > > into audit_log_lost() and use it to squelch the printk() and then call
> > > audit_log_lost in kauditd_hold_skb() if we can't queue the record.
> > > This should preserve an accurate lost record count (well, until it
> > > rolls over, but that's not a new concern), prevent unnecessary scary
> > > lost record messages, and ensure a consistent audit_log_lost()
> > > operation (all the other callers I didn't mention in this mail).
> > >
> > > Or the simplest option is to just ignore this and require that the
> > > audit-testsuite be run on a system booted with "audit=1" :)
> > >
> > > I'm currently leaning towards moving the "audit=1" check into
> > > audit_log_lost(), what do you guys think?
> >
> > I'll need some time to digest all this.  My first reaction is that
> > requiring "audit=1" for the audit-testsuite is not the right answer and
> > as we've discussed previously, it makes sense to run the entire
> > testsuite both with and without "audit=1".
> 
> Yes, requiring "audit=1" is not an answer.  I put it there mostly as
> an attempt at humor, hence the smiley at the end, but clearly my
> abilities as a comic are lacking ;)

Sorry, email is limited...  I read :) and ;) differently.

> Anyway, no rush on looking it over; this isn't a -stable fix in my
> eyes so it's not a candidate for merging until after the upcoming
> merge window.  I think it will make a lot more sense once you go back
> and look at the code, reading about the problem in email probably
> makes is sound more complicated than it is.  I'm fairly certain the
> right solution is to move the "audit=1" check into audit_log_lost(),
> but I sent this out in case you or anyone else could think of a better
> solution.  In the meantime I'll throw together a quick patch (it
> should be pretty small) and add it to my testing queue.

I'll look at the code and your patch carefully when I can.

> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

  reply	other threads:[~2018-12-14 21:00 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-13 23:17 [RFC PATCH] audit-testsuite: improve our chances of losing records in lost_reset Paul Moore
2018-12-13 23:23 ` Paul Moore
2018-12-14  1:59   ` Richard Guy Briggs
2018-12-14 15:53     ` Paul Moore
2018-12-14 16:12       ` Richard Guy Briggs
2018-12-14 20:35         ` Paul Moore
2018-12-14 21:00           ` Richard Guy Briggs [this message]
2018-12-15 17:35             ` Ondrej Mosnacek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181214210023.q2xmnt6ncz6dlhnn@madcap2.tricolour.ca \
    --to=rgb@redhat.com \
    --cc=linux-audit@redhat.com \
    --cc=paul@paul-moore.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.