Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths

From: Ondrej Mosnacek <omosnace@redhat.com>
To: Paul Moore <paul@paul-moore.com>
Cc: Richard Guy Briggs <rgb@redhat.com>,
	Linux-Audit Mailing List <linux-audit@redhat.com>
Subject: Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
Date: Wed, 19 Sep 2018 13:01:41 +0200	[thread overview]
Message-ID: <CAFqZXNtZ1=_2ZvKcc17ZK2swZQPdmwOpa=BR6Tz4FkM5o3bHUQ@mail.gmail.com> (raw)
In-Reply-To: <CAHC9VhQuvqjErN42PiXzcwY=bYdeKTOaN65MrJ8Wx=+w-=KHAg@mail.gmail.com>

On Wed, Sep 19, 2018 at 3:35 AM Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Sep 13, 2018 at 10:13 AM Paul Moore <paul@paul-moore.com> wrote:
> > On Thu, Sep 13, 2018 at 9:58 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > > Paul, could you please answer this question so I can move forward? :)
> >
> > Yep, sorry for the delay ...
>
> I just went back over the original problem, your proposed fix, and all
> of the discussion in this thread.
>
> Sadly, I don't think the patch you have proposed is the right fix.
>
> As Steve has pointed out, the CWD path is the working directory from
> which the current process was executed.  I believe we should log the
> full path, or as complete a path as possible, in the nametype=CWD PATH
> records.  While the nametype=PARENT PATH records have a connection
> with some of the other PATH records (e.g. DELETE and CREATE), the
> nametype=PARENT PATH records are independent of the current working
> directory, although they sometimes may be the same; in the cases where
> they are the same, this is purely a coincidence and is due to
> operation being performed, not something that should be seen as a
> flaw.
>
> From what I can tell, there are issues involving the nametype=PARENT
> PATH records, especially when it comes to the *at() syscalls, but no
> issue where the nametype=CWD PATH records have been wrong, is that
> correct?

Sorry, but I think you are completely misunderstanding the problem...
I tried to explain it several times in different ways, but apparently
I'm still not doing it right...

First of all, there is no "nametype=CWD" PATH record. There is only
the classic CWD record that is associated to every syscall and I don't
touch that one at all. The information in the CWD record is perfectly
fine.

Let me try to demonstrate it with some more verbose examples. (TL;DR:
The info in the CWD record is correct, but it is being abused in
audit_log_name().)

EXAMPLE #1 (The non-edge case):
1. A userspace process calls rename("dir1/file1", "dir2/file2") with
CWD set to "/home/user".
2. The syscall causes four calls to __audit_inode(), which generate
four 'struct audit_names' objects with the following information
(maybe not in this specific order, but that doesn't matter):
.name = "dir1/file1", .type = AUDIT_TYPE_PARENT, .name_len = 5
.name = "dir1/file1", .type = AUDIT_TYPE_DELETE, .name_len = AUDIT_NAME_FULL
.name = "dir2/file2", .type = AUDIT_TYPE_PARENT, .name_len = 5
.name = "dir2/file2", .type = AUDIT_TYPE_CREATE, .name_len = AUDIT_NAME_FULL
3. At the end of the syscall, audit_log_name()  is called on each of
these objects and produces the following PATH records (simplifed):
nametype=PARENT name="dir1/"
nametype=DELETE name="dir1/file1"
nametype=PARENT name="dir2/"
nametype=CREATE name="dir2/file2"

Notice that for the PARENT objects the .name_len is truncated to only
the directory component.

EXAMPLE #2 (The single-path-component case):
1. A userspace process calls rename("file1", "file2") with CWD set to
"/home/user".
2. The 'struct audit_names' objects will now be:
.name = "file1", .type = AUDIT_TYPE_PARENT, .name_len = 0
.name = "file1", .type = AUDIT_TYPE_DELETE, .name_len = AUDIT_NAME_FULL
.name = "file2", .type = AUDIT_TYPE_PARENT, .name_len = 0
.name = "file2", .type = AUDIT_TYPE_CREATE, .name_len = AUDIT_NAME_FULL
3. At the end of the syscall, audit_log_name()  is called on each of
these objects and produces the following PATH records (simplifed):
nametype=PARENT name="/home/user"
nametype=DELETE name="file1"
nametype=PARENT name="/home/user"
nametype=CREATE name="file2"

Notice that in this case, the "clever" logic in audit_log_name()
wanted to avoid logging an empty path (name="") in the PARENT records,
so it instead put the CWD path in there ("/home/user"). In this case
this is perfectly valid (although could be a bit surprising that there
is suddenly a full path instead of a relative one), since the full
path of "file1" is actually "/home/user/file1".

EXAMPLE #3 (The non-edge renameat(2) case):
1. A userspace process calls the following syscalls (with CWD set to
"/home/user"):
int srcfd = open("/some/path1", O_DIRECTORY | O_PATH);
int dstfd = open("/another/path2", O_DIRECTORY | O_PATH);
renameat(srcfd, "dir1/file1", dstfd, "dir2/file2");
2. The 'name', 'type' and 'name_len' fields of the 'struct
audit_names' objects will now be exactly the same as in EXAMPLE #1:
.name = "dir1/file1", .type = AUDIT_TYPE_PARENT, .name_len = 5
.name = "dir1/file1", .type = AUDIT_TYPE_DELETE, .name_len = AUDIT_NAME_FULL
.name = "dir2/file2", .type = AUDIT_TYPE_PARENT, .name_len = 5
.name = "dir2/file2", .type = AUDIT_TYPE_CREATE, .name_len = AUDIT_NAME_FULL
3. The type and name information in the PATH records will be also
exactly the same:
nametype=PARENT name="dir1/"
nametype=DELETE name="dir1/file1"
nametype=PARENT name="dir2/"
nametype=CREATE name="dir2/file2"

Even in this case the information in the records is correct (although
there is no information telling us that "dir1/..." actually
corresponds to "/some/path1/dir1/..." and "dir2/..." actually
corresponds to "/another/path2/dir2/...").

So far so good, but now we are coming to...

EXAMPLE #4 (The single-path-component renameat(2) case):
1. A userspace process calls the following syscalls (with CWD set to
"/home/user"):
int srcfd = open("/some/path1", O_DIRECTORY | O_PATH);
int dstfd = open("/another/path2", O_DIRECTORY | O_PATH);
renameat(srcfd, "file1", dstfd, "file2");
2. The 'name', 'type' and 'name_len' fields of the 'struct
audit_names' objects will now be exactly the same as in EXAMPLE #2:
.name = "file1", .type = AUDIT_TYPE_PARENT, .name_len = 0
.name = "file1", .type = AUDIT_TYPE_DELETE, .name_len = AUDIT_NAME_FULL
.name = "file2", .type = AUDIT_TYPE_PARENT, .name_len = 0
.name = "file2", .type = AUDIT_TYPE_CREATE, .name_len = AUDIT_NAME_FULL
3. The type and name information in the PATH records will be also
exactly the same:
nametype=PARENT name="/home/user"
nametype=DELETE name="file1"
nametype=PARENT name="/home/user"
nametype=CREATE name="file2"

...and HERE is the problem. The parent of "file1" is not "/home/user",
it is "/some/path1", and the parent of "file2" is also not
"/home/user", it is "/another/path2".

The proposed fix simply changes the handling of the name_len == 0 case
to output "." instead of the CWD. This doesn't solve the wider problem
that we don't have the dirfd path information (GHAK #9), but it at
least makes the situation in EXAMPLE #4 *not worse* than in EXAMPLE #3
(i.e. it fixes the less demanding GHAK #95). As an additional minor
benefit it also brings a bit more consistency - with it the PATH
records will contain relative (resp. absolute) paths if and only if
the corresponding path given to the syscall was relative (resp.
absolute).

I hope this finally clears things up.

BTW, you still didn't answer my brief question from a few replies
above (quoted below with context):

> > > > >                 case 0:
> > > > >                         /* name was specified as a relative path and the
> > > > >                          * directory component is the cwd */
> > > > > -                       audit_log_d_path(ab, " name=", &context->pwd);
> > > > > +                       audit_log_untrustedstring(ab, ".");
> > > >
> > > > This isn't a comprehensive review, I only gave this a quick look, but
> > > > considering you are only logging "." above we can safely use
> > > > audit_log_string() and safe a few cycles.
> > >
> > > I used audit_log_untrustedstring() to maintain the current norm that
> > > the name= field would always contain a quoted string (either in
> > > double-quotes or hex-escaped). I don't know if such consistency is
> > > important for parsing in userspace (it should probably be robust
> > > enough to handle any format), but I wanted to be on the safe side.
> >
> > In this particular case there should be no visible difference in the
> > resulting record and audit_log_string() is going to have less overhead
> > in the kernel.
>
> OK, so should I just replace it with:
>
>     audit_log_string(ab, ".");
>
> or should I still escape the path manually:
>
>     audit_log_string(ab, "\".\"");
>
> ?

Thanks,
--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.