[PATCH] audit: grab a reference to context->pwd when it's cached

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] audit: grab a reference to context->pwd when it's cached
@ 2012-08-16  1:13 Peter Moody
  2012-08-16  1:26 ` Alexander Viro
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-08-16  1:13 UTC (permalink / raw)
  To: linux-audit

On certain systems, in certain pathalogical cases, current's cwd can
be deleted while we're still processing a syscall. This should prevent
the system from evicting the inode while we're still referencing it.

This seems to fix the bug I reported here:
https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html

Signed-off-by: Peter Moody <pmoody@google.com>
---
 kernel/auditsc.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 4b96415..e86b8b9 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -2064,6 +2064,7 @@ void __audit_getname(const char *name)
 
 	if (!context->pwd.dentry)
 		get_fs_pwd(current->fs, &context->pwd);
+	path_get(&context->pwd);
 }
 
 /* audit_putname - intercept a putname request
@@ -2091,6 +2092,7 @@ void audit_putname(const char *name)
 				       n->name, n->name ?: "(null)");
 			}
 #endif
+		path_put(&context->pwd);
 		__putname(name);
 	}
 #if AUDIT_DEBUG
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-08-16  1:13 [PATCH] audit: grab a reference to context->pwd when it's cached Peter Moody
@ 2012-08-16  1:26 ` Alexander Viro
  2012-08-16  1:41   ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Viro @ 2012-08-16  1:26 UTC (permalink / raw)
  To: Peter Moody; +Cc: linux-audit

On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote:
> On certain systems, in certain pathalogical cases, current's cwd can
> be deleted while we're still processing a syscall. This should prevent
> the system from evicting the inode while we're still referencing it.
> 
> This seems to fix the bug I reported here:
> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html

Sigh...  The bug is real, but I really don't like the fix.  Among other
things, it's going to cause cacheline bouncing from hell and not everyone
runs with audit sensibly disabled...  Let me think for a while and see
if I can come up with something less unpleasant, OK?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-08-16  1:26 ` Alexander Viro
@ 2012-08-16  1:41   ` Peter Moody
  2012-08-20 17:19     ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-08-16  1:41 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-audit

On Wed, Aug 15, 2012 at 6:26 PM, Alexander Viro <aviro@redhat.com> wrote:
> On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote:
>> On certain systems, in certain pathalogical cases, current's cwd can
>> be deleted while we're still processing a syscall. This should prevent
>> the system from evicting the inode while we're still referencing it.
>>
>> This seems to fix the bug I reported here:
>> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html
>
> Sigh...  The bug is real, but I really don't like the fix.  Among other
> things, it's going to cause cacheline bouncing from hell and not everyone
> runs with audit sensibly disabled...  Let me think for a while and see
> if I can come up with something less unpleasant, OK?

Works for me. I have a set of machines that I can very easily test a fix on.

Cheers,
peter

-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-08-16  1:41   ` Peter Moody
@ 2012-08-20 17:19     ` Peter Moody
  2012-09-25 17:03       ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-08-20 17:19 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-audit

Hi Al,

Any word on a less unpleasant fix? Also, do you know if/how I could
measure the impact of the cacheline bouncing? I haven't been able to
notice anything anecdotally on the machine running this patch, but I'm
not pushing it incredibly hard.

Cheers,
peter

On Wed, Aug 15, 2012 at 6:41 PM, Peter Moody <pmoody@google.com> wrote:
> On Wed, Aug 15, 2012 at 6:26 PM, Alexander Viro <aviro@redhat.com> wrote:
>> On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote:
>>> On certain systems, in certain pathalogical cases, current's cwd can
>>> be deleted while we're still processing a syscall. This should prevent
>>> the system from evicting the inode while we're still referencing it.
>>>
>>> This seems to fix the bug I reported here:
>>> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html
>>
>> Sigh...  The bug is real, but I really don't like the fix.  Among other
>> things, it's going to cause cacheline bouncing from hell and not everyone
>> runs with audit sensibly disabled...  Let me think for a while and see
>> if I can come up with something less unpleasant, OK?
>
> Works for me. I have a set of machines that I can very easily test a fix on.
>
> Cheers,
> peter
>
> --
> Peter Moody      Google    1.650.253.7306
> Security Engineer  pgp:0xC3410038



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-08-20 17:19     ` Peter Moody
@ 2012-09-25 17:03       ` Peter Moody
  2012-09-26 13:50         ` Alexander Viro
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-09-25 17:03 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-audit

Hey folks,

following up on old patches, are there any comments on this? Did you
get around to finding a better way to fix this bug, Al?

Cheers,
peter

On Mon, Aug 20, 2012 at 10:19 AM, Peter Moody <pmoody@google.com> wrote:
> Hi Al,
>
> Any word on a less unpleasant fix? Also, do you know if/how I could
> measure the impact of the cacheline bouncing? I haven't been able to
> notice anything anecdotally on the machine running this patch, but I'm
> not pushing it incredibly hard.
>
> Cheers,
> peter
>
> On Wed, Aug 15, 2012 at 6:41 PM, Peter Moody <pmoody@google.com> wrote:
>> On Wed, Aug 15, 2012 at 6:26 PM, Alexander Viro <aviro@redhat.com> wrote:
>>> On Wed, Aug 15, 2012 at 06:13:33PM -0700, Peter Moody wrote:
>>>> On certain systems, in certain pathalogical cases, current's cwd can
>>>> be deleted while we're still processing a syscall. This should prevent
>>>> the system from evicting the inode while we're still referencing it.
>>>>
>>>> This seems to fix the bug I reported here:
>>>> https://www.redhat.com/archives/linux-audit/2012-August/msg00017.html
>>>
>>> Sigh...  The bug is real, but I really don't like the fix.  Among other
>>> things, it's going to cause cacheline bouncing from hell and not everyone
>>> runs with audit sensibly disabled...  Let me think for a while and see
>>> if I can come up with something less unpleasant, OK?
>>
>> Works for me. I have a set of machines that I can very easily test a fix on.
>>
>> Cheers,
>> peter
>>
>> --
>> Peter Moody      Google    1.650.253.7306
>> Security Engineer  pgp:0xC3410038
>
>
>
> --
> Peter Moody      Google    1.650.253.7306
> Security Engineer  pgp:0xC3410038



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-09-25 17:03       ` Peter Moody
@ 2012-09-26 13:50         ` Alexander Viro
  2012-09-26 18:20           ` Peter Moody
  2012-10-04 18:48           ` Peter Moody
  0 siblings, 2 replies; 26+ messages in thread
From: Alexander Viro @ 2012-09-26 13:50 UTC (permalink / raw)
  To: Peter Moody; +Cc: linux-audit

On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
> Hey folks,
> 
> following up on old patches, are there any comments on this? Did you
> get around to finding a better way to fix this bug, Al?

Alas, I've found none ;-/  Looks like we'll have to go with this one,
at least until somebody comes up with better solution.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-09-26 13:50         ` Alexander Viro
@ 2012-09-26 18:20           ` Peter Moody
  2012-10-04 18:48           ` Peter Moody
  1 sibling, 0 replies; 26+ messages in thread
From: Peter Moody @ 2012-09-26 18:20 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-audit

On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>> Hey folks,
>>
>> following up on old patches, are there any comments on this? Did you
>> get around to finding a better way to fix this bug, Al?
>
> Alas, I've found none ;-/  Looks like we'll have to go with this one,
> at least until somebody comes up with better solution.

Did you want me to re-send this or is the patch from Aug 15 ack-able/apply-able?

Cheers,
peter

-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-09-26 13:50         ` Alexander Viro
  2012-09-26 18:20           ` Peter Moody
@ 2012-10-04 18:48           ` Peter Moody
  2012-10-05 11:02             ` Jeff Layton
  2012-10-05 12:55             ` Jeff Layton
  1 sibling, 2 replies; 26+ messages in thread
From: Peter Moody @ 2012-10-04 18:48 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-audit, jlayton

On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>> Hey folks,
>>
>> following up on old patches, are there any comments on this? Did you
>> get around to finding a better way to fix this bug, Al?
>
> Alas, I've found none ;-/  Looks like we'll have to go with this one,
> at least until somebody comes up with better solution.

Not surprisingly, this patch doesn't actually fix the issue (or at
least doesn't do it correctly).

I hadn't noticed that get_fs_pwd() actually calls path_get() on
&context->pwd so the additional path_get() is useless and the
reference doesn't ever actually get freed if audit_putname is called
while we're in a syscall.

Al, Eric, Jeff; do any of you guys have an understanding of what the
initial bug actually is since this clearly doesn't fix it?

Cheers,
peter

-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-04 18:48           ` Peter Moody
@ 2012-10-05 11:02             ` Jeff Layton
  2012-10-05 12:55             ` Jeff Layton
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2012-10-05 11:02 UTC (permalink / raw)
  To: Peter Moody; +Cc: linux-audit

On Thu, 4 Oct 2012 11:48:23 -0700
Peter Moody <pmoody@google.com> wrote:

> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
> >> Hey folks,
> >>
> >> following up on old patches, are there any comments on this? Did you
> >> get around to finding a better way to fix this bug, Al?
> >
> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
> > at least until somebody comes up with better solution.
> 
> Not surprisingly, this patch doesn't actually fix the issue (or at
> least doesn't do it correctly).
> 
> I hadn't noticed that get_fs_pwd() actually calls path_get() on
> &context->pwd so the additional path_get() is useless and the
> reference doesn't ever actually get freed if audit_putname is called
> while we're in a syscall.
> 
> Al, Eric, Jeff; do any of you guys have an understanding of what the
> initial bug actually is since this clearly doesn't fix it?
> 
> Cheers,
> peter
> 

Hrm...so yeah. The chdir call should have pinned the dentry. I think
you'll need to track down where the oops actually occurred.

One way to determine that is to do something like this, but you'll need
to replace the kmod in the instructions with your vmlinux:

    https://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Oopses

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-04 18:48           ` Peter Moody
  2012-10-05 11:02             ` Jeff Layton
@ 2012-10-05 12:55             ` Jeff Layton
  2012-10-05 13:57               ` Peter Moody
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2012-10-05 12:55 UTC (permalink / raw)
  To: Peter Moody; +Cc: linux-audit

On Thu, 4 Oct 2012 11:48:23 -0700
Peter Moody <pmoody@google.com> wrote:

> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
> >> Hey folks,
> >>
> >> following up on old patches, are there any comments on this? Did you
> >> get around to finding a better way to fix this bug, Al?
> >
> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
> > at least until somebody comes up with better solution.
> 
> Not surprisingly, this patch doesn't actually fix the issue (or at
> least doesn't do it correctly).
> 
> I hadn't noticed that get_fs_pwd() actually calls path_get() on
> &context->pwd so the additional path_get() is useless and the
> reference doesn't ever actually get freed if audit_putname is called
> while we're in a syscall.
> 
> Al, Eric, Jeff; do any of you guys have an understanding of what the
> initial bug actually is since this clearly doesn't fix it?
> 
> Cheers,
> peter
> 

BTW, I ran this test on one of my KVM guests and it ran just fine. That
one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
ext4 though, not ext3. So perhaps that's a factor?

The oops message you posted at least looks like something down in the
bowels of ext3 or fs/buffer.c.

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 12:55             ` Jeff Layton
@ 2012-10-05 13:57               ` Peter Moody
  2012-10-05 14:10                 ` Jeff Layton
  2012-10-05 14:26                 ` Jeff Layton
  0 siblings, 2 replies; 26+ messages in thread
From: Peter Moody @ 2012-10-05 13:57 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-audit

On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Thu, 4 Oct 2012 11:48:23 -0700
> Peter Moody <pmoody@google.com> wrote:
>
>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>> >> Hey folks,
>> >>
>> >> following up on old patches, are there any comments on this? Did you
>> >> get around to finding a better way to fix this bug, Al?
>> >
>> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
>> > at least until somebody comes up with better solution.
>>
>> Not surprisingly, this patch doesn't actually fix the issue (or at
>> least doesn't do it correctly).
>>
>> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>> &context->pwd so the additional path_get() is useless and the
>> reference doesn't ever actually get freed if audit_putname is called
>> while we're in a syscall.
>>
>> Al, Eric, Jeff; do any of you guys have an understanding of what the
>> initial bug actually is since this clearly doesn't fix it?
>>
>> Cheers,
>> peter
>>
>
> BTW, I ran this test on one of my KVM guests and it ran just fine. That
> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
> ext4 though, not ext3. So perhaps that's a factor?
>
> The oops message you posted at least looks like something down in the
> bowels of ext3 or fs/buffer.c.

Yeah, the only place this actually happens for me on these giant xen
instances we have (6 cores, 32G ram) and it happens on both ext3 and
ext4 filesystems and it happens with 100% reliability.

The actual oops is from:

static inline void check_irqs_on(void)
{
#ifdef irqs_disabled
        BUG_ON(irqs_disabled());
#endif
}

with the code path looking like:

__find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()

> --
> Jeff Layton <jlayton@redhat.com>



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 13:57               ` Peter Moody
@ 2012-10-05 14:10                 ` Jeff Layton
  2012-10-05 14:26                 ` Jeff Layton
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2012-10-05 14:10 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit

On Fri, 5 Oct 2012 06:57:59 -0700
Peter Moody <pmoody@google.com> wrote:

> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Thu, 4 Oct 2012 11:48:23 -0700
> > Peter Moody <pmoody@google.com> wrote:
> >
> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
> >> >> Hey folks,
> >> >>
> >> >> following up on old patches, are there any comments on this? Did you
> >> >> get around to finding a better way to fix this bug, Al?
> >> >
> >> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
> >> > at least until somebody comes up with better solution.
> >>
> >> Not surprisingly, this patch doesn't actually fix the issue (or at
> >> least doesn't do it correctly).
> >>
> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on
> >> &context->pwd so the additional path_get() is useless and the
> >> reference doesn't ever actually get freed if audit_putname is called
> >> while we're in a syscall.
> >>
> >> Al, Eric, Jeff; do any of you guys have an understanding of what the
> >> initial bug actually is since this clearly doesn't fix it?
> >>
> >> Cheers,
> >> peter
> >>
> >
> > BTW, I ran this test on one of my KVM guests and it ran just fine. That
> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
> > ext4 though, not ext3. So perhaps that's a factor?
> >
> > The oops message you posted at least looks like something down in the
> > bowels of ext3 or fs/buffer.c.
> 
> Yeah, the only place this actually happens for me on these giant xen
> instances we have (6 cores, 32G ram) and it happens on both ext3 and
> ext4 filesystems and it happens with 100% reliability.
> 
> The actual oops is from:
> 
> static inline void check_irqs_on(void)
> {
> #ifdef irqs_disabled
>         BUG_ON(irqs_disabled());
> #endif
> }
> 
> with the code path looking like:
> 
> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
> 

Weird -- I wonder what's disabling IRQs there?

Might be best to re-report this to linux-fsdevel and lkml so you can
reach a broader audience...

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 13:57               ` Peter Moody
  2012-10-05 14:10                 ` Jeff Layton
@ 2012-10-05 14:26                 ` Jeff Layton
  2012-10-05 15:16                   ` Peter Moody
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2012-10-05 14:26 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit, sandeen

On Fri, 5 Oct 2012 06:57:59 -0700
Peter Moody <pmoody@google.com> wrote:

> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Thu, 4 Oct 2012 11:48:23 -0700
> > Peter Moody <pmoody@google.com> wrote:
> >
> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
> >> >> Hey folks,
> >> >>
> >> >> following up on old patches, are there any comments on this? Did you
> >> >> get around to finding a better way to fix this bug, Al?
> >> >
> >> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
> >> > at least until somebody comes up with better solution.
> >>
> >> Not surprisingly, this patch doesn't actually fix the issue (or at
> >> least doesn't do it correctly).
> >>
> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on
> >> &context->pwd so the additional path_get() is useless and the
> >> reference doesn't ever actually get freed if audit_putname is called
> >> while we're in a syscall.
> >>
> >> Al, Eric, Jeff; do any of you guys have an understanding of what the
> >> initial bug actually is since this clearly doesn't fix it?
> >>
> >> Cheers,
> >> peter
> >>
> >
> > BTW, I ran this test on one of my KVM guests and it ran just fine. That
> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
> > ext4 though, not ext3. So perhaps that's a factor?
> >
> > The oops message you posted at least looks like something down in the
> > bowels of ext3 or fs/buffer.c.
> 
> Yeah, the only place this actually happens for me on these giant xen
> instances we have (6 cores, 32G ram) and it happens on both ext3 and
> ext4 filesystems and it happens with 100% reliability.
> 
> The actual oops is from:
> 
> static inline void check_irqs_on(void)
> {
> #ifdef irqs_disabled
>         BUG_ON(irqs_disabled());
> #endif
> }
> 
> with the code path looking like:
> 
> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
> 

Do you have a backtrace from a more recent kernel? I wonder if
something in the syscall exit codepath is disabling IRQs here?
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 14:26                 ` Jeff Layton
@ 2012-10-05 15:16                   ` Peter Moody
  2012-10-05 15:18                     ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-10-05 15:16 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Fri, 5 Oct 2012 06:57:59 -0700
> Peter Moody <pmoody@google.com> wrote:
>
>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
>> > On Thu, 4 Oct 2012 11:48:23 -0700
>> > Peter Moody <pmoody@google.com> wrote:
>> >
>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>> >> >> Hey folks,
>> >> >>
>> >> >> following up on old patches, are there any comments on this? Did you
>> >> >> get around to finding a better way to fix this bug, Al?
>> >> >
>> >> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
>> >> > at least until somebody comes up with better solution.
>> >>
>> >> Not surprisingly, this patch doesn't actually fix the issue (or at
>> >> least doesn't do it correctly).
>> >>
>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>> >> &context->pwd so the additional path_get() is useless and the
>> >> reference doesn't ever actually get freed if audit_putname is called
>> >> while we're in a syscall.
>> >>
>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the
>> >> initial bug actually is since this clearly doesn't fix it?
>> >>
>> >> Cheers,
>> >> peter
>> >>
>> >
>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That
>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
>> > ext4 though, not ext3. So perhaps that's a factor?
>> >
>> > The oops message you posted at least looks like something down in the
>> > bowels of ext3 or fs/buffer.c.
>>
>> Yeah, the only place this actually happens for me on these giant xen
>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
>> ext4 filesystems and it happens with 100% reliability.
>>
>> The actual oops is from:
>>
>> static inline void check_irqs_on(void)
>> {
>> #ifdef irqs_disabled
>>         BUG_ON(irqs_disabled());
>> #endif
>> }
>>
>> with the code path looking like:
>>
>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
>>
>
> Do you have a backtrace from a more recent kernel? I wonder if
> something in the syscall exit codepath is disabling IRQs here?

is 3.6.0-rc1 recent enough or do you want something newer?

> --
> Jeff Layton <jlayton@redhat.com>



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 15:16                   ` Peter Moody
@ 2012-10-05 15:18                     ` Peter Moody
  2012-10-05 15:57                       ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-10-05 15:18 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote:
> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
>> On Fri, 5 Oct 2012 06:57:59 -0700
>> Peter Moody <pmoody@google.com> wrote:
>>
>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>> > On Thu, 4 Oct 2012 11:48:23 -0700
>>> > Peter Moody <pmoody@google.com> wrote:
>>> >
>>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>>> >> >> Hey folks,
>>> >> >>
>>> >> >> following up on old patches, are there any comments on this? Did you
>>> >> >> get around to finding a better way to fix this bug, Al?
>>> >> >
>>> >> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
>>> >> > at least until somebody comes up with better solution.
>>> >>
>>> >> Not surprisingly, this patch doesn't actually fix the issue (or at
>>> >> least doesn't do it correctly).
>>> >>
>>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>>> >> &context->pwd so the additional path_get() is useless and the
>>> >> reference doesn't ever actually get freed if audit_putname is called
>>> >> while we're in a syscall.
>>> >>
>>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the
>>> >> initial bug actually is since this clearly doesn't fix it?
>>> >>
>>> >> Cheers,
>>> >> peter
>>> >>
>>> >
>>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That
>>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
>>> > ext4 though, not ext3. So perhaps that's a factor?
>>> >
>>> > The oops message you posted at least looks like something down in the
>>> > bowels of ext3 or fs/buffer.c.
>>>
>>> Yeah, the only place this actually happens for me on these giant xen
>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
>>> ext4 filesystems and it happens with 100% reliability.
>>>
>>> The actual oops is from:
>>>
>>> static inline void check_irqs_on(void)
>>> {
>>> #ifdef irqs_disabled
>>>         BUG_ON(irqs_disabled());
>>> #endif
>>> }
>>>
>>> with the code path looking like:
>>>
>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
>>>
>>
>> Do you have a backtrace from a more recent kernel? I wonder if
>> something in the syscall exit codepath is disabling IRQs here?
>
> is 3.6.0-rc1 recent enough or do you want something newer?

nevermind, that doesn't boot. One sec.

>> --
>> Jeff Layton <jlayton@redhat.com>
>
>
>
> --
> Peter Moody      Google    1.650.253.7306
> Security Engineer  pgp:0xC3410038



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 15:18                     ` Peter Moody
@ 2012-10-05 15:57                       ` Peter Moody
  2012-10-05 16:15                         ` Jeff Layton
  2012-10-05 20:23                         ` Eric Sandeen
  0 siblings, 2 replies; 26+ messages in thread
From: Peter Moody @ 2012-10-05 15:57 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote:
> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote:
>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>> On Fri, 5 Oct 2012 06:57:59 -0700
>>> Peter Moody <pmoody@google.com> wrote:
>>>
>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>> > On Thu, 4 Oct 2012 11:48:23 -0700
>>>> > Peter Moody <pmoody@google.com> wrote:
>>>> >
>>>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>>>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>>>> >> >> Hey folks,
>>>> >> >>
>>>> >> >> following up on old patches, are there any comments on this? Did you
>>>> >> >> get around to finding a better way to fix this bug, Al?
>>>> >> >
>>>> >> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
>>>> >> > at least until somebody comes up with better solution.
>>>> >>
>>>> >> Not surprisingly, this patch doesn't actually fix the issue (or at
>>>> >> least doesn't do it correctly).
>>>> >>
>>>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>>>> >> &context->pwd so the additional path_get() is useless and the
>>>> >> reference doesn't ever actually get freed if audit_putname is called
>>>> >> while we're in a syscall.
>>>> >>
>>>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the
>>>> >> initial bug actually is since this clearly doesn't fix it?
>>>> >>
>>>> >> Cheers,
>>>> >> peter
>>>> >>
>>>> >
>>>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That
>>>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
>>>> > ext4 though, not ext3. So perhaps that's a factor?
>>>> >
>>>> > The oops message you posted at least looks like something down in the
>>>> > bowels of ext3 or fs/buffer.c.
>>>>
>>>> Yeah, the only place this actually happens for me on these giant xen
>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
>>>> ext4 filesystems and it happens with 100% reliability.
>>>>
>>>> The actual oops is from:
>>>>
>>>> static inline void check_irqs_on(void)
>>>> {
>>>> #ifdef irqs_disabled
>>>>         BUG_ON(irqs_disabled());
>>>> #endif
>>>> }
>>>>
>>>> with the code path looking like:
>>>>
>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
>>>>
>>>
>>> Do you have a backtrace from a more recent kernel? I wonder if
>>> something in the syscall exit codepath is disabling IRQs here?
>>
>> is 3.6.0-rc1 recent enough or do you want something newer?
>
> nevermind, that doesn't boot. One sec.


here's 3.5.0

 ------------[ cut here ]------------
kernel BUG at fs/buffer.c:1220!
invalid opcode: 0000 [#1] SMP
CPU 0
Pid: 3683, comm: a.out Not tainted 3.5.0 #3
RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
check_irqs_on.part.8+0x4/0x6
RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
Stack:
 ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
 ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
 ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
Call Trace:
 [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
 [<ffffffff816af231>] ? down_read+0x11/0x30
 [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
 [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
 [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
 [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
 [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
 [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
 [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
 [<ffffffff81153b9a>] evict+0xaa/0x1b0
 [<ffffffff81154843>] iput+0x103/0x210
 [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
 [<ffffffff811505ec>] dput+0x12c/0x250
 [<ffffffff81146275>] path_put+0x15/0x30
 [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
 [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
0c 80 43 5c 01 48 89 df e8 d5
6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
e5 0f 0b 55 48 89 e5 41 54 53
RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
 RSP <ffff8807b156dc28>
---[ end trace 8d09f8cfbb601c14 ]---


>>> --
>>> Jeff Layton <jlayton@redhat.com>
>>
>>
>>
>> --
>> Peter Moody      Google    1.650.253.7306
>> Security Engineer  pgp:0xC3410038
>
>
>
> --
> Peter Moody      Google    1.650.253.7306
> Security Engineer  pgp:0xC3410038



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 15:57                       ` Peter Moody
@ 2012-10-05 16:15                         ` Jeff Layton
  2012-10-05 17:04                           ` Peter Moody
  2012-10-05 20:23                         ` Eric Sandeen
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2012-10-05 16:15 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit, sandeen

On Fri, 5 Oct 2012 08:57:13 -0700
Peter Moody <pmoody@google.com> wrote:

> On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote:
> > On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote:
> >> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
> >>> On Fri, 5 Oct 2012 06:57:59 -0700
> >>> Peter Moody <pmoody@google.com> wrote:
> >>>
> >>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
> >>>> > On Thu, 4 Oct 2012 11:48:23 -0700
> >>>> > Peter Moody <pmoody@google.com> wrote:
> >>>> >
> >>>> >> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
> >>>> >> > On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
> >>>> >> >> Hey folks,
> >>>> >> >>
> >>>> >> >> following up on old patches, are there any comments on this? Did you
> >>>> >> >> get around to finding a better way to fix this bug, Al?
> >>>> >> >
> >>>> >> > Alas, I've found none ;-/  Looks like we'll have to go with this one,
> >>>> >> > at least until somebody comes up with better solution.
> >>>> >>
> >>>> >> Not surprisingly, this patch doesn't actually fix the issue (or at
> >>>> >> least doesn't do it correctly).
> >>>> >>
> >>>> >> I hadn't noticed that get_fs_pwd() actually calls path_get() on
> >>>> >> &context->pwd so the additional path_get() is useless and the
> >>>> >> reference doesn't ever actually get freed if audit_putname is called
> >>>> >> while we're in a syscall.
> >>>> >>
> >>>> >> Al, Eric, Jeff; do any of you guys have an understanding of what the
> >>>> >> initial bug actually is since this clearly doesn't fix it?
> >>>> >>
> >>>> >> Cheers,
> >>>> >> peter
> >>>> >>
> >>>> >
> >>>> > BTW, I ran this test on one of my KVM guests and it ran just fine. That
> >>>> > one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
> >>>> > ext4 though, not ext3. So perhaps that's a factor?
> >>>> >
> >>>> > The oops message you posted at least looks like something down in the
> >>>> > bowels of ext3 or fs/buffer.c.
> >>>>
> >>>> Yeah, the only place this actually happens for me on these giant xen
> >>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
> >>>> ext4 filesystems and it happens with 100% reliability.
> >>>>
> >>>> The actual oops is from:
> >>>>
> >>>> static inline void check_irqs_on(void)
> >>>> {
> >>>> #ifdef irqs_disabled
> >>>>         BUG_ON(irqs_disabled());
> >>>> #endif
> >>>> }
> >>>>
> >>>> with the code path looking like:
> >>>>
> >>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
> >>>>
> >>>
> >>> Do you have a backtrace from a more recent kernel? I wonder if
> >>> something in the syscall exit codepath is disabling IRQs here?
> >>
> >> is 3.6.0-rc1 recent enough or do you want something newer?
> >
> > nevermind, that doesn't boot. One sec.
> 
> 
> here's 3.5.0
> 
>  ------------[ cut here ]------------
> kernel BUG at fs/buffer.c:1220!
> invalid opcode: 0000 [#1] SMP
> CPU 0
> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
> check_irqs_on.part.8+0x4/0x6
> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
> Stack:
>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
> Call Trace:
>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>  [<ffffffff816af231>] ? down_read+0x11/0x30
>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>  [<ffffffff81154843>] iput+0x103/0x210
>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>  [<ffffffff811505ec>] dput+0x12c/0x250
>  [<ffffffff81146275>] path_put+0x15/0x30
>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
> 0c 80 43 5c 01 48 89 df e8 d5
> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
> e5 0f 0b 55 48 89 e5 41 54 53
> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>  RSP <ffff8807b156dc28>
> ---[ end trace 8d09f8cfbb601c14 ]---
> 
> 

I don't see a function called sysexit_audit in 3.5. I assume that's
created via some sort of macro goop?

Could this be a xen specific problem? Might be interesting to add a
check for irqs being disabled early in __audit_syscall_exit and see
whether its doing that universally in its syscall exit routine?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 16:15                         ` Jeff Layton
@ 2012-10-05 17:04                           ` Peter Moody
  2012-10-05 19:41                             ` Peter Moody
  2012-10-08 14:10                             ` Jeff Layton
  0 siblings, 2 replies; 26+ messages in thread
From: Peter Moody @ 2012-10-05 17:04 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote:

>>
>>  ------------[ cut here ]------------
>> kernel BUG at fs/buffer.c:1220!
>> invalid opcode: 0000 [#1] SMP
>> CPU 0
>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
>> check_irqs_on.part.8+0x4/0x6
>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
>> Stack:
>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
>> Call Trace:
>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>>  [<ffffffff816af231>] ? down_read+0x11/0x30
>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>>  [<ffffffff81154843>] iput+0x103/0x210
>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>>  [<ffffffff811505ec>] dput+0x12c/0x250
>>  [<ffffffff81146275>] path_put+0x15/0x30
>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
>> 0c 80 43 5c 01 48 89 df e8 d5
>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
>> e5 0f 0b 55 48 89 e5 41 54 53
>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>>  RSP <ffff8807b156dc28>
>> ---[ end trace 8d09f8cfbb601c14 ]---
>>
>>
>
> I don't see a function called sysexit_audit in 3.5. I assume that's
> created via some sort of macro goop?

It looks like it's defined in arch/x86/kernel/entry_32.S

my asm is non-existent, but it looks like it's calling
__audit_syscall_exit and the disabling interrupts.

> Could this be a xen specific problem?

It could be. I asked the xen folks about this and they felt that it
was unlikely that the bug itself was their
(http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html,
specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html)

xen could definitely be doing something wonky with irqs though.

> Might be interesting to add a
> check for irqs being disabled early in __audit_syscall_exit and see
> whether its doing that universally in its syscall exit routine?

I'll try this, thanks.

> --
> Jeff Layton <jlayton@redhat.com>



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 17:04                           ` Peter Moody
@ 2012-10-05 19:41                             ` Peter Moody
  2012-10-08 12:23                               ` Jeff Layton
  2012-10-08 14:10                             ` Jeff Layton
  1 sibling, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-10-05 19:41 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote:
> On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote:
>
>>>
>>>  ------------[ cut here ]------------
>>> kernel BUG at fs/buffer.c:1220!
>>> invalid opcode: 0000 [#1] SMP
>>> CPU 0
>>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
>>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
>>> check_irqs_on.part.8+0x4/0x6
>>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
>>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
>>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
>>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
>>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
>>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
>>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
>>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
>>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
>>> Stack:
>>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
>>> Call Trace:
>>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>>>  [<ffffffff816af231>] ? down_read+0x11/0x30
>>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>>>  [<ffffffff81154843>] iput+0x103/0x210
>>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>>>  [<ffffffff811505ec>] dput+0x12c/0x250
>>>  [<ffffffff81146275>] path_put+0x15/0x30
>>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
>>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
>>> 0c 80 43 5c 01 48 89 df e8 d5
>>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
>>> e5 0f 0b 55 48 89 e5 41 54 53
>>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>>>  RSP <ffff8807b156dc28>
>>> ---[ end trace 8d09f8cfbb601c14 ]---
>>>
>>>
>>
>> I don't see a function called sysexit_audit in 3.5. I assume that's
>> created via some sort of macro goop?
>
> It looks like it's defined in arch/x86/kernel/entry_32.S
>
> my asm is non-existent, but it looks like it's calling
> __audit_syscall_exit and the disabling interrupts.
>
>> Could this be a xen specific problem?
>
> It could be. I asked the xen folks about this and they felt that it
> was unlikely that the bug itself was their
> (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html,
> specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html)
>
> xen could definitely be doing something wonky with irqs though.
>
>> Might be interesting to add a
>> check for irqs being disabled early in __audit_syscall_exit and see
>> whether its doing that universally in its syscall exit routine?
>
> I'll try this, thanks.

So, I've added WARN_ON(irqs_disabled()) in various spots in the
audit_syscall_exit path all the way up to the path_put in
audit_free_names() that ends up triggering this and prior to Oops'ing,
I don't see anything.

Post oops, I get lots of messages like:

------------[ cut here ]------------
WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
Call Trace:
 [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
 [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
 [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
 [<ffffffff81693483>] audit_free_names+0xb0/0xdb
 [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
 [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
---[ end trace ec7a15dbe40eaf1d ]---
------------[ cut here ]------------
WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
Call Trace:
 [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
 [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
 [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
 [<ffffffff81693483>] audit_free_names+0xb0/0xdb
 [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
 [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
---[ end trace ec7a15dbe40eaf1e ]---

where that python command is a python par file that may be doing some
funny unlinking as well. I haven't tracked it down.

thoughts? does this not look like a bug in audit; maybe something
somewhere else is disabling interrupts?

>> --
>> Jeff Layton <jlayton@redhat.com>
>
>
>
> --
> Peter Moody      Google    1.650.253.7306
> Security Engineer  pgp:0xC3410038



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 15:57                       ` Peter Moody
  2012-10-05 16:15                         ` Jeff Layton
@ 2012-10-05 20:23                         ` Eric Sandeen
  2012-10-07  3:22                           ` Peter Moody
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Sandeen @ 2012-10-05 20:23 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit, Jeff Layton

On 10/5/12 10:57 AM, Peter Moody wrote:
> On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote:
>> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote:
>>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>> On Fri, 5 Oct 2012 06:57:59 -0700
>>>> Peter Moody <pmoody@google.com> wrote:
>>>>
>>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>>>> On Thu, 4 Oct 2012 11:48:23 -0700
>>>>>> Peter Moody <pmoody@google.com> wrote:
>>>>>>
>>>>>>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>>>>>>>> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>>>>>>>>> Hey folks,
>>>>>>>>>
>>>>>>>>> following up on old patches, are there any comments on this? Did you
>>>>>>>>> get around to finding a better way to fix this bug, Al?
>>>>>>>>
>>>>>>>> Alas, I've found none ;-/  Looks like we'll have to go with this one,
>>>>>>>> at least until somebody comes up with better solution.
>>>>>>>
>>>>>>> Not surprisingly, this patch doesn't actually fix the issue (or at
>>>>>>> least doesn't do it correctly).
>>>>>>>
>>>>>>> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>>>>>>> &context->pwd so the additional path_get() is useless and the
>>>>>>> reference doesn't ever actually get freed if audit_putname is called
>>>>>>> while we're in a syscall.
>>>>>>>
>>>>>>> Al, Eric, Jeff; do any of you guys have an understanding of what the
>>>>>>> initial bug actually is since this clearly doesn't fix it?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> peter
>>>>>>>
>>>>>>
>>>>>> BTW, I ran this test on one of my KVM guests and it ran just fine. That
>>>>>> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
>>>>>> ext4 though, not ext3. So perhaps that's a factor?
>>>>>>
>>>>>> The oops message you posted at least looks like something down in the
>>>>>> bowels of ext3 or fs/buffer.c.
>>>>>
>>>>> Yeah, the only place this actually happens for me on these giant xen
>>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
>>>>> ext4 filesystems and it happens with 100% reliability.
>>>>>
>>>>> The actual oops is from:
>>>>>
>>>>> static inline void check_irqs_on(void)
>>>>> {
>>>>> #ifdef irqs_disabled
>>>>>         BUG_ON(irqs_disabled());
>>>>> #endif
>>>>> }
>>>>>
>>>>> with the code path looking like:
>>>>>
>>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
>>>>>
>>>>
>>>> Do you have a backtrace from a more recent kernel? I wonder if
>>>> something in the syscall exit codepath is disabling IRQs here?
>>>
>>> is 3.6.0-rc1 recent enough or do you want something newer?
>>
>> nevermind, that doesn't boot. One sec.
> 
> 
> here's 3.5.0
> 
>  ------------[ cut here ]------------
> kernel BUG at fs/buffer.c:1220!
> invalid opcode: 0000 [#1] SMP
> CPU 0
> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
> check_irqs_on.part.8+0x4/0x6
> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)

One thing that crossed my mind is that back in the 4k stacks days I
think sometimes a blown stack could corrupt threadinfo and we'd get
spurious warnings like this.

Is there any chance something like that has happened? (any stack
depth messages, etc?)

Maybe a crashdump on the BUG() for further poking around would
be in order.

-Eric 

> Stack:
>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
> Call Trace:
>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>  [<ffffffff816af231>] ? down_read+0x11/0x30
>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>  [<ffffffff81154843>] iput+0x103/0x210
>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>  [<ffffffff811505ec>] dput+0x12c/0x250
>  [<ffffffff81146275>] path_put+0x15/0x30
>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
> 0c 80 43 5c 01 48 89 df e8 d5
> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
> e5 0f 0b 55 48 89 e5 41 54 53
> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>  RSP <ffff8807b156dc28>
> ---[ end trace 8d09f8cfbb601c14 ]---
> 
> 
>>>> --
>>>> Jeff Layton <jlayton@redhat.com>
>>>
>>>
>>>
>>> --
>>> Peter Moody      Google    1.650.253.7306
>>> Security Engineer  pgp:0xC3410038
>>
>>
>>
>> --
>> Peter Moody      Google    1.650.253.7306
>> Security Engineer  pgp:0xC3410038
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 20:23                         ` Eric Sandeen
@ 2012-10-07  3:22                           ` Peter Moody
  2012-10-08 18:48                             ` Eric Sandeen
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-10-07  3:22 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: lczerner, linux-audit, Jeff Layton

On Fri, Oct 5, 2012 at 1:23 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> On 10/5/12 10:57 AM, Peter Moody wrote:
>> On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote:
>>> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote:
>>>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>>> On Fri, 5 Oct 2012 06:57:59 -0700
>>>>> Peter Moody <pmoody@google.com> wrote:
>>>>>
>>>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>>>>> On Thu, 4 Oct 2012 11:48:23 -0700
>>>>>>> Peter Moody <pmoody@google.com> wrote:
>>>>>>>
>>>>>>>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>>>>>>>>> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>>>>>>>>>> Hey folks,
>>>>>>>>>>
>>>>>>>>>> following up on old patches, are there any comments on this? Did you
>>>>>>>>>> get around to finding a better way to fix this bug, Al?
>>>>>>>>>
>>>>>>>>> Alas, I've found none ;-/  Looks like we'll have to go with this one,
>>>>>>>>> at least until somebody comes up with better solution.
>>>>>>>>
>>>>>>>> Not surprisingly, this patch doesn't actually fix the issue (or at
>>>>>>>> least doesn't do it correctly).
>>>>>>>>
>>>>>>>> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>>>>>>>> &context->pwd so the additional path_get() is useless and the
>>>>>>>> reference doesn't ever actually get freed if audit_putname is called
>>>>>>>> while we're in a syscall.
>>>>>>>>
>>>>>>>> Al, Eric, Jeff; do any of you guys have an understanding of what the
>>>>>>>> initial bug actually is since this clearly doesn't fix it?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> peter
>>>>>>>>
>>>>>>>
>>>>>>> BTW, I ran this test on one of my KVM guests and it ran just fine. That
>>>>>>> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
>>>>>>> ext4 though, not ext3. So perhaps that's a factor?
>>>>>>>
>>>>>>> The oops message you posted at least looks like something down in the
>>>>>>> bowels of ext3 or fs/buffer.c.
>>>>>>
>>>>>> Yeah, the only place this actually happens for me on these giant xen
>>>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
>>>>>> ext4 filesystems and it happens with 100% reliability.
>>>>>>
>>>>>> The actual oops is from:
>>>>>>
>>>>>> static inline void check_irqs_on(void)
>>>>>> {
>>>>>> #ifdef irqs_disabled
>>>>>>         BUG_ON(irqs_disabled());
>>>>>> #endif
>>>>>> }
>>>>>>
>>>>>> with the code path looking like:
>>>>>>
>>>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
>>>>>>
>>>>>
>>>>> Do you have a backtrace from a more recent kernel? I wonder if
>>>>> something in the syscall exit codepath is disabling IRQs here?
>>>>
>>>> is 3.6.0-rc1 recent enough or do you want something newer?
>>>
>>> nevermind, that doesn't boot. One sec.
>>
>>
>> here's 3.5.0
>>
>>  ------------[ cut here ]------------
>> kernel BUG at fs/buffer.c:1220!
>> invalid opcode: 0000 [#1] SMP
>> CPU 0
>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
>> check_irqs_on.part.8+0x4/0x6
>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
>
> One thing that crossed my mind is that back in the 4k stacks days I
> think sometimes a blown stack could corrupt threadinfo and we'd get
> spurious warnings like this.
>
> Is there any chance something like that has happened? (any stack
> depth messages, etc?)

I do'nt see any stack depth messages in there, no.

> Maybe a crashdump on the BUG() for further poking around would
> be in order.

when I add a painc() there, this is what I get. I'm not seeing
anything about stack depth, Is there something obvious I'm missing?

Kernel panic - not syncing: irqs disabled
Pid: 23011, comm: a.out Tainted: G        W    3.2.5-at18-ganetixenu #2
Call Trace:
 [<ffffffff81692b16>] panic+0x8c/0x1a3
 [<ffffffff8100926d>] ? xen_force_evtchn_callback+0xd/0x10
 [<ffffffff816960d6>] check_irqs_on.part.13+0x25/0x25
 [<ffffffff81161dc9>] __find_get_block+0x1f9/0x200
 [<ffffffff813c7a2f>] ? number.isra.2+0x31f/0x350
 [<ffffffff811c9d85>] ext3_clear_blocks+0x75/0x140
 [<ffffffff811c9f5c>] ext3_free_data+0x10c/0x150
 [<ffffffff811da961>] ? ext3_journal_start_sb+0x31/0x60
 [<ffffffff811ca635>] ext3_truncate+0x4a5/0x600
 [<ffffffff81232f38>] ? journal_start+0xb8/0x100
 [<ffffffff811ccf48>] ext3_evict_inode+0x228/0x2a0
 [<ffffffff8114bfb1>] evict+0xa1/0x1a0
 [<ffffffff8114cc61>] iput+0x101/0x210
 [<ffffffff81148040>] d_kill+0xf0/0x130
 [<ffffffff81148bd2>] dput+0xd2/0x1b0
 [<ffffffff8113eb85>] path_put+0x15/0x30
 [<ffffffff8169348f>] audit_free_names+0xbc/0xdb
 [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
 [<ffffffff8169fdaa>] sysexit_audit+0x21/0x5f

> -Eric
>
>> Stack:
>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
>> Call Trace:
>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>>  [<ffffffff816af231>] ? down_read+0x11/0x30
>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>>  [<ffffffff81154843>] iput+0x103/0x210
>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>>  [<ffffffff811505ec>] dput+0x12c/0x250
>>  [<ffffffff81146275>] path_put+0x15/0x30
>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
>> 0c 80 43 5c 01 48 89 df e8 d5
>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
>> e5 0f 0b 55 48 89 e5 41 54 53
>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>>  RSP <ffff8807b156dc28>
>> ---[ end trace 8d09f8cfbb601c14 ]---
>>
>>
>>>>> --
>>>>> Jeff Layton <jlayton@redhat.com>
>>>>
>>>>
>>>>
>>>> --
>>>> Peter Moody      Google    1.650.253.7306
>>>> Security Engineer  pgp:0xC3410038
>>>
>>>
>>>
>>> --
>>> Peter Moody      Google    1.650.253.7306
>>> Security Engineer  pgp:0xC3410038
>>
>>
>>
>



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 19:41                             ` Peter Moody
@ 2012-10-08 12:23                               ` Jeff Layton
  2012-10-08 15:22                                 ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Layton @ 2012-10-08 12:23 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit, sandeen

On Fri, 5 Oct 2012 12:41:46 -0700
Peter Moody <pmoody@google.com> wrote:

> On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote:
> > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote:
> >
> >>>
> >>>  ------------[ cut here ]------------
> >>> kernel BUG at fs/buffer.c:1220!
> >>> invalid opcode: 0000 [#1] SMP
> >>> CPU 0
> >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
> >>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
> >>> check_irqs_on.part.8+0x4/0x6
> >>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
> >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
> >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
> >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
> >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
> >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
> >>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
> >>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
> >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
> >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
> >>> Stack:
> >>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
> >>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
> >>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
> >>> Call Trace:
> >>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
> >>>  [<ffffffff816af231>] ? down_read+0x11/0x30
> >>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
> >>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
> >>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
> >>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
> >>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
> >>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
> >>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
> >>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
> >>>  [<ffffffff81154843>] iput+0x103/0x210
> >>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
> >>>  [<ffffffff811505ec>] dput+0x12c/0x250
> >>>  [<ffffffff81146275>] path_put+0x15/0x30
> >>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
> >>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
> >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
> >>> 0c 80 43 5c 01 48 89 df e8 d5
> >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
> >>> e5 0f 0b 55 48 89 e5 41 54 53
> >>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
> >>>  RSP <ffff8807b156dc28>
> >>> ---[ end trace 8d09f8cfbb601c14 ]---
> >>>
> >>>
> >>
> >> I don't see a function called sysexit_audit in 3.5. I assume that's
> >> created via some sort of macro goop?
> >
> > It looks like it's defined in arch/x86/kernel/entry_32.S
> >
> > my asm is non-existent, but it looks like it's calling
> > __audit_syscall_exit and the disabling interrupts.
> >
> >> Could this be a xen specific problem?
> >
> > It could be. I asked the xen folks about this and they felt that it
> > was unlikely that the bug itself was their
> > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html,
> > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html)
> >
> > xen could definitely be doing something wonky with irqs though.
> >
> >> Might be interesting to add a
> >> check for irqs being disabled early in __audit_syscall_exit and see
> >> whether its doing that universally in its syscall exit routine?
> >
> > I'll try this, thanks.
> 
> So, I've added WARN_ON(irqs_disabled()) in various spots in the
> audit_syscall_exit path all the way up to the path_put in
> audit_free_names() that ends up triggering this and prior to Oops'ing,
> I don't see anything.
> 
> Post oops, I get lots of messages like:
> 
> ------------[ cut here ]------------
> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
> Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
> Call Trace:
>  [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
>  [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
>  [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
>  [<ffffffff81693483>] audit_free_names+0xb0/0xdb
>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>  [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
> ---[ end trace ec7a15dbe40eaf1d ]---
> ------------[ cut here ]------------
> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
> Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
> Call Trace:
>  [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
>  [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
>  [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
>  [<ffffffff81693483>] audit_free_names+0xb0/0xdb
>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>  [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
> ---[ end trace ec7a15dbe40eaf1e ]---
> 
> where that python command is a python par file that may be doing some
> funny unlinking as well. I haven't tracked it down.
> 
> thoughts? does this not look like a bug in audit; maybe something
> somewhere else is disabling interrupts?
> 

I'm not sure whether it's a bug in audit or not. Tracking down what's
disabling interrupts here is probably the key though...

So I'm guessing the WARN_ON above was in audit_free_names. Was there a
similar WARN_ON in audit_syscall_exit that did not fire?
 
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-05 17:04                           ` Peter Moody
  2012-10-05 19:41                             ` Peter Moody
@ 2012-10-08 14:10                             ` Jeff Layton
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff Layton @ 2012-10-08 14:10 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit, sandeen

On Fri, 5 Oct 2012 10:04:42 -0700
Peter Moody <pmoody@google.com> wrote:

> On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote:
> 
> >>
> >>  ------------[ cut here ]------------
> >> kernel BUG at fs/buffer.c:1220!
> >> invalid opcode: 0000 [#1] SMP
> >> CPU 0
> >> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
> >> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
> >> check_irqs_on.part.8+0x4/0x6
> >> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
> >> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
> >> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
> >> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
> >> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
> >> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
> >> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
> >> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
> >> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
> >> Stack:
> >>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
> >>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
> >>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
> >> Call Trace:
> >>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
> >>  [<ffffffff816af231>] ? down_read+0x11/0x30
> >>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
> >>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
> >>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
> >>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
> >>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
> >>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
> >>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
> >>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
> >>  [<ffffffff81154843>] iput+0x103/0x210
> >>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
> >>  [<ffffffff811505ec>] dput+0x12c/0x250
> >>  [<ffffffff81146275>] path_put+0x15/0x30
> >>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
> >>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
> >> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
> >> 0c 80 43 5c 01 48 89 df e8 d5
> >> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
> >> e5 0f 0b 55 48 89 e5 41 54 53
> >> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
> >>  RSP <ffff8807b156dc28>
> >> ---[ end trace 8d09f8cfbb601c14 ]---
> >>
> >>
> >
> > I don't see a function called sysexit_audit in 3.5. I assume that's
> > created via some sort of macro goop?
> 
> It looks like it's defined in arch/x86/kernel/entry_32.S
> 
> my asm is non-existent, but it looks like it's calling
> __audit_syscall_exit and the disabling interrupts.
> 

BTW, this is not really my area of expertise either, but I think that's
the asm goop for "native" 32-bit x86. The asm for 32-bit binaries
running on a 64-bit kernel are in arch/x86/ia32/ia32entry.S.

>From what I can tell though, it looks like it should be calling
__audit_syscall_exit with interrupts enabled:

--------------[snip]-----------------
        .macro auditsys_exit exit
        testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT),TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
        jnz ia32_ret_from_sys_call
        TRACE_IRQS_ON
        sti
        movl %eax,%esi          /* second arg, syscall return value */
        cmpl $-MAX_ERRNO,%eax   /* is it an error ? */
        jbe 1f
        movslq %eax, %rsi       /* if error sign extend to 64 bits */
1:      setbe %al               /* 1 if error, 0 if not */
        movzbl %al,%edi         /* zero-extend that into %edi */
        call __audit_syscall_exit 
        movq RAX-ARGOFFSET(%rsp),%rax   /* reload syscall return value */
        movl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT),%edi
        cli
        TRACE_IRQS_OFF
        testl %edi,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
        jz \exit
        CLEAR_RREGS -ARGOFFSET
        jmp int_with_check
        .endm
--------------[snip]-----------------

AIUI, "sti" is what enables interrupts, and I don't see a call to
"cli" (which disables interrupts) until after __audit_syscall_exit.
Either something is reordering those instructions (which seems very
unlikely), or we're not looking at the right syscall code?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-08 12:23                               ` Jeff Layton
@ 2012-10-08 15:22                                 ` Peter Moody
  2012-10-08 16:45                                   ` Peter Moody
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Moody @ 2012-10-08 15:22 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Mon, Oct 8, 2012 at 5:23 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Fri, 5 Oct 2012 12:41:46 -0700
> Peter Moody <pmoody@google.com> wrote:
>
>> On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote:
>> > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote:
>> >
>> >>>
>> >>>  ------------[ cut here ]------------
>> >>> kernel BUG at fs/buffer.c:1220!
>> >>> invalid opcode: 0000 [#1] SMP
>> >>> CPU 0
>> >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
>> >>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
>> >>> check_irqs_on.part.8+0x4/0x6
>> >>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
>> >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
>> >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
>> >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
>> >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
>> >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
>> >>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
>> >>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
>> >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
>> >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
>> >>> Stack:
>> >>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>> >>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>> >>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
>> >>> Call Trace:
>> >>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>> >>>  [<ffffffff816af231>] ? down_read+0x11/0x30
>> >>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>> >>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>> >>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>> >>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>> >>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>> >>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>> >>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>> >>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>> >>>  [<ffffffff81154843>] iput+0x103/0x210
>> >>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>> >>>  [<ffffffff811505ec>] dput+0x12c/0x250
>> >>>  [<ffffffff81146275>] path_put+0x15/0x30
>> >>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>> >>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
>> >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
>> >>> 0c 80 43 5c 01 48 89 df e8 d5
>> >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
>> >>> e5 0f 0b 55 48 89 e5 41 54 53
>> >>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>> >>>  RSP <ffff8807b156dc28>
>> >>> ---[ end trace 8d09f8cfbb601c14 ]---
>> >>>
>> >>>
>> >>
>> >> I don't see a function called sysexit_audit in 3.5. I assume that's
>> >> created via some sort of macro goop?
>> >
>> > It looks like it's defined in arch/x86/kernel/entry_32.S
>> >
>> > my asm is non-existent, but it looks like it's calling
>> > __audit_syscall_exit and the disabling interrupts.
>> >
>> >> Could this be a xen specific problem?
>> >
>> > It could be. I asked the xen folks about this and they felt that it
>> > was unlikely that the bug itself was their
>> > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html,
>> > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html)
>> >
>> > xen could definitely be doing something wonky with irqs though.
>> >
>> >> Might be interesting to add a
>> >> check for irqs being disabled early in __audit_syscall_exit and see
>> >> whether its doing that universally in its syscall exit routine?
>> >
>> > I'll try this, thanks.
>>
>> So, I've added WARN_ON(irqs_disabled()) in various spots in the
>> audit_syscall_exit path all the way up to the path_put in
>> audit_free_names() that ends up triggering this and prior to Oops'ing,
>> I don't see anything.
>>
>> Post oops, I get lots of messages like:
>>
>> ------------[ cut here ]------------
>> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
>> Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
>> Call Trace:
>>  [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
>>  [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
>>  [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
>>  [<ffffffff81693483>] audit_free_names+0xb0/0xdb
>>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>>  [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
>> ---[ end trace ec7a15dbe40eaf1d ]---
>> ------------[ cut here ]------------
>> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
>> Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
>> Call Trace:
>>  [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
>>  [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
>>  [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
>>  [<ffffffff81693483>] audit_free_names+0xb0/0xdb
>>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>>  [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
>> ---[ end trace ec7a15dbe40eaf1e ]---
>>
>> where that python command is a python par file that may be doing some
>> funny unlinking as well. I haven't tracked it down.
>>
>> thoughts? does this not look like a bug in audit; maybe something
>> somewhere else is disabling interrupts?
>>
>
> I'm not sure whether it's a bug in audit or not. Tracking down what's
> disabling interrupts here is probably the key though...

Yeah. I'm trying to get a reproducer to work on a different version of Xen.

> So I'm guessing the WARN_ON above was in audit_free_names. Was there a
> similar WARN_ON in audit_syscall_exit that did not fire?

Correct. I booted a few kernels with the WARN_ON successively further
down the stack in audit land and it only started firing after BUG_ON()
in fs/buffer.c

Thanks for looking into this.

Cheers,
peter
-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-08 15:22                                 ` Peter Moody
@ 2012-10-08 16:45                                   ` Peter Moody
  0 siblings, 0 replies; 26+ messages in thread
From: Peter Moody @ 2012-10-08 16:45 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lczerner, linux-audit, sandeen

On Mon, Oct 8, 2012 at 8:22 AM, Peter Moody <pmoody@google.com> wrote:
> On Mon, Oct 8, 2012 at 5:23 AM, Jeff Layton <jlayton@redhat.com> wrote:
>> On Fri, 5 Oct 2012 12:41:46 -0700
>> Peter Moody <pmoody@google.com> wrote:
>>
>>> On Fri, Oct 5, 2012 at 10:04 AM, Peter Moody <pmoody@google.com> wrote:
>>> > On Fri, Oct 5, 2012 at 9:15 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>> >
>>> >>>
>>> >>>  ------------[ cut here ]------------
>>> >>> kernel BUG at fs/buffer.c:1220!
>>> >>> invalid opcode: 0000 [#1] SMP
>>> >>> CPU 0
>>> >>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
>>> >>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
>>> >>> check_irqs_on.part.8+0x4/0x6
>>> >>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
>>> >>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
>>> >>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
>>> >>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
>>> >>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
>>> >>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
>>> >>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
>>> >>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
>>> >>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
>>> >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> >>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
>>> >>> Stack:
>>> >>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>>> >>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>>> >>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
>>> >>> Call Trace:
>>> >>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>>> >>>  [<ffffffff816af231>] ? down_read+0x11/0x30
>>> >>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>>> >>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>>> >>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>>> >>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>>> >>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>>> >>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>>> >>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>>> >>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>>> >>>  [<ffffffff81154843>] iput+0x103/0x210
>>> >>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>>> >>>  [<ffffffff811505ec>] dput+0x12c/0x250
>>> >>>  [<ffffffff81146275>] path_put+0x15/0x30
>>> >>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>>> >>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
>>> >>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
>>> >>> 0c 80 43 5c 01 48 89 df e8 d5
>>> >>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
>>> >>> e5 0f 0b 55 48 89 e5 41 54 53
>>> >>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>>> >>>  RSP <ffff8807b156dc28>
>>> >>> ---[ end trace 8d09f8cfbb601c14 ]---
>>> >>>
>>> >>>
>>> >>
>>> >> I don't see a function called sysexit_audit in 3.5. I assume that's
>>> >> created via some sort of macro goop?
>>> >
>>> > It looks like it's defined in arch/x86/kernel/entry_32.S
>>> >
>>> > my asm is non-existent, but it looks like it's calling
>>> > __audit_syscall_exit and the disabling interrupts.
>>> >
>>> >> Could this be a xen specific problem?
>>> >
>>> > It could be. I asked the xen folks about this and they felt that it
>>> > was unlikely that the bug itself was their
>>> > (http://lists.xen.org/archives/html/xen-devel/2012-08/msg01052.html,
>>> > specifically http://lists.xen.org/archives/html/xen-devel/2012-08/msg01127.html)
>>> >
>>> > xen could definitely be doing something wonky with irqs though.
>>> >
>>> >> Might be interesting to add a
>>> >> check for irqs being disabled early in __audit_syscall_exit and see
>>> >> whether its doing that universally in its syscall exit routine?
>>> >
>>> > I'll try this, thanks.
>>>
>>> So, I've added WARN_ON(irqs_disabled()) in various spots in the
>>> audit_syscall_exit path all the way up to the path_put in
>>> audit_free_names() that ends up triggering this and prior to Oops'ing,
>>> I don't see anything.
>>>
>>> Post oops, I get lots of messages like:
>>>
>>> ------------[ cut here ]------------
>>> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
>>> Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
>>> Call Trace:
>>>  [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
>>>  [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
>>>  [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
>>>  [<ffffffff81693483>] audit_free_names+0xb0/0xdb
>>>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>>>  [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
>>> ---[ end trace ec7a15dbe40eaf1d ]---
>>> ------------[ cut here ]------------
>>> WARNING: at kernel/auditsc.c:1021 audit_free_names+0xb0/0xdb()
>>> Pid: 3981, comm: python2.6 Tainted: G      D W    3.2.5-at17-ganetixenu #31
>>> Call Trace:
>>>  [<ffffffff81693483>] ? audit_free_names+0xb0/0xdb
>>>  [<ffffffff8105f405>] warn_slowpath_common+0x75/0xb0
>>>  [<ffffffff8105f535>] warn_slowpath_null+0x15/0x20
>>>  [<ffffffff81693483>] audit_free_names+0xb0/0xdb
>>>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>>>  [<ffffffff8169fd6a>] sysexit_audit+0x21/0x5f
>>> ---[ end trace ec7a15dbe40eaf1e ]---
>>>
>>> where that python command is a python par file that may be doing some
>>> funny unlinking as well. I haven't tracked it down.
>>>
>>> thoughts? does this not look like a bug in audit; maybe something
>>> somewhere else is disabling interrupts?
>>>
>>
>> I'm not sure whether it's a bug in audit or not. Tracking down what's
>> disabling interrupts here is probably the key though...
>
> Yeah. I'm trying to get a reproducer to work on a different version of Xen.
>
>> So I'm guessing the WARN_ON above was in audit_free_names. Was there a
>> similar WARN_ON in audit_syscall_exit that did not fire?
>
> Correct. I booted a few kernels with the WARN_ON successively further
> down the stack in audit land and it only started firing after BUG_ON()
> in fs/buffer.c
>
> Thanks for looking into this.

So one more random data point: I can't repro this with ext2, but ext3
and ext4 both crash immediately.

off to bug fs-devel..

> Cheers,
> peter
> --
> Peter Moody      Google    1.650.253.7306
> Security Engineer  pgp:0xC3410038



-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] audit: grab a reference to context->pwd when it's cached
  2012-10-07  3:22                           ` Peter Moody
@ 2012-10-08 18:48                             ` Eric Sandeen
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Sandeen @ 2012-10-08 18:48 UTC (permalink / raw)
  To: Peter Moody; +Cc: lczerner, linux-audit, Jeff Layton

On 10/6/12 10:22 PM, Peter Moody wrote:
> On Fri, Oct 5, 2012 at 1:23 PM, Eric Sandeen <sandeen@redhat.com> wrote:
>> On 10/5/12 10:57 AM, Peter Moody wrote:
>>> On Fri, Oct 5, 2012 at 8:18 AM, Peter Moody <pmoody@google.com> wrote:
>>>> On Fri, Oct 5, 2012 at 8:16 AM, Peter Moody <pmoody@google.com> wrote:
>>>>> On Fri, Oct 5, 2012 at 7:26 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>>>> On Fri, 5 Oct 2012 06:57:59 -0700
>>>>>> Peter Moody <pmoody@google.com> wrote:
>>>>>>
>>>>>>> On Fri, Oct 5, 2012 at 5:55 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>>>>>>> On Thu, 4 Oct 2012 11:48:23 -0700
>>>>>>>> Peter Moody <pmoody@google.com> wrote:
>>>>>>>>
>>>>>>>>> On Wed, Sep 26, 2012 at 6:50 AM, Alexander Viro <aviro@redhat.com> wrote:
>>>>>>>>>> On Tue, Sep 25, 2012 at 10:03:23AM -0700, Peter Moody wrote:
>>>>>>>>>>> Hey folks,
>>>>>>>>>>>
>>>>>>>>>>> following up on old patches, are there any comments on this? Did you
>>>>>>>>>>> get around to finding a better way to fix this bug, Al?
>>>>>>>>>>
>>>>>>>>>> Alas, I've found none ;-/  Looks like we'll have to go with this one,
>>>>>>>>>> at least until somebody comes up with better solution.
>>>>>>>>>
>>>>>>>>> Not surprisingly, this patch doesn't actually fix the issue (or at
>>>>>>>>> least doesn't do it correctly).
>>>>>>>>>
>>>>>>>>> I hadn't noticed that get_fs_pwd() actually calls path_get() on
>>>>>>>>> &context->pwd so the additional path_get() is useless and the
>>>>>>>>> reference doesn't ever actually get freed if audit_putname is called
>>>>>>>>> while we're in a syscall.
>>>>>>>>>
>>>>>>>>> Al, Eric, Jeff; do any of you guys have an understanding of what the
>>>>>>>>> initial bug actually is since this clearly doesn't fix it?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> peter
>>>>>>>>>
>>>>>>>>
>>>>>>>> BTW, I ran this test on one of my KVM guests and it ran just fine. That
>>>>>>>> one is an x86_64 guest running a 3.6.0+ kernel. The root fs on there is
>>>>>>>> ext4 though, not ext3. So perhaps that's a factor?
>>>>>>>>
>>>>>>>> The oops message you posted at least looks like something down in the
>>>>>>>> bowels of ext3 or fs/buffer.c.
>>>>>>>
>>>>>>> Yeah, the only place this actually happens for me on these giant xen
>>>>>>> instances we have (6 cores, 32G ram) and it happens on both ext3 and
>>>>>>> ext4 filesystems and it happens with 100% reliability.
>>>>>>>
>>>>>>> The actual oops is from:
>>>>>>>
>>>>>>> static inline void check_irqs_on(void)
>>>>>>> {
>>>>>>> #ifdef irqs_disabled
>>>>>>>         BUG_ON(irqs_disabled());
>>>>>>> #endif
>>>>>>> }
>>>>>>>
>>>>>>> with the code path looking like:
>>>>>>>
>>>>>>> __find_get_block() -> lookup_bh_lru() -> check_irqs_on() -> BUG()
>>>>>>>
>>>>>>
>>>>>> Do you have a backtrace from a more recent kernel? I wonder if
>>>>>> something in the syscall exit codepath is disabling IRQs here?
>>>>>
>>>>> is 3.6.0-rc1 recent enough or do you want something newer?
>>>>
>>>> nevermind, that doesn't boot. One sec.
>>>
>>>
>>> here's 3.5.0
>>>
>>>  ------------[ cut here ]------------
>>> kernel BUG at fs/buffer.c:1220!
>>> invalid opcode: 0000 [#1] SMP
>>> CPU 0
>>> Pid: 3683, comm: a.out Not tainted 3.5.0 #3
>>> RIP: e030:[<ffffffff816a99f4>]  [<ffffffff816a99f4>]
>>> check_irqs_on.part.8+0x4/0x6
>>> RSP: e02b:ffff8807b156dc28  EFLAGS: 00010046
>>> RAX: ffff8807d0dd0000 RBX: ffff8807a7d6df28 RCX: 0000000005883396
>>> RDX: 0000000000001000 RSI: 0000000005883396 RDI: ffff8807cfc0c000
>>> RBP: ffff8807b156dc28 R08: 0000000000000001 R09: ffff8807a7d6de50
>>> R10: f83a2b0a359bf007 R11: 0000000000000000 R12: ffff8807a7d6de54
>>> R13: ffff8807a7d6de80 R14: ffff8807cfc1f120 R15: 0000000005883396
>>> FS:  00007f97164ec700(0000) GS:ffff8807ffc00000(0063) knlGS:0000000000000000
>>> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
>>> CR2: 00000000f76ca3b0 CR3: 00000007bbb53000 CR4: 0000000000002660
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Process a.out (pid: 3683, threadinfo ffff8807b156c000, task ffff8807bbae8000)
>>
>> One thing that crossed my mind is that back in the 4k stacks days I
>> think sometimes a blown stack could corrupt threadinfo and we'd get
>> spurious warnings like this.
>>
>> Is there any chance something like that has happened? (any stack
>> depth messages, etc?)
> 
> I do'nt see any stack depth messages in there, no.
> 
>> Maybe a crashdump on the BUG() for further poking around would
>> be in order.
> 
> when I add a painc() there, this is what I get. I'm not seeing
> anything about stack depth, Is there something obvious I'm missing?

Not missing anything really; it's a long shot I suppose anyway.

It's possible that there was some deep stack excursion that corrupted
threadinfo, and then we went down the path below and the first bad news
we received was when we tested whether we were in an interrupt.

Hm, your other email said that you only started getting the in_interrupt()
warnings AFTER the BUG_ON()?  That makes me think I might be right after
all:

> I booted a few kernels with the WARN_ON successively further
> down the stack in audit land and it only started firing after BUG_ON()
> in fs/buffer.c

On the other hand, there is a "canary" at the top of the stack which
would usually get overwritten, and a BUG_ON() or panic() should print
that fact, something like "Thread overran stack, or stack corrupted."

What if you build with config STACK_TRACER on, and:

# mount -t debugfs none /sys/kernel/debug
# echo 1 > /proc/sys/kernel/stack_tracer_enabled
# <run your test w/o the panic, get the BUG_ON>
# cat /sys/kernel/debug/tracing/stack_trace

If CONFIG_DEBUG_STACK_USAGE isn't on, that might help too.

I could be way off here but might be worth a test.

-Eric


> Kernel panic - not syncing: irqs disabled
> Pid: 23011, comm: a.out Tainted: G        W    3.2.5-at18-ganetixenu #2
> Call Trace:
>  [<ffffffff81692b16>] panic+0x8c/0x1a3
>  [<ffffffff8100926d>] ? xen_force_evtchn_callback+0xd/0x10
>  [<ffffffff816960d6>] check_irqs_on.part.13+0x25/0x25
>  [<ffffffff81161dc9>] __find_get_block+0x1f9/0x200
>  [<ffffffff813c7a2f>] ? number.isra.2+0x31f/0x350
>  [<ffffffff811c9d85>] ext3_clear_blocks+0x75/0x140
>  [<ffffffff811c9f5c>] ext3_free_data+0x10c/0x150
>  [<ffffffff811da961>] ? ext3_journal_start_sb+0x31/0x60
>  [<ffffffff811ca635>] ext3_truncate+0x4a5/0x600
>  [<ffffffff81232f38>] ? journal_start+0xb8/0x100
>  [<ffffffff811ccf48>] ext3_evict_inode+0x228/0x2a0
>  [<ffffffff8114bfb1>] evict+0xa1/0x1a0
>  [<ffffffff8114cc61>] iput+0x101/0x210
>  [<ffffffff81148040>] d_kill+0xf0/0x130
>  [<ffffffff81148bd2>] dput+0xd2/0x1b0
>  [<ffffffff8113eb85>] path_put+0x15/0x30
>  [<ffffffff8169348f>] audit_free_names+0xbc/0xdb
>  [<ffffffff810ac629>] audit_syscall_exit+0x139/0x1e0
>  [<ffffffff8169fdaa>] sysexit_audit+0x21/0x5f
> 
>> -Eric
>>
>>> Stack:
>>>  ffff8807b156dc98 ffffffff8116a099 ffff8807b59f3000 ffff8807b156dd30
>>>  ffff8807b156dd60 ffff8807b156de78 ffff8807b156dc78 ffffffff816af231
>>>  ffff8807b156dcd8 ffff8807a7d6e538 ffff8807a7d6df28 ffff8807a7d6de54
>>> Call Trace:
>>>  [<ffffffff8116a099>] __find_get_block+0x1f9/0x200
>>>  [<ffffffff816af231>] ? down_read+0x11/0x30
>>>  [<ffffffff811d1405>] ext3_clear_blocks+0x75/0x140
>>>  [<ffffffff811d15dc>] ext3_free_data+0x10c/0x150
>>>  [<ffffffff811e2061>] ? ext3_journal_start_sb+0x31/0x60
>>>  [<ffffffff811d1cb5>] ext3_truncate+0x4a5/0x600
>>>  [<ffffffff8123d5b8>] ? journal_start+0xb8/0x100
>>>  [<ffffffff8106f406>] ? bit_waitqueue+0x16/0xc0
>>>  [<ffffffff811d4598>] ext3_evict_inode+0x248/0x2c0
>>>  [<ffffffff81153b9a>] evict+0xaa/0x1b0
>>>  [<ffffffff81154843>] iput+0x103/0x210
>>>  [<ffffffff8114fc88>] dentry_iput+0x88/0xd0
>>>  [<ffffffff811505ec>] dput+0x12c/0x250
>>>  [<ffffffff81146275>] path_put+0x15/0x30
>>>  [<ffffffff810b2f35>] __audit_syscall_exit+0x2e5/0x460
>>>  [<ffffffff816b30be>] sysexit_audit+0x29/0x5b
>>> Code: 04 00 00 4c 8d 88 c0 02 00 00 31 c0 e8 5f da ff ff 48 85 db 74
>>> 0c 80 43 5c 01 48 89 df e8 d5
>>> 6a aa ff 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89
>>> e5 0f 0b 55 48 89 e5 41 54 53
>>> RIP  [<ffffffff816a99f4>] check_irqs_on.part.8+0x4/0x6
>>>  RSP <ffff8807b156dc28>
>>> ---[ end trace 8d09f8cfbb601c14 ]---
>>>
>>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-10-08 18:48 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-16  1:13 [PATCH] audit: grab a reference to context->pwd when it's cached Peter Moody
2012-08-16  1:26 ` Alexander Viro
2012-08-16  1:41   ` Peter Moody
2012-08-20 17:19     ` Peter Moody
2012-09-25 17:03       ` Peter Moody
2012-09-26 13:50         ` Alexander Viro
2012-09-26 18:20           ` Peter Moody
2012-10-04 18:48           ` Peter Moody
2012-10-05 11:02             ` Jeff Layton
2012-10-05 12:55             ` Jeff Layton
2012-10-05 13:57               ` Peter Moody
2012-10-05 14:10                 ` Jeff Layton
2012-10-05 14:26                 ` Jeff Layton
2012-10-05 15:16                   ` Peter Moody
2012-10-05 15:18                     ` Peter Moody
2012-10-05 15:57                       ` Peter Moody
2012-10-05 16:15                         ` Jeff Layton
2012-10-05 17:04                           ` Peter Moody
2012-10-05 19:41                             ` Peter Moody
2012-10-08 12:23                               ` Jeff Layton
2012-10-08 15:22                                 ` Peter Moody
2012-10-08 16:45                                   ` Peter Moody
2012-10-08 14:10                             ` Jeff Layton
2012-10-05 20:23                         ` Eric Sandeen
2012-10-07  3:22                           ` Peter Moody
2012-10-08 18:48                             ` Eric Sandeen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.