All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christian Brauner <brauner@kernel.org>
Cc: Zorro Lang <zlang@redhat.com>, fstests <fstests@vger.kernel.org>,
	Eryu Guan <guaneryu@gmail.com>,
	Amir Goldstein <amir73il@gmail.com>,
	Christoph Hellwig <hch@lst.de>,
	"Darrick J. Wong" <djwong@kernel.org>,
	io-uring@vger.kernel.org
Subject: Re: [PATCH v2 00/13] rename & split tests
Date: Sun, 22 May 2022 09:13:50 +1000	[thread overview]
Message-ID: <20220521231350.GY2306852@dread.disaster.area> (raw)
In-Reply-To: <20220512165250.450989-1-brauner@kernel.org>

[cc io_uring]

On Thu, May 12, 2022 at 06:52:37PM +0200, Christian Brauner wrote:
> From: "Christian Brauner (Microsoft)" <brauner@kernel.org>
> 
> Hey everyone,
> 
> Please note that this patch series contains patches that will be
> rejected by the fstests mailing list because of the amount of changes
> they contain. So tools like b4 will not be able to find the whole patch
> series on a mailing list. In case it's helpful I've added the
> "fstests.vfstest.for-next" tag which can be pulled. Otherwise it's
> possible to simply use the patch series as it appears in your inbox.
> 
> All vfstests pass:

[...]

> #### xfs ####
> ubuntu@imp1-vm:~/src/git/xfstests$ sudo ./check -g idmapped
> FSTYP         -- xfs (debug)
> PLATFORM      -- Linux/x86_64 imp1-vm 5.18.0-rc4-fs-mnt-hold-writers-8a2e2350494f #107 SMP PREEMPT_DYNAMIC Mon May 9 12:12:34 UTC 2022
> MKFS_OPTIONS  -- -f /dev/sda4
> MOUNT_OPTIONS -- /dev/sda4 /mnt/scratch
> 
> generic/633 58s ...  58s
> generic/644 62s ...  60s
> generic/645 161s ...  161s
> generic/656 62s ...  63s
> xfs/152 133s ...  133s
> xfs/153 94s ...  92s
> Ran: generic/633 generic/644 generic/645 generic/656 xfs/152 xfs/153
> Passed all 6 tests

I'm not sure if it's this series that has introduced a test bug or
triggered a latent issue in the kernel, but I've started seeing
generic/633 throw audit subsystem warnings on a single test machine
as of late Friday:

[ 7285.015888] WARNING: CPU: 3 PID: 2147118 at kernel/auditsc.c:2035 __audit_syscall_entry+0x113/0x140
[ 7285.019973] Modules linked in:
[ 7285.021281] CPU: 3 PID: 2147118 Comm: vfstest Not tainted 5.18.0-rc7-dgc+ #1250
[ 7285.024341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 7285.027782] RIP: 0010:__audit_syscall_entry+0x113/0x140
[ 7285.029923] Code: 24 e8 c1 6b ff ff 48 8b 34 24 85 c0 48 8b 54 24 08 48 8b 4c 24 10 4c 8b 44 24 18 0f 84 72 ff ff ff 48 83 c4 20 5b 5d 41 5c c3 <0f> 0b 85 c0 75 14 48 83 c4 20 48 c7 c7 70 45 7f 82 5b 5d 41 5c e9
[ 7285.037563] RSP: 0018:ffffc900012f7ed0 EFLAGS: 00010202
[ 7285.039748] RAX: 0000000000000000 RBX: ffff8880aaf18800 RCX: 000000000000003c
[ 7285.043126] RDX: 00000000000000e7 RSI: 0000000000000000 RDI: ffff888102104f00
[ 7285.046120] RBP: 00000000000000e7 R08: fffffffffffffe2c R09: 0000000000000002
[ 7285.049108] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 7285.052058] R13: ffffc900012f7f58 R14: 00000000000000e7 R15: 0000000000000000
[ 7285.055030] FS:  00007f7906d6c740(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
[ 7285.058396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7285.060788] CR2: 00007f3ffa7e9bb8 CR3: 000000010bb00000 CR4: 00000000000006e0
[ 7285.063735] Call Trace:
[ 7285.064796]  <TASK>
[ 7285.065759]  syscall_trace_enter.constprop.0+0x122/0x1a0
[ 7285.067978]  do_syscall_64+0x16/0x80
[ 7285.069497]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7285.071590] RIP: 0033:0x7f7906e35f49
[ 7285.073118] Code: 00 4c 8b 05 29 6f 10 00 be e7 00 00 00 ba 3c 00 00 00 eb 12 0f 1f 44 00 00 89 d0 0f 05 48 3d 00 f0 ff ff 77 1c f4 89 f0 0f 05 <48> 3d 00 f0 ff ff 76 e7 f7 d8 64 41 89 00 eb df 0f 1f 80 00 00 00
[ 7285.078021] RSP: 002b:00007ffeee52db88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 7285.079995] RAX: ffffffffffffffda RBX: 00007f7906f39920 RCX: 00007f7906e35f49
[ 7285.081869] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[ 7285.083729] RBP: 0000000000000000 R08: ffffffffffffff88 R09: 0000000000000001
[ 7285.085594] R10: fffffffffffffe2c R11: 0000000000000246 R12: 00007f7906f39920
[ 7285.087457] R13: 0000000000000001 R14: 00007f7906f3ee28 R15: 0000000000000000
[ 7285.089320]  </TASK>
[ 7285.089949] ---[ end trace 0000000000000000 ]---
[ 7285.091182] audit_panic: 22 callbacks suppressed
[ 7285.091185] audit: unrecoverable error in audit_syscall_entry()

Adn faddr_to_line tells me it is this:

void __audit_syscall_entry(int major, unsigned long a1, unsigned long a2,
                           unsigned long a3, unsigned long a4)
{
        struct audit_context *context = audit_context();
        enum audit_state     state;

        if (!audit_enabled || !context)
                return;

>>>>>>  WARN_ON(context->context != AUDIT_CTX_UNUSED);  <<<<<<
        WARN_ON(context->name_count);
        if (context->context != AUDIT_CTX_UNUSED || context->name_count) {
                audit_panic("unrecoverable error in audit_syscall_entry()");
                return;
        }
.....

I have no clue how the audit subsystem works, I don't even know how
to enable it, and I've never seen audit messages on the console of
this test machine. I have other test machines that have audit
enabled, and they do not dump warnings like this - the only thing I
see in the logs for those machines is this:

 run xfstest generic/633
 process 'vfstest' launched '/dev/fd/4/file1' with NULL argv: empty string added
 XFS (pmem0): Unmounting Filesystem
 XFS (pmem0): Mounting V5 Filesystem
 XFS (pmem0): Ending clean mount
 run xfstest generic/634

That's waht I was seeing from this test machine earlier in the week,
too - I've been running 5.18-rc7 as the base kernel all week - so
I'm not sure .....

Oooooohhhh.

/* The per-task audit context. */
struct audit_context {
        int                 dummy;      /* must be the first element */
        enum {
                AUDIT_CTX_UNUSED,       /* audit_context is currently unused */
                AUDIT_CTX_SYSCALL,      /* in use by syscall */
                AUDIT_CTX_URING,        /* in use by io_uring */
        } context;
....

And that reminded me of something. I commented on #xfs on Friday
afternoon:

[20/5/22 15:04] <dchinner> so of the 3.5 hours run time on the machine that jsut completed, it looks like about a dozen tests are responsible for an hour of that runtime...
[20/5/22 15:05] <dchinner> but it was a clean run with no failures in 1055 tests run.
[20/5/22 15:06] <dchinner> But there's some WTFs like this in it:
[20/5/22 15:06] <dchinner> generic/678     [not run] kernel does not support IO_URING
[20/5/22 15:08] <dchinner> yet the same kernel on a different machine:
[20/5/22 15:08] <dchinner> generic/678 11s ...  3s
[20/5/22 15:08] <dchinner> and they have the same userspace, too....

Yeah, the machine that complained about "kernel does not support
IO_URING" is the one that is throwing these warnings now. It wasn't
that the kernel didn't support io-uring, it was that the machine was
missing the liburing-dev library. I installed it and rebuilt
fstests. These audit failures co-incide with the first test runs
with io-uring enabled. And vfstest uses io_uring if fstests enables
it.

Hence this now smells like a pre-existing issue -  either a test bug
or an io_uring task audit context leak. Anyone got any ideas?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2022-05-21 23:13 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-12 16:52 [PATCH v2 00/13] rename & split tests Christian Brauner
2022-05-12 16:52 ` [PATCH v2 01/13] idmapped-mounts: make all tests set their required feature flags Christian Brauner
2022-05-16  5:50   ` Christoph Hellwig
2022-05-12 16:52 ` [PATCH v2 02/13] src: rename idmapped-mounts folder Christian Brauner
2022-05-12 16:52 ` [PATCH v2 03/13] src/vfs: rename idmapped-mounts.c file Christian Brauner
2022-05-12 16:52 ` [PATCH v2 04/13] vfstest: rename struct t_idmapped_mounts Christian Brauner
2022-05-12 16:52 ` [PATCH v2 05/13] utils: add missing global.h include Christian Brauner
2022-05-12 16:52 ` [PATCH v2 07/13] utils: move helpers into utils Christian Brauner
2022-05-12 16:52 ` [PATCH v2 08/13] missing: move sys_execveat() to missing.h Christian Brauner
2022-05-12 16:52 ` [PATCH v2 09/13] utils: add struct test_suite Christian Brauner
2022-05-12 16:52 ` [PATCH v2 12/13] vfstest: split out remaining idmapped mount tests Christian Brauner
2022-05-12 16:52 ` [PATCH v2 13/13] vfs/idmapped-mounts: remove invalid test Christian Brauner
2022-05-16  5:53   ` Christoph Hellwig
2022-05-12 20:19 ` [PATCH v2 00/13] rename & split tests Zorro Lang
2022-05-14 17:59 ` Zorro Lang
2022-05-16  8:02   ` Christian Brauner
2022-05-21 23:13 ` Dave Chinner [this message]
2022-05-22  1:07   ` Jens Axboe
2022-05-22  2:19     ` Jens Axboe
2022-05-23  0:13       ` Dave Chinner
2022-05-23  0:57         ` Jens Axboe
2022-05-23 10:41           ` Dave Chinner
2022-05-23 12:28             ` Jens Axboe
2022-05-23  9:44   ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220521231350.GY2306852@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=djwong@kernel.org \
    --cc=fstests@vger.kernel.org \
    --cc=guaneryu@gmail.com \
    --cc=hch@lst.de \
    --cc=io-uring@vger.kernel.org \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.