From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>,
	Corentin Labbe <clabbe.montjoie@gmail.com>,
	mark.rutland@arm.com, jiangshanlai@gmail.com,
	linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org,
	tj@kernel.org, Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: WARNING: at kernel/workqueue.c:1473 __queue_work+0x3b8/0x3d0
Date: Tue, 3 Mar 2020 16:30:17 -0500
Message-ID: <20200303213017.tanczhqd3nhpeeak@ca-dmjordan1.us.oracle.com>
In-Reply-To: <e7c92da2-42c0-a97d-7427-6fdc769b41b9@arm.com>

On Mon, Mar 02, 2020 at 06:00:10PM +0000, Robin Murphy wrote:
> On 02/03/2020 5:25 pm, Daniel Jordan wrote:
> Something smelled familiar about this discussion, and sure enough that merge
> contains c4741b230597 ("crypto: run initcalls for generic implementations
> earlier"), which has raised its head before[1].

Yep, that looks suspicious.

The bisect didn't point to that specific commit, even though my version of git
tries commits in the merge.  I'm probably missing something.

> > Does this fix it?  I can't verify but figure it's worth trying the simplest
> > explanation first, which is that the work isn't initialized by the time it's
> > queued.
> 
> The relative initcall levels would appear to explain the symptom - I guess
> the question is whether this represents a bug in a particular test/algorithm
> (as with the unaligned accesses) or a fundamental problem in the
> infrastructure now being able to poke the module loader too early.

I'm not familiar with the crypto code.  Could it be that the commit moved some
request_module() calls before modules_wq_init()?

And, is it "too early" or just "earlier"?  When is it too early for modprobe?
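
A minimal userspace sketch of the ordering in question, assuming the work item
is set up by one initcall and queued from another.  Nothing here is kernel
code: every fake_* name, simulate_boot(), and the level numbers are
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define FAKE_MAX_LEVEL 7	/* hypothetical highest initcall level */

/* Stand-in for a work item; "ready" models whether its setup has run. */
struct fake_work {
	const char *name;
	bool ready;
};

/* One boot-time init function tagged with a level (lower runs first). */
struct fake_initcall {
	int level;
	void (*fn)(void);
};

static struct fake_work fake_init_free_work = { "fake_init_free_work", false };
static int fake_warnings;

/* Models queueing: count a warning if the item was never set up. */
static void fake_schedule_work(struct fake_work *w)
{
	if (!w->ready) {
		fake_warnings++;
		printf("WARNING: %s queued before it was initialized\n", w->name);
		return;
	}
	printf("%s queued normally\n", w->name);
}

/* Models the initcall that initializes the work item. */
static void fake_wq_init(void)
{
	fake_init_free_work.ready = true;
}

/* Models an initcall that loads a module, which in turn queues the work. */
static void fake_crypto_init(void)
{
	fake_schedule_work(&fake_init_free_work);
}

/* Run both initcalls in ascending level order; return the warning count. */
static int simulate_boot(int crypto_level, int wq_level)
{
	const struct fake_initcall calls[] = {
		{ crypto_level, fake_crypto_init },
		{ wq_level, fake_wq_init },
	};
	int level;
	size_t i;

	assert(crypto_level >= 0 && crypto_level <= FAKE_MAX_LEVEL);
	assert(wq_level >= 0 && wq_level <= FAKE_MAX_LEVEL);

	/* Reset state so each simulated boot starts from scratch. */
	fake_init_free_work.ready = false;
	fake_warnings = 0;

	for (level = 0; level <= FAKE_MAX_LEVEL; level++)
		for (i = 0; i < sizeof(calls) / sizeof(calls[0]); i++)
			if (calls[i].level == level)
				calls[i].fn();

	return fake_warnings;
}
```

With these made-up levels, simulate_boot(4, 6) queues the work before its setup
has run and returns one warning, while simulate_boot(6, 4) returns none.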

Barring other ideas, Corentin, would you be willing to boot with

    trace_event=initcall:*,module:* trace_options=stacktrace

and

diff --git a/kernel/module.c b/kernel/module.c
index 33569a01d6e1..393be6979a27 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3604,8 +3604,11 @@ static noinline int do_init_module(struct module *mod)
 	 * be cleaned up needs to sync with the queued work - ie
 	 * rcu_barrier()
 	 */
-	if (llist_add(&freeinit->node, &init_free_list))
+	if (llist_add(&freeinit->node, &init_free_list)) {
+		pr_warn("%s: schedule_work for mod=%s\n", __func__, mod->name);
+		dump_stack();
 		schedule_work(&init_free_wq);
+	}
 
 	mutex_unlock(&module_mutex);
 	wake_up_all(&module_wq);

but without my earlier fix, and share the dmesg and ftrace output so we can
see whether the theory holds?

Also, could you attach your config?  I'm curious what your crypto options look
like, since I fiddled with some of them today while trying, and failing, to
reproduce this on x86.

thanks,
Daniel
