From: Ard Biesheuvel <ardb@kernel.org>
To: Yegor Yefremov <yegorslists@googlemail.com>
Cc: Arnd Bergmann <arnd@arndb.de>, Tony Lindgren <tony@atomide.com>,
	Linux-OMAP <linux-omap@vger.kernel.org>,
	linux-clk <linux-clk@vger.kernel.org>,
	Stephen Boyd <sboyd@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: am335x: 5.18.x: system stalling
Date: Thu, 2 Jun 2022 12:37:32 +0200
Message-ID: <CAMj1kXEfKLYYxt9imEO155oxWTzXtWPpF8txGZ-xCs_6vez-WA@mail.gmail.com>
In-Reply-To: <CAGm1_kvZ_6tPgfrTc3pH+6TedoU+mvuEXb+7aEp5mXfx516fmA@mail.gmail.com>

On Thu, 2 Jun 2022 at 12:17, Yegor Yefremov <yegorslists@googlemail.com> wrote:
>
> On Wed, Jun 1, 2022 at 12:50 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Wed, 1 Jun 2022 at 12:46, Yegor Yefremov <yegorslists@googlemail.com> wrote:
> > >
> > > On Wed, Jun 1, 2022 at 12:06 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Wed, 1 Jun 2022 at 12:04, Yegor Yefremov <yegorslists@googlemail.com> wrote:
> > > > >
> > > > > On Wed, Jun 1, 2022 at 11:28 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > On Wed, 1 Jun 2022 at 10:08, Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > > >
> > > > > > > On Wed, 1 Jun 2022 at 09:59, Arnd Bergmann <arnd@arndb.de> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jun 1, 2022 at 9:36 AM Yegor Yefremov
> > > > > > > > <yegorslists@googlemail.com> wrote:
> > > > > > > > > On Tue, May 31, 2022 at 5:23 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > > > > > > > I've pushed a modified branch now, with that fix on the broken commit,
> > > > > > > > > > and another change to make CONFIG_IRQSTACKS user-selectable rather
> > > > > > > > > > than always enabled. That should tell us if the problem is in the SMP
> > > > > > > > > > patching or in the irqstacks.
> > > > > > > > > >
> > > > > > > > > > Can you test the top of this branch with CONFIG_IRQSTACKS disabled,
> > > > > > > > > > and (if that still stalls) retest the fixed commit f0191ea5c2e5 ("[PART 1]
> > > > > > > > > > ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems")?
> > > > > > > > >
> > > > > > > > > 1. the top of this branch with CONFIG_IRQSTACKS disabled stalls
> > > > > > > > > 2. f0191ea5c2e5 with the same config - not
> > > > > > > >
> > > > > > > > Ok, perfect, that does narrow down the problem quite a bit: The final
> > > > > > > > patch has seven changes, all of which can be done individually because
> > > > > > > > in each case the simplified version in f0191ea5c2e5 is meant to run
> > > > > > > > the exact same instructions as the version after the change, when running
> > > > > > > > on a uniprocessor machine such as your am335x.
> > > > > > > >
> > > > > > > > You have already shown earlier that the get_current() and
> > > > > > > > __my_cpu_offset() functions are not to blame here, as reverting
> > > > > > > > only those does not change the behavior.
> > > > > > > >
> > > > > > > > This leaves the is_smp() check in set_current(), and the
> > > > > > > > four macros in <asm/assembler.h>. I don't see anything obviously
> > > > > > > > wrong with any of those five, but I would bet on the macros
> > > > > > > > here. Can you try bisecting into this commit, maybe reverting
> > > > > > > > the changes to set_current and get_current first, and then
> > > > > > > > narrowing it down to (hopefully) a single macro that causes the
> > > > > > > > problem?
> > > > > > > >
> > > > > > >
> > > > > > > set_current() is never called by the primary CPU, which is why the
> > > > > > > is_smp() check was removed from there in 57a420435edcb0b94 ("ARM: drop
> > > > > > > pointless SMP check on secondary startup path").
> > > > > > >
> > > > > > > So that leaves only the four macros in asm/assembler.h, but I don't
> > > > > > > see anything obviously wrong with those either.
> > > > > >
> > > > > > I pushed a patch on top of Arnd's branch at the link below that gets
> > > > > > rid of the subsections, and uses normal branches (and code patching)
> > > > > > to switch between the thread ID register and the LDR to retrieve the
> > > > > > CPU offset and the current pointer. I have no explanation whether or
> > > > > > why it could make a difference, but I think it's worth a try.
> > > > >
> > > > > The link to your repo is missing.
> > > > >
> > > >
> > > > Oops, sorry :-)
> > > >
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=am335x-stall-test
> > >
> > > I have tested your branch and it stalls:
> > >
> >
> > OK, thanks for verifying.
>
> My bisection results for f0191ea5c2e5aab29484ede0493ca385eec5472f as a base:
>
> percpu.h: sporadic stalls
> current.h: always stalls
> assembler.h: no stalls
> smp.c: no stalls
>

So you mean that applying the changes to each of those files in
isolation to the baseline in f0191ea5c2e5aab29484ede0493ca385eec5472f
produces those results, right?

That confirms my statement that smp.c cannot be the culprit, and
appears to exonerate the pure asm pieces. I wonder if this is related
to insufficient asm constraints on the C helpers, or just the
compiler's cost model making different decisions because the inline
asm string is much longer. In any case, this opens up a couple of
avenues we could explore to narrow this down further.

As a quick check, can you try the snippet below, applied on top of the
broken current.h build?

--- a/arch/arm/include/asm/current.h
+++ b/arch/arm/include/asm/current.h
@@ -53,7 +53,8 @@ static __always_inline __attribute_const__ struct task_struct *get_current(void)
            "   b       . + (2b - 0b)                           \n\t"
            "   .popsection                                     \n\t"
 #endif
-           : "=r"(cur));
+           : "=r"(cur)
+           : "Q" (*(const unsigned long *)current_stack_pointer));
 #elif __LINUX_ARM_ARCH__>= 7 || \
       !defined(CONFIG_ARM_HAS_GROUP_RELOCS) || \
       (defined(MODULE) && defined(CONFIG_ARM_MODULE_PLTS))
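
For readers less familiar with extended asm operands, the idea behind
the extra input in the snippet above can be sketched in portable C. This
is only an illustration under stated assumptions, not the kernel code:
it uses the generic "m" constraint instead of the ARM-specific "Q", and
demo_slot, pass_through() and demo() are hypothetical names that do not
exist in the kernel. It builds with GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for the word that the real patch names through
 * current_stack_pointer; this is not a kernel symbol. */
static unsigned long demo_slot;

/*
 * The template is empty, so no instruction is emitted and the value in
 * 'v' passes through unchanged. The "m" input still tells the compiler
 * that the statement reads *dep, so it must not merge two copies of the
 * statement across a store to that location, or move one past it.
 */
static inline unsigned long pass_through(unsigned long v,
                                         const unsigned long *dep)
{
    __asm__("" : "+r"(v) : "m"(*dep));
    return v;
}

/* Runs two reads separated by a store; returns 0 if both look right. */
static int demo(void)
{
    demo_slot = 1;
    unsigned long a = pass_through(42, &demo_slot);

    demo_slot = 2;  /* the second asm depends on this store */
    unsigned long b = pass_through(42, &demo_slot);

    assert(a == 42);
    assert(b == 42);
    return 0;
}
```

The sketch only shows the operand syntax and the dependency it
expresses; whether a missing dependency of this kind is what causes the
stall is exactly what the test above is meant to find out.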

Given that the problematic sequence appears to be in C code, could you
please confirm whether or not the stall is reproducible when all the
pieces that are used by the CAN stack (musb, slcan, ftdi_sio, etc.)
are built into the kernel rather than built as modules? Also, which
GCC version are you using?
