linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Giancarlo Ferrari <giancarlo.ferrari89@gmail.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, linux@armlinux.org.uk,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	rppt@kernel.org, penberg@kernel.org, geert@linux-m68k.org,
	giancarlo.ferrari@nokia.com
Subject: Re: [PATCH] ARM: kexec: Fix panic after TLB are invalidated
Date: Mon, 1 Feb 2021 19:09:41 +0000	[thread overview]
Message-ID: <20210201190938.GB15399@p4> (raw)
In-Reply-To: <20210201153012.GC66060@C02TD0UTHF1T.local>

Hi,

On Mon, Feb 01, 2021 at 03:30:12PM +0000, Mark Rutland wrote:
> Hi,
> 
> On Mon, Feb 01, 2021 at 02:39:46PM +0000, Giancarlo Ferrari wrote:
> > On Mon, Feb 01, 2021 at 12:47:20PM +0000, Mark Rutland wrote:
> > > On Mon, Feb 01, 2021 at 12:44:56AM +0000, Giancarlo Ferrari wrote:
> > > > machine_kexec() need to set rw permission in text and rodata sections
> > > > to assign some variables (e.g. kexec_start_address). To do that at
> > > > the end (after flushing pdm in memory, etc.) it needs to invalidate
> > > > TLB [section] entries.
> > > 
> > > It'd be worth noting explicitly that set_kernel_text_rw() alters
> > > current->active_mm...
> > > 
> > > > If during the TLB invalidation an interrupt occours, which might cause
> > > > a context switch, there is the risk to inject invalid TLBs, with ro
> > > > permissions.
> > > 
> > > ... which is why if there's a context switch things can go wrong, since
> > > active_mm isn't stable, and so it's possible that set_kernel_text_rw()
> > > updates multiple tables, none of which might be the active table at the
> > > point we try to make an access.
> > 
> > Maybe the behaviour causing issue is not completely clear to me, and I do
> > apologize for that (moreover I haven't eougth debug capabilities).
> 
> I think we're in rough agreement that the issue is likely related to the
> context switch, but our understanding of the specifics differs (and I
> think we're missing a detail here).
> 
> > However, current-active_mm is switched among context switches. Correct ?
> 
> In some cases current->active_mm is not switched, and can be inherited
> over a context switch. When switching to a user task, we always switch
> to its mm (which becomes the active_mm), but when switching to a kthread
> we retain the previous task's mm as the active_mm as part of the lazy
> context switch.
> 
> So while a kthread is preemptible, its active_mm (and active ASID) can
> change under its feet. That could happen anywhere while the task is
> preemptible, e.g. in the middle of set_kernel_text_rw(), or
> mid-modification to the kexec variables.
> 

Yes.

In my understanding, even in the case of user process, when current->active_mm is switched,
we can run into trouble. For instance:

- Process A issue kexec (PageTables entry of A has 0x8000_0000-0x8010_0000 with ro
  permission and section is global, no NG bit set).

- A context switch happens in the middle of set_kernel_text_rw(), right after the
  section 0x8000_0000-0x8010_0000 has been invalidated.

- Process B, in its execution, re-inject its own PageTable entry with ro permission, which
  is not shared with Process A (and is not touched by the previous invalidation) in the MMU
  cache.

- When Process A, is rescheduled, it carries on with the invalidation, but unfortunately I have
  "wrong" entries in the MMU.

> > So, in principle, the invalidation, if stopped, is carried on where it
> > left.
> 
> That's true so long as all the processes we switch between share the
> same leaf tables for the region we're altering. If not, then the lazy
> context switch means that those tables can change under our feet.
> 
> I believe the tables mapping the kernel text are shared by all threads,
> and if so this _should_ be true. Russell might be able to confirm that
> or correct me if I have that wrong.
> 

I am not ready to put my hand on the fire, but I seen the behaviour described above.

> > I thought the issue was that the PageTable entry for the section 0x8010_0000
> > is global, thus not indexed by ASID (Address Space ID). By the fact that each
> > process has its own version of that entry, is the cause of the issue, as the
> > schedule process might bringing a spurious entry (with ro permission) in the
> > MMU cache.
> 
> The TLB invalidation performed under set_kernel_text_rw() affects all
> ASIDs on the current CPU, so there shouldn't be any stale RO TLB entries
> to hit unless the kthread is migrated to another CPU.
> 
> > If the entry is not global holds the ASID, and the issue cannot happen.
> 
> I don't think that's true, since switching to a different active_mm
> would also change ASID, and we'd have no additional guarantee.
> 

I agree, my fault.

> Thanks,
> Mark.

Thanks,


GF

  reply	other threads:[~2021-02-01 19:10 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-01  0:44 [PATCH] ARM: kexec: Fix panic after TLB are invalidated Giancarlo Ferrari
2021-02-01 11:34 ` Russell King - ARM Linux admin
2021-02-01 12:47 ` Mark Rutland
2021-02-01 13:03   ` Russell King - ARM Linux admin
2021-02-01 13:57     ` Mark Rutland
2021-02-01 16:08       ` Russell King - ARM Linux admin
2021-02-01 16:32         ` Mark Rutland
2021-02-01 16:37           ` Russell King - ARM Linux admin
2021-02-01 20:07         ` Giancarlo Ferrari
2021-02-01 20:16           ` Russell King - ARM Linux admin
2021-02-01 22:18             ` Giancarlo Ferrari
2021-02-04 23:48               ` Giancarlo Ferrari
2021-02-05  0:18                 ` Russell King - ARM Linux admin
2021-02-05  0:40                   ` Giancarlo Ferrari
2021-02-05  0:45                     ` Giancarlo Ferrari
2021-02-05  9:44                     ` Russell King - ARM Linux admin
2021-02-05 14:36                       ` Giancarlo Ferrari
2021-02-01 14:39   ` Giancarlo Ferrari
2021-02-01 15:30     ` Mark Rutland
2021-02-01 19:09       ` Giancarlo Ferrari [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-01-12 16:49 Giancarlo Ferrari
2021-02-01 10:10 ` Giancarlo Ferrari

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210201190938.GB15399@p4 \
    --to=giancarlo.ferrari89@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=geert@linux-m68k.org \
    --cc=giancarlo.ferrari@nokia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=mark.rutland@arm.com \
    --cc=penberg@kernel.org \
    --cc=rppt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).