From: Pavel Tatashin <pasha.tatashin@soleen.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: James Morse <james.morse@arm.com>,
	James Morris <jmorris@namei.org>, Sasha Levin <sashal@kernel.org>,
	kexec mailing list <kexec@lists.infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Marc Zyngier <maz@kernel.org>,
	Vladimir Murzin <vladimir.murzin@arm.com>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	linux-mm <linux-mm@kvack.org>,
	Mark Rutland <mark.rutland@arm.com>,
	steve.capper@arm.com, rfontana@redhat.com,
	Thomas Gleixner <tglx@linutronix.de>,
	Selin Dag <selindag@gmail.com>,
	Tyler Hicks <tyhicks@linux.microsoft.com>
Subject: Re: [PATCH v11 0/6] arm64: MMU enabled kexec relocation
Date: Thu, 4 Feb 2021 10:23:03 -0500
Message-ID: <CA+CK2bD9NQ=waaoi4=Ub9o9uKWaAs_ZO8LEwxB2XFhacRhnbOQ@mail.gmail.com>
In-Reply-To: <871rdwocwh.fsf@x220.int.ebiederm.org>

> > I understand that having an extra set of page tables could potentially
> > waste memory, especially if VAs are sparse, but in this case we use
> > page tables exclusively for contiguous VA space (copy [src, src +
> > size]). Therefore, the extra memory usage is tiny. The ratio for
> > kernels with  4K page_size is (size of relocated memory) / 512.  A
> > normal initrd + kernel is usually under 64M, an extra space which
> > means ~128K for the page table. Even with a huge relocation, where
> > initrd is ~512M the extra memory usage in the worst case is just ~1M.
> > I really doubt we will have any problem from users because of such
> > small overhead in comparison to the total kexec-load size.

Hi Eric,

>
> Foolish question.

Thank you for your e-mail; you gave some interesting insights.

>
> Does arm64 have something like 2M pages that it can use for the
> linear map?

Yes, with 4K pages arm64 also has 2M pages, but arm64 can also use 16K
and 64K base pages, and the second-level pages are bigger there.

> On x86_64 we always generate page tables, because they are necessary to
> be in 64bit mode.  As I recall on x86_64 we always use 2M pages which
> means for each 4K of page tables we map 1GiB of memory.   Which is very
> tiny.
>
> If you do as well as x86_64 for arm64 I suspect that will be good enough
> for people to not claim regression.
>
> Would a variation on the x86_64 implementation that allocates page
> tables work for arm64?
...
>
> As long as the page table provided is a linear mapping of physical
> memory (aka it looks like paging is disabled).  The the code that
> relocates memory should be pretty much the same.
>
> My experience with other architectures suggests only a couple of
> instructions need to be different to deal with a MMU being enabled.

I think what you are proposing is similar to what James proposed. Yes,
with a linear map, relocation should be pretty much the same as it is
with the MMU disabled.

A linear map still uses memory, because its page tables must live
outside the destination addresses of the next kernel's segments.
Therefore, we must allocate a page table for the linear map as well.
It might be a little smaller, but in reality the difference is small
with 4K pages, and insignificant with 64K pages. The benefit of my
approach is that the assembly copy loop is simpler and allows hardware
prefetching to work.
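
To make the overhead figures quoted above easy to check, here is a rough
sketch of the arithmetic. It only counts last-level tables and assumes a
4K table holding 512 eight-byte entries; the helper name table_bytes() is
made up for this example and is not part of the series.

```c
#include <stdint.h>

#define TABLE_SIZE      4096ULL   /* one page-table page (4K granule) */
#define PTRS_PER_TABLE  512ULL    /* 4096 / 8-byte descriptors */

/*
 * Hypothetical helper: bytes of last-level tables needed to map `size`
 * bytes of contiguous VA when each entry maps `granule` bytes
 * (4K for page mappings, 2M for block mappings). Upper-level tables
 * are ignored because they are negligible for a contiguous range.
 */
static uint64_t table_bytes(uint64_t size, uint64_t granule)
{
	uint64_t entries = (size + granule - 1) / granule;
	uint64_t tables  = (entries + PTRS_PER_TABLE - 1) / PTRS_PER_TABLE;

	return tables * TABLE_SIZE;
}
```

With these assumptions, table_bytes(64M, 4K) comes to 128K and
table_bytes(512M, 4K) to 1M, matching the size / 512 ratio above, while
table_bytes(1G, 2M) is a single 4K table, which is the x86_64 figure
Eric mentions.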

The regular relocation loop works like this:

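/* head is expected to be an IND_INDIRECTION entry, which sets ptr */
for (entry = head; !(entry & IND_DONE); entry = *ptr++) {
        addr = __va(entry & PAGE_MASK);

        switch (entry & IND_FLAGS) {
        case IND_DESTINATION:
                /* start of a new destination range */
                dest = addr;
                break;
        case IND_INDIRECTION:
                /* continue with the next page of entries */
                ptr = addr;
                break;
        case IND_SOURCE:
                /* copy one page and advance the destination */
                copy_page(dest, addr);
                dest += PAGE_SIZE;
                break;
        }
}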

The entry for the next relocation page always has to be fetched first,
and therefore prefetching cannot help with the actual loop.
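
For anyone who wants to step through the entry-list walk outside the
kernel, here is a minimal user-space sketch. It is an illustration only:
the DEMO_* names, the static buffers and indirect_demo() are invented
for this example, memcpy() stands in for copy_page(), and pointers are
used directly where the kernel would apply __va() to physical addresses.
The flag values are the ones I believe include/linux/kexec.h uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096
#define DEMO_PAGE_MASK (~(uintptr_t)(DEMO_PAGE_SIZE - 1))

#define IND_DESTINATION 0x1
#define IND_INDIRECTION 0x2
#define IND_DONE        0x4
#define IND_SOURCE      0x8
#define IND_FLAGS (IND_DESTINATION | IND_INDIRECTION | IND_DONE | IND_SOURCE)

/* Page-aligned buffers so the low bits of each address are free for flags. */
static _Alignas(DEMO_PAGE_SIZE) unsigned char src_pages[2][DEMO_PAGE_SIZE];
static _Alignas(DEMO_PAGE_SIZE) unsigned char dst_pages[2][DEMO_PAGE_SIZE];
static _Alignas(DEMO_PAGE_SIZE) uintptr_t ind_page[8];

/* Walk the entry list starting at head; returns the number of pages copied. */
static size_t relocate_indirect(uintptr_t head)
{
	uintptr_t entry;
	uintptr_t *ptr = NULL;       /* set by the first IND_INDIRECTION entry */
	unsigned char *dest = NULL;  /* set by the first IND_DESTINATION entry */
	size_t copied = 0;

	for (entry = head; !(entry & IND_DONE); entry = *ptr++) {
		void *addr = (void *)(entry & DEMO_PAGE_MASK);

		switch (entry & IND_FLAGS) {
		case IND_DESTINATION:
			dest = addr;
			break;
		case IND_INDIRECTION:
			ptr = addr;
			break;
		case IND_SOURCE:
			memcpy(dest, addr, DEMO_PAGE_SIZE);
			dest += DEMO_PAGE_SIZE;
			copied++;
			break;
		}
	}
	return copied;
}

/* Build a tiny list (one destination, two sources) and check the result. */
static int indirect_demo(void)
{
	memset(src_pages[0], 0xAA, DEMO_PAGE_SIZE);
	memset(src_pages[1], 0xBB, DEMO_PAGE_SIZE);
	memset(dst_pages, 0, sizeof(dst_pages));

	ind_page[0] = (uintptr_t)dst_pages[0] | IND_DESTINATION;
	ind_page[1] = (uintptr_t)src_pages[0] | IND_SOURCE;
	ind_page[2] = (uintptr_t)src_pages[1] | IND_SOURCE;
	ind_page[3] = IND_DONE;

	size_t n = relocate_indirect((uintptr_t)ind_page | IND_INDIRECTION);

	return n == 2 && memcmp(dst_pages, src_pages, sizeof(dst_pages)) == 0;
}
```

Note that every iteration has to load the next entry before it knows
where the next source page is, which is the data dependency described
above.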

In comparison, the loop that I am proposing is like this:

for (addr = head; addr < end; addr += PAGE_SIZE, dest += PAGE_SIZE)
        copy_page(dest, addr);

Here is assembly code for my loop:

1: copy_page x1, x2, x3, x4, x5, x6, x7, x8, x9, x10
    sub x11, x11, #PAGE_SIZE
    cbnz x11, 1b
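
The same linear loop can be tried in user space with a short sketch.
Again this is only an illustration under assumptions: LIN_PAGE_SIZE,
relocate_linear() and linear_demo() are names made up here, memcpy()
stands in for copy_page(), and size is assumed to be a multiple of the
page size.

```c
#include <stddef.h>
#include <string.h>

#define LIN_PAGE_SIZE 4096

/*
 * Copy `size` bytes page by page from src to dest. Both ranges are
 * contiguous, so the next source address is known without loading
 * anything from memory first. Returns the number of pages copied.
 */
static size_t relocate_linear(unsigned char *dest, const unsigned char *src,
			      size_t size)
{
	const unsigned char *end = src + size;
	size_t copied = 0;

	for (; src < end; src += LIN_PAGE_SIZE, dest += LIN_PAGE_SIZE) {
		memcpy(dest, src, LIN_PAGE_SIZE);
		copied++;
	}
	return copied;
}

/* Copy three pages with a known pattern and verify the destination. */
static int linear_demo(void)
{
	static unsigned char src[3 * LIN_PAGE_SIZE];
	static unsigned char dst[3 * LIN_PAGE_SIZE];
	size_t i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i / LIN_PAGE_SIZE + 1);
	memset(dst, 0, sizeof(dst));

	return relocate_linear(dst, src, sizeof(src)) == 3 &&
	       memcmp(dst, src, sizeof(src)) == 0;
}
```

Because the addresses advance by a fixed stride, the access pattern is
the kind a hardware prefetcher can follow.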

That said, if you and James agree that a linear map is the way
forward, I am OK with that as well; it is still much better than
having no caching at all.

Thank you,
Pasha


Thread overview: 12+ messages
2021-01-27 17:27 [PATCH v11 0/6] arm64: MMU enabled kexec relocation Pavel Tatashin
2021-01-27 17:27 ` [PATCH v11 1/6] arm64: kexec: add expandable argument to relocation function Pavel Tatashin
2021-01-27 17:27 ` [PATCH v11 2/6] arm64: kexec: use ld script for " Pavel Tatashin
2021-01-27 17:27 ` [PATCH v11 3/6] arm64: kexec: kexec may require EL2 vectors Pavel Tatashin
2021-01-27 17:27 ` [PATCH v11 4/6] arm64: kexec: configure trans_pgd page table for kexec Pavel Tatashin
2021-01-27 17:27 ` [PATCH v11 5/6] arm64: kexec: enable MMU during kexec relocation Pavel Tatashin
2021-01-27 17:27 ` [PATCH v11 6/6] arm64: kexec: remove head from relocation argument Pavel Tatashin
2021-02-01 18:32 ` [PATCH v11 0/6] arm64: MMU enabled kexec relocation James Morse
2021-02-01 19:59   ` Pavel Tatashin
2021-02-04  1:11     ` Eric W. Biederman
2021-02-04 15:23       ` Pavel Tatashin [this message]
2021-02-04 22:02         ` Eric W. Biederman
