From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
James Hogan <jhogan@kernel.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Michal Hocko <mhocko@kernel.org>,
linux-mm@kvack.org, Dave Hansen <dave.hansen@intel.com>,
Paul Mackerras <paulus@samba.org>,
sparclinux@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
linux-s390@vger.kernel.org, x86@kernel.org,
Russell King - ARM Linux <linux@armlinux.org.uk>,
Matthew Wilcox <willy@infradead.org>,
Steven Price <Steven.Price@arm.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
linux-arm-kernel@lists.infradead.org,
linux-snps-arc@lists.infradead.org,
Kees Cook <keescook@chromium.org>,
Masahiro Yamada <yamada.masahiro@socionext.com>,
Mark Brown <broonie@kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Sri Krishna chowdary <schowdary@nvidia.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-mips@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
linux-kernel@vger.kernel.org, Paul Burton <paul.burton@mips.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Vineet Gupta <vgupta@synopsys.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
linuxppc-dev@lists.ozlabs.org,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers
Date: Fri, 6 Sep 2019 21:03:46 +0200 [thread overview]
Message-ID: <20190906210346.5ecbff01@thinkpad> (raw)
In-Reply-To: <3c609e33-afbb-ffaf-481a-6d225a06d1d0@arm.com>
On Fri, 6 Sep 2019 11:58:59 +0530
Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> On 09/05/2019 10:36 PM, Gerald Schaefer wrote:
> > On Thu, 5 Sep 2019 14:48:14 +0530
> > Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> >
> >>> [...]
> >>>> +
> >>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> >>>> +static void pud_clear_tests(pud_t *pudp)
> >>>> +{
> >>>> + memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
> >>>> + pud_clear(pudp);
> >>>> + WARN_ON(!pud_none(READ_ONCE(*pudp)));
> >>>> +}
> >>>
> >>> For pgd/p4d/pud_clear(), we only clear if the page table level is present
> >>> and not folded. The memset() here overwrites the table type bits, so
> >>> pud_clear() will not clear anything on s390 and the pud_none() check will
> >>> fail.
> >>> Would it be possible to OR a (larger) random value into the table, so that
> >>> the lower 12 bits would be preserved?
> >>
> >> So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
> >> it should OR a large random value preserving lower 12 bits. Hmm, this should
> >> still do the trick for other platforms, they just need non zero value. So on
> >> s390, the lower 12 bits on the page table entry already has valid value while
> >> entering this function which would make sure that pud_clear() really does
> >> clear the entry ?
> >
> > Yes, in theory the table entry on s390 would have the type set in the last
> > 4 bits, so preserving those would be enough. If it does not conflict with
> > others, I would still suggest preserving all 12 bits since those would contain
> > arch-specific flags in general, just to be sure. For s390, the pte/pmd tests
> > would also work with the memset, but for consistency I think the same logic
> > should be used in all pxd_clear_tests.
>
> Makes sense but..
>
> There is a small challenge with this. Modifying individual bits on a given
> page table entry from generic code like this test case is bit tricky. That
> is because there are not enough helpers to create entries with an absolute
> value. This would have been easier if all the platforms provided functions
> like __pxx() which is not the case now. Otherwise something like this should
> have worked.
>
>
> pud_t pud = READ_ONCE(*pudp);
> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
> WRITE_ONCE(*pudp, pud);
>
> But __pud() will fail to build in many platforms.
Hmm, I simply used this on my system to make pud_clear_tests() work, not
sure if it works on all archs:
pud_val(*pudp) |= RANDOM_NZVALUE;
>
> The other alternative will be to make sure memset() happens on all other
> bits except the lower 12 bits which will depend on endianness. If s390
> has a fixed endianness, we can still use either of them which will hold
> good for others as well.
>
> memset(pudp, RANDOM_NZVALUE, sizeof(pud_t) - 3);
>
> OR
>
> memset(pudp + 3, RANDOM_NZVALUE, sizeof(pud_t) - 3);
>
> >
> > However, there is another issue on s390 which will make this only work
> > for pud_clear_tests(), and not for the p4d/pgd_tests. The problem is that
> > mm_alloc() will only give you a 3-level page table initially on s390.
> > This means that pudp == p4dp == pgdp, and so the p4d/pgd_tests will
> > both see the pud level (of course this also affects other tests).
>
> Got it.
>
> >
> > Not sure yet how to fix this, i.e. how to initialize/update the page table
> > to 5 levels. We can handle 5 level page tables, and it would be good if
> > all levels could be tested, but using mm_alloc() to establish the page
> > tables might not work on s390. One option could be to provide an arch-hook
> > or weak function to allocate/initialize the mm.
>
> Sure, got it. Though I plan to do add some arch specific tests or init sequence
> like the above later on but for now the idea is to get the smallest possible set
> of test cases which builds and runs on all platforms without requiring any arch
> specific hooks or special casing (#ifdef) to be agreed upon broadly and accepted.
>
> Do you think this is absolutely necessary on s390 for the very first set of test
> cases or we can add this later on as an improvement ?
It can be added later, no problem. I did not expect this to work flawlessly
on s390 right from the start anyway, with all our peculiarities, so don't
let this hinder you. I might come up with an add-on patch later.
Actually, using get_unmapped_area() as suggested by Kirill could also
solve this issue. We do create a new mm with 3-level page tables on s390,
and the dynamic upgrade to 4 or 5 levels is then triggered exactly by
arch_get_unmapped_area(), depending on the addr. But I currently don't
see how / where arch_get_unmapped_area() is set up for such a dummy mm
created by mm_alloc().
Regards,
Gerald
next prev parent reply other threads:[~2019-09-06 19:06 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-03 8:01 [PATCH 0/1] mm/debug: Add tests for architecture exported page table helpers Anshuman Khandual
2019-09-03 8:01 ` [PATCH 1/1] mm/pgtable/debug: Add test validating architecture " Anshuman Khandual
2019-09-03 11:13 ` kbuild test robot
2019-09-04 6:14 ` Anshuman Khandual
2019-09-04 14:19 ` Kirill A. Shutemov
2019-09-05 8:18 ` Anshuman Khandual
2019-09-05 8:59 ` Kirill A. Shutemov
2019-09-06 7:03 ` Anshuman Khandual
2019-09-04 20:16 ` Gerald Schaefer
2019-09-05 9:18 ` Anshuman Khandual
2019-09-05 17:06 ` Gerald Schaefer
2019-09-06 6:28 ` Anshuman Khandual
2019-09-06 19:03 ` Gerald Schaefer [this message]
2019-09-09 6:26 ` Anshuman Khandual
2019-09-09 15:13 ` Kirill A. Shutemov
2019-09-10 3:56 ` Anshuman Khandual
2019-09-10 4:45 ` Christophe Leroy
2019-09-10 5:43 ` Anshuman Khandual
2019-09-09 16:51 ` Gerald Schaefer
2019-09-04 23:14 ` Dave Hansen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190906210346.5ecbff01@thinkpad \
--to=gerald.schaefer@de.ibm.com \
--cc=Steven.Price@arm.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=ard.biesheuvel@linaro.org \
--cc=broonie@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@intel.com \
--cc=davem@davemloft.net \
--cc=gregkh@linuxfoundation.org \
--cc=heiko.carstens@de.ibm.com \
--cc=jgg@ziepe.ca \
--cc=jhogan@kernel.org \
--cc=keescook@chromium.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-ia64@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-s390@vger.kernel.org \
--cc=linux-sh@vger.kernel.org \
--cc=linux-snps-arc@lists.infradead.org \
--cc=linux@armlinux.org.uk \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mark.rutland@arm.com \
--cc=mhocko@kernel.org \
--cc=paul.burton@mips.com \
--cc=paulus@samba.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=peterz@infradead.org \
--cc=ralf@linux-mips.org \
--cc=rppt@linux.vnet.ibm.com \
--cc=schowdary@nvidia.com \
--cc=schwidefsky@de.ibm.com \
--cc=sparclinux@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=vbabka@suse.cz \
--cc=vgupta@synopsys.com \
--cc=willy@infradead.org \
--cc=x86@kernel.org \
--cc=yamada.masahiro@socionext.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).