From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A0CCC35240 for ; Tue, 28 Jan 2020 04:58:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BED8A24682 for ; Tue, 28 Jan 2020 04:58:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BED8A24682 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 15E776B0007; Mon, 27 Jan 2020 23:58:06 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0E9866B0008; Mon, 27 Jan 2020 23:58:06 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ECA856B000A; Mon, 27 Jan 2020 23:58:05 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0190.hostedemail.com [216.40.44.190]) by kanga.kvack.org (Postfix) with ESMTP id D146E6B0007 for ; Mon, 27 Jan 2020 23:58:05 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id 6C2908248047 for ; Tue, 28 Jan 2020 04:58:05 +0000 (UTC) X-FDA: 76425836130.11.camp29_467e9b6204305 X-HE-Tag: camp29_467e9b6204305 X-Filterd-Recvd-Size: 11621 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf14.hostedemail.com (Postfix) with ESMTP for ; Tue, 28 Jan 2020 04:58:04 +0000 (UTC) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2B96C31B; Mon, 27 Jan 2020 20:58:03 -0800 (PST) Received: from [10.163.1.151] (unknown [10.163.1.151]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 97BA23F68E; Mon, 27 Jan 2020 20:57:49 -0800 (PST) Subject: Re: [PATCH V12] mm/debug: Add tests validating architecture page table helpers To: Qian Cai Cc: Linux-MM , Andrew Morton , Vlastimil Babka , Greg Kroah-Hartman , Thomas Gleixner , Mike Rapoport , Jason Gunthorpe , Dan Williams , Peter Zijlstra , Michal Hocko , Mark Rutland , Mark Brown , Steven Price , Ard Biesheuvel , Masahiro Yamada , Kees Cook , Tetsuo Handa , Matthew Wilcox , Sri Krishna chowdary , Dave Hansen , Russell King - ARM Linux , Michael Ellerman , Paul Mackerras , Martin Schwidefsky , Heiko Carstens , "David S. Miller" , Vineet Gupta , James Hogan , Paul Burton , Ralf Baechle , "Kirill A . Shutemov" , Gerald Schaefer , Christophe Leroy , Ingo Molnar , linux-snps-arc@lists.infradead.org, linux-mips@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org References: <1580174873-18117-1-git-send-email-anshuman.khandual@arm.com> <14882A91-17DE-4ABD-ABF2-08E7CCEDF660@lca.pw> <214c0d53-eb34-9b0c-2e4e-1aa005146331@arm.com> <016A776F-EFD9-4D2B-A3A9-788008617D95@lca.pw> From: Anshuman Khandual Message-ID: <012158b7-a40e-050f-cd1b-d6ce7faf042f@arm.com> Date: Tue, 28 Jan 2020 10:27:46 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <016A776F-EFD9-4D2B-A3A9-788008617D95@lca.pw> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 01/28/2020 09:03 AM, Qian Cai wrote: >=20 >=20 >> On Jan 27, 2020, at 10:06 PM, Anshuman Khandual wrote: >> >> >> >> On 01/28/2020 07:41 AM, Qian Cai wrote: >>> >>> >>>> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual wrote: >>>> >>>> This adds tests which will validate architecture page table helpers = and >>>> other accessors in their compliance with expected generic MM semanti= cs. >>>> This will help various architectures in validating changes to existi= ng >>>> page table helpers or addition of new ones. >>>> >>>> This test covers basic page table entry transformations including bu= t not >>>> limited to old, young, dirty, clean, write, write protect etc at var= ious >>>> level along with populating intermediate entries with next page tabl= e page >>>> and validating them. >>>> >>>> Test page table pages are allocated from system memory with required= size >>>> and alignments. The mapped pfns at page table levels are derived fro= m a >>>> real pfn representing a valid kernel text symbol. This test gets cal= led >>>> right after page_alloc_init_late(). >>>> >>>> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected alo= ng with >>>> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also n= eed to >>>> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to = x86 and >>>> arm64. Going forward, other architectures too can enable this after = fixing >>>> build or runtime problems (if any) with their page table helpers. >> >> Hello Qian, >> >>> >>> What=E2=80=99s the value of this block of new code? It only supports = x86 and arm64 >>> which are supposed to be good now. >> >> We have been over the usefulness of this code many times before as the= patch is >> already in it's V12. Currently it is enabled on arm64, x86 (except PAE= ), arc and >> ppc32. There are build time or runtime problems with other archs which= prevent >=20 > I am not sure if I care too much about arc and ppc32 which are pretty m= uch legacy > platforms. Okay but FWIW the maintainers for all these enabled platforms cared for t= his test at the least and really helped in shaping the test to it's current state.= Besides I am still failing to understand your point here about evaluating particu= lar feature's usefulness based on it's support on relative and perceived importance of = some platforms compared to others. Again the idea is to integrate all platforms eventual= ly but we had discovered build and runtime issues which needs to be resolved at platfor= m level first. Unless I am mistaken, debug feature like this which is putting down a fra= mework while also benefiting some initial platforms to start with, will be a potential= candidate for eventual inclusion in the mainline. Otherwise, please point to any other = agreed upon community criteria for debug feature's mainline inclusion which I will tr= y to adhere. I wonder if all other similar debug features from the past ever met 'the = all inclusive at the beginning' criteria which you are trying to propose here. This tes= t also adds a feature file, enlisting all supported archs as suggested by Ingo for the = exact same reason. This is not the first time, a feature is listing out archs which = are supported and archs which are not. >=20 >> enablement of this test (for the moment) but then the goal is to integ= rate all >> of them going forward. The test not only validates platform's adherenc= e to the >> expected semantics from generic MM but also helps in keeping it that w= ay during >> code changes in future as well. >=20 > Another option maybe to get some decent arches on board first before me= rging this > thing, so it have more changes to catch regressions for developers who = might run this.=20 >=20 >> >>> Did those tests ever find any regression or this is almost only usefu= l for new >> >> The test has already found problems with s390 page table helpers. >=20 > Hmm, that is pretty weak where s390 is not even official supported with= this version. And there were valid reasons why s390 could not be enabled just yet as ex= plained by s390 folks during our previous discussions. I just pointed out an example wher= e this test was useful as you had asked previously. Not being official supported in this = version does not take away the fact the it was indeed useful for that platform in disc= overing a bug. >=20 >> >>> architectures which only happened once in a few years? >> >> Again, not only it validates what exist today but its also a tool to m= ake >> sure that all platforms continue adhere to a common agreed upon semant= ics >> as reflected through the tests here. >> >>> The worry if not many people will use this config and code those that= much in >> >> Debug features or tests in the kernel are used when required. These ar= e never or >> should not be enabled by default. AFAICT this is true even for entire = DEBUG_VM >> packaged tests. Do you have any particular data or precedence to subst= antiate >> the fact that this test will be used any less often than the other sim= ilar ones >> in the tree ? I can only speak for arm64 platform but the very idea fo= r this >> test came from Catalin when we were trying to understand the semantics= for THP >> helpers while enabling THP migration without split. Apart from going o= ver the >> commit messages from the past, there were no other way to figure out h= ow any >> particular page table helper is suppose to change given page table ent= ry. This >> test tries to formalize those semantics. >=20 > I am thinking about how we made so many mistakes before by merging too = many of > those debugging options that many of them have been broken for many rel= eases > proving that nobody actually used them regularly. We don=E2=80=99t need= to repeat the same Again will ask for some data to substantiate these claims. Though I am no= t really sure but believe that there are integration test frameworks out there whi= ch regularly validates each of these code path on multiple platforms. One such automat= ion found that V11 of the test was broken on X86 PAE platform which I fixed. Noneth= eless, I can speak only for arm64 platform and we intend to use this test to validate = arm64 exported page table helpers. Citing unsubstantiated past examples should not reall= y block these enabled platforms (arm64 at the very least) from getting this debug featu= re which has already demonstrated it's usefulness during arm64 THP migration developme= nt and on s390 platforms as well. > mistake again. I am actually thinking about to remove things like page= _poisoning often > which is almost are never found any bug recently and only cause pains w= hen interacting > with other new features that almost nobody will test them together to b= egin with. > We even have some SLUB debugging code sit there for almost 15 years tha= t almost > nobody used it and maintainers refused to remove it. Unlike those, the proposed test here is isolated as a stand alone test an= d stays clear off from any other code path. I have not been involved in or aware of the= usefulness of existing MM debug features and hence will just leave them upto the judgme= nt of the maintainers whether to keep or discard them. >=20 >> >>> the future because it is inefficient to find bugs, it will simply be = rotten >> Could you be more specific here ? What parts of the test are inefficie= nt ? I >> am happy to improve upon the test. Do let me know you if you have sugg= estions. >> >>> like a few other debugging options out there we have in the mainline = that >> will be a pain to remove later on. >>> >> >> Even though I am not agreeing to your assessment about the usefulness = of the >> test without any substantial data backing up the claims, the test case= in >> itself is very much compartmentalized, staying clear from generic MM a= nd >> debug_vm_pgtable() is only function executing the test which is gettin= g >> called from kernel_init_freeable() path. >=20 > I am thinking exactly the other way around. You are proposing to merge = this tests > without proving how useful it will be able to find regressions for futu= re developers > to make sure it will actually get used. As I had mentioned before, the test attempts to formalize page table help= er semantics as expected from generic MM code paths and intend to catch deviations whe= n enabled on a given platform. How else should we test semantics errors otherwise ? Th= ere are past examples of usefulness for this procedure on arm64 and on s390. I am wond= ering how else to prove the usefulness of a debug feature if these references are n= ot enough. >=20