All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org" 
	<linux-snps-arc@lists.infradead.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-um@lists.infradead.org" <linux-um@lists.infradead.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"openrisc@lists.librecores.org" <openrisc@lists.librecores.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-um@lists.infradead.org" <linux-um@lists.infradead.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"openrisc@lists.librecores.org" <openrisc@lists.librecores.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-um@lists.infradead.org" <linux-um@lists.infradead.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"openrisc@lists.librecores.org" <openrisc@lists.librecores.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"Russell King \(Oracle\)" <linux@armlinux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-um@lists.infradead.org" <linux-um@lists.infradead.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"openrisc@lists.librecores.org" <openrisc@lists.librecores.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-um@lists.infradead.org" <linux-um@lists.infradead.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"openrisc@lists.librecores.org" <openrisc@lists.librecores.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: openrisc@lists.librecores.org
Subject: [OpenRISC] [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual at arm.com/

Will rework the series in two different phases as mentioned on the other thread.

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"linux-um@lists.infradead.org" <linux-um@lists.infradead.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"openrisc@lists.librecores.org" <openrisc@lists.librecores.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 09 Mar 2022 11:45:28 +0000	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-snps-arc@lists.infradead.org" <linux-snps-arc@lists.in>
Subject: Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Date: Wed, 9 Mar 2022 17:03:28 +0530	[thread overview]
Message-ID: <70a99f28-ea69-58e3-f919-85551943c0a3@arm.com> (raw)
In-Reply-To: <CAMuHMdUPw_yBMe9XxQHHuf40it3V5mGuOA10KnMpXt124Ay8Tw@mail.gmail.com>



On 3/2/22 16:44, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
>>> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>>> <anshuman.khandual@arm.com> wrote:
>>>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>>>>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>>>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>>>>>>>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>>>>>>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>>>>>>>>>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>>>>>>>>>> This defines and exports a platform specific custom vm_get_page_prot() via
>>>>>>>>>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
>>>>>>>>>>> macros can be dropped which are no longer needed.
>>>>>>>>>>
>>>>>>>>>> What I would really like to know is why having to run _code_ to work out
>>>>>>>>>> what the page protections need to be is better than looking it up in a
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> Not only is this more expensive in terms of CPU cycles, it also brings
>>>>>>>>>> additional code size with it.
>>>>>>>>>>
>>>>>>>>>> I'm struggling to see what the benefit is.
>>>>>>>>>
>>>>>>>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>>>>>>>> protection values. Although that is being run in the core MM and from a
>>>>>>>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>>>>>>>> Looking it up in a table (and applying more constructs there after) is
>>>>>>>>> not much different than a clean switch case statement in terms of CPU
>>>>>>>>> usage. So this is not more expensive in terms of CPU cycles.
>>>>>>>>
>>>>>>>> I disagree.
>>>>>>>
>>>>>>> So do I.
>>>>>>>
>>>>>>>>
>>>>>>>> However, let's base this disagreement on some evidence. Here is the
>>>>>>>> present 32-bit ARM implementation:
>>>>>>>>
>>>>>>>> 00000048 <vm_get_page_prot>:
>>>>>>>>         48:       e200000f        and     r0, r0, #15
>>>>>>>>         4c:       e3003000        movw    r3, #0
>>>>>>>>                           4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>>>>>>>>         50:       e3403000        movt    r3, #0
>>>>>>>>                           50: R_ARM_MOVT_ABS      .LANCHOR1
>>>>>>>>         54:       e7930100        ldr     r0, [r3, r0, lsl #2]
>>>>>>>>         58:       e12fff1e        bx      lr
>>>>>>>>
>>>>>>>> That is five instructions long.
>>>>>>>
>>>>>>> On ppc32 I get:
>>>>>>>
>>>>>>> 00000094 <vm_get_page_prot>:
>>>>>>>         94: 3d 20 00 00     lis     r9,0
>>>>>>>                     96: R_PPC_ADDR16_HA     .data..ro_after_init
>>>>>>>         98: 54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>         9c: 39 29 00 00     addi    r9,r9,0
>>>>>>>                     9e: R_PPC_ADDR16_LO     .data..ro_after_init
>>>>>>>         a0: 7d 29 20 2e     lwzx    r9,r9,r4
>>>>>>>         a4: 91 23 00 00     stw     r9,0(r3)
>>>>>>>         a8: 4e 80 00 20     blr
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Please show that your new implementation is not more expensive on
>>>>>>>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>>>>>>>> the disassembly.
>>>>>>>
>>>>>>> With your series I get:
>>>>>>>
>>>>>>> 00000000 <vm_get_page_prot>:
>>>>>>>      0:     3d 20 00 00     lis     r9,0
>>>>>>>                     2: R_PPC_ADDR16_HA      .rodata
>>>>>>>      4:     39 29 00 00     addi    r9,r9,0
>>>>>>>                     6: R_PPC_ADDR16_LO      .rodata
>>>>>>>      8:     54 84 16 ba     rlwinm  r4,r4,2,26,29
>>>>>>>      c:     7d 49 20 2e     lwzx    r10,r9,r4
>>>>>>>     10:     7d 4a 4a 14     add     r10,r10,r9
>>>>>>>     14:     7d 49 03 a6     mtctr   r10
>>>>>>>     18:     4e 80 04 20     bctr
>>>>>>>     1c:     39 20 03 15     li      r9,789
>>>>>>>     20:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     24:     4e 80 00 20     blr
>>>>>>>     28:     39 20 01 15     li      r9,277
>>>>>>>     2c:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     30:     4e 80 00 20     blr
>>>>>>>     34:     39 20 07 15     li      r9,1813
>>>>>>>     38:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     3c:     4e 80 00 20     blr
>>>>>>>     40:     39 20 05 15     li      r9,1301
>>>>>>>     44:     91 23 00 00     stw     r9,0(r3)
>>>>>>>     48:     4e 80 00 20     blr
>>>>>>>     4c:     39 20 01 11     li      r9,273
>>>>>>>     50:     4b ff ff d0     b       20 <vm_get_page_prot+0x20>
>>>>>>>
>>>>>>>
>>>>>>> That is definitely more expensive, it implements a table of branches.
>>>>>>
>>>>>> Okay, will split out the PPC32 implementation that retains existing
>>>>>> table look up method. Also planning to keep that inside same file
>>>>>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>>>
>>>>> My point was not to get something specific for PPC32, but to amplify on
>>>>> Russell's objection.
>>>>>
>>>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>>>> your change is good for any other architecture ?
>>>>>
>>>>> I checked PPC64 and there is exactly the same drawback. With the current
>>>>> implementation it is a small function performing table read then a few
>>>>> adjustment. After your change it is a bigger function implementing a
>>>>> table of branches.
>>>>
>>>> I am wondering if this would not be the case for any other switch case
>>>> statement on the platform ? Is there something specific/different just
>>>> on vm_get_page_prot() implementation ? Are you suggesting that switch
>>>> case statements should just be avoided instead ?
>>>>
>>>>>
>>>>> So, as requested by Russell, could you look at the disassembly for other
>>>>> architectures and show us that ARM and POWERPC are the only ones for
>>>>> which your change is not optimal ?
>>>>
>>>> But the primary purpose of this series is not to guarantee optimized
>>>> code on platform by platform basis, while migrating from a table based
>>>> look up method into a switch case statement.
>>>>
>>>> But instead, the purposes is to remove current levels of unnecessary
>>>> abstraction while converting a vm_flags access combination into page
>>>> protection. The switch case statement for platform implementation of
>>>> vm_get_page_prot() just seemed logical enough. Christoph's original
>>>> suggestion patch for x86 had the same implementation as well.
>>>>
>>>> But if the table look up is still better/preferred method on certain
>>>> platforms like arm or ppc32, will be happy to preserve that.
>>>
>>> I doubt the switch() variant would give better code on any platform.
>>>
>>> What about using tables everywhere, using designated initializers
>>> to improve readability?
>>
>> Designated initializers ? Could you please be more specific. A table look
>> up on arm platform would be something like this and arm_protection_map[]
>> needs to be updated with user_pgprot like before. Just wondering how a
>> designated initializer will help here.
> 
> It's more readable than the original:
> 
>     pgprot_t protection_map[16] __ro_after_init = {
>             __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
>             __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>     };
> 
>>
>> static pgprot_t arm_protection_map[16] __ro_after_init = {
>>        [VM_NONE]                                       = __PAGE_NONE,
>>        [VM_READ]                                       = __PAGE_READONLY,
>>        [VM_WRITE]                                      = __PAGE_COPY,
>>        [VM_WRITE | VM_READ]                            = __PAGE_COPY,
>>        [VM_EXEC]                                       = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_READ]                             = __PAGE_READONLY_EXEC,
>>        [VM_EXEC | VM_WRITE]                            = __PAGE_COPY_EXEC,
>>        [VM_EXEC | VM_WRITE | VM_READ]                  = __PAGE_COPY_EXEC,
>>        [VM_SHARED]                                     = __PAGE_NONE,
>>        [VM_SHARED | VM_READ]                           = __PAGE_READONLY,
>>        [VM_SHARED | VM_WRITE]                          = __PAGE_SHARED,
>>        [VM_SHARED | VM_WRITE | VM_READ]                = __PAGE_SHARED,
>>        [VM_SHARED | VM_EXEC]                           = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_READ]                 = __PAGE_READONLY_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE]                = __PAGE_SHARED_EXEC,
>>        [VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]      = __PAGE_SHARED_EXEC
>> };
> 
> Yeah, like that.
> 
> Seems like you already made such a conversion in
> https://lore.kernel.org/all/1645425519-9034-3-git-send-email-anshuman.khandual@arm.com/

Will rework the series in two different phases as mentioned on the other thread.

  reply	other threads:[~2022-03-09 11:33 UTC|newest]

Thread overview: 355+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-28 10:47 [PATCH V3 00/30] mm/mmap: Drop protection_map[] and platform's __SXXX/__PXXX requirements Anshuman Khandual
2022-02-28 10:59 ` Anshuman Khandual
2022-02-28 10:47 ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47 ` Anshuman Khandual
2022-02-28 10:47 ` Anshuman Khandual
2022-02-28 10:47 ` Anshuman Khandual
2022-02-28 10:47 ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 01/30] mm/debug_vm_pgtable: Drop protection_map[] usage Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 02/30] mm/mmap: Clarify protection_map[] indices Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 03/30] mm/mmap: Add new config ARCH_HAS_VM_GET_PAGE_PROT Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 04/30] powerpc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-03-02  5:23   ` Michael Ellerman
2022-03-02  5:23     ` Michael Ellerman
2022-03-02  5:23     ` [OpenRISC] " Michael Ellerman
2022-03-02  5:23     ` Michael Ellerman
2022-03-02  5:23     ` Michael Ellerman
2022-03-02  5:23     ` Michael Ellerman
2022-03-02  5:23     ` Michael Ellerman
2022-02-28 10:47 ` [PATCH V3 05/30] arm64/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-03-03 15:28   ` Catalin Marinas
2022-03-03 15:28     ` Catalin Marinas
2022-03-03 15:28     ` [OpenRISC] " Catalin Marinas
2022-03-03 15:28     ` Catalin Marinas
2022-03-03 15:28     ` Catalin Marinas
2022-03-03 15:28     ` Catalin Marinas
2022-03-03 15:28     ` Catalin Marinas
2022-03-09 11:31     ` Anshuman Khandual
2022-03-09 11:43       ` Anshuman Khandual
2022-03-09 11:31       ` [OpenRISC] " Anshuman Khandual
2022-03-09 11:31       ` Anshuman Khandual
2022-03-09 11:31       ` Anshuman Khandual
2022-03-09 11:31       ` Anshuman Khandual
2022-03-09 11:31       ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 06/30] sparc/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 07/30] mips/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 08/30] m68k/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 09/30] arm/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:57   ` Russell King (Oracle)
2022-02-28 10:57     ` [OpenRISC] " Russell King
2022-02-28 10:57     ` Russell King (Oracle)
2022-02-28 10:57     ` Russell King (Oracle)
2022-02-28 10:57     ` Russell King (Oracle)
2022-02-28 10:57     ` Russell King (Oracle)
2022-02-28 13:49     ` Geert Uytterhoeven
2022-02-28 13:49       ` Geert Uytterhoeven
2022-02-28 13:49       ` Geert Uytterhoeven
2022-02-28 13:49       ` [OpenRISC] " Geert Uytterhoeven
2022-02-28 13:49       ` Geert Uytterhoeven
2022-02-28 13:49       ` Geert Uytterhoeven
2022-02-28 13:49       ` Geert Uytterhoeven
2022-02-28 13:49       ` Geert Uytterhoeven
2022-03-01  0:00     ` Anshuman Khandual
2022-03-01  0:12       ` Anshuman Khandual
2022-03-01  0:00       ` [OpenRISC] " Anshuman Khandual
2022-03-01  0:00       ` Anshuman Khandual
2022-03-01  0:00       ` Anshuman Khandual
2022-03-01  0:00       ` Anshuman Khandual
2022-03-01  0:00       ` Anshuman Khandual
2022-03-01  0:31       ` Russell King (Oracle)
2022-03-01  0:31         ` Russell King (Oracle)
2022-03-01  0:31         ` [OpenRISC] " Russell King
2022-03-01  0:31         ` Russell King (Oracle)
2022-03-01  0:31         ` Russell King (Oracle)
2022-03-01  0:31         ` Russell King (Oracle)
2022-03-01  0:31         ` Russell King (Oracle)
2022-03-01  8:16         ` Christophe Leroy
2022-03-01  8:16           ` Christophe Leroy
2022-03-01  8:16           ` Christophe Leroy
2022-03-01  8:16           ` [OpenRISC] " Christophe Leroy
2022-03-01  8:16           ` Christophe Leroy
2022-03-01  8:16           ` Christophe Leroy
2022-03-01  8:16           ` Christophe Leroy
2022-03-01  8:16           ` Christophe Leroy
2022-03-02  3:22           ` Anshuman Khandual
2022-03-02  3:34             ` Anshuman Khandual
2022-03-02  3:22             ` Anshuman Khandual
2022-03-02  3:22             ` [OpenRISC] " Anshuman Khandual
2022-03-02  3:22             ` Anshuman Khandual
2022-03-02  3:22             ` Anshuman Khandual
2022-03-02  3:22             ` Anshuman Khandual
2022-03-02  3:22             ` Anshuman Khandual
2022-03-02  7:05             ` Christophe Leroy
2022-03-02  7:05               ` Christophe Leroy
2022-03-02  7:05               ` Christophe Leroy
2022-03-02  7:05               ` [OpenRISC] " Christophe Leroy
2022-03-02  7:05               ` Christophe Leroy
2022-03-02  7:05               ` Christophe Leroy
2022-03-02  7:05               ` Christophe Leroy
2022-03-02  7:05               ` Christophe Leroy
2022-03-02  9:51               ` Anshuman Khandual
2022-03-02  9:51                 ` Anshuman Khandual
2022-03-02  9:51                 ` [OpenRISC] " Anshuman Khandual
2022-03-02  9:51                 ` Anshuman Khandual
2022-03-02  9:51                 ` Anshuman Khandual
2022-03-02  9:51                 ` Anshuman Khandual
2022-03-02  9:51                 ` Anshuman Khandual
2022-03-02  9:51                 ` Anshuman Khandual
2022-03-02 10:05                 ` Geert Uytterhoeven
2022-03-02 10:05                   ` Geert Uytterhoeven
2022-03-02 10:05                   ` Geert Uytterhoeven
2022-03-02 10:05                   ` [OpenRISC] " Geert Uytterhoeven
2022-03-02 10:05                   ` Geert Uytterhoeven
2022-03-02 10:05                   ` Geert Uytterhoeven
2022-03-02 10:05                   ` Geert Uytterhoeven
2022-03-02 10:05                   ` Geert Uytterhoeven
2022-03-02 11:06                   ` Anshuman Khandual
2022-03-02 11:18                     ` Anshuman Khandual
2022-03-02 11:06                     ` Anshuman Khandual
2022-03-02 11:06                     ` [OpenRISC] " Anshuman Khandual
2022-03-02 11:06                     ` Anshuman Khandual
2022-03-02 11:06                     ` Anshuman Khandual
2022-03-02 11:06                     ` Anshuman Khandual
2022-03-02 11:06                     ` Anshuman Khandual
2022-03-02 11:14                     ` Geert Uytterhoeven
2022-03-02 11:14                       ` Geert Uytterhoeven
2022-03-02 11:14                       ` Geert Uytterhoeven
2022-03-02 11:14                       ` [OpenRISC] " Geert Uytterhoeven
2022-03-02 11:14                       ` Geert Uytterhoeven
2022-03-02 11:14                       ` Geert Uytterhoeven
2022-03-02 11:14                       ` Geert Uytterhoeven
2022-03-02 11:14                       ` Geert Uytterhoeven
2022-03-09 11:33                       ` Anshuman Khandual [this message]
2022-03-09 11:45                         ` Anshuman Khandual
2022-03-09 11:33                         ` Anshuman Khandual
2022-03-09 11:33                         ` [OpenRISC] " Anshuman Khandual
2022-03-09 11:33                         ` Anshuman Khandual
2022-03-09 11:33                         ` Anshuman Khandual
2022-03-09 11:33                         ` Anshuman Khandual
2022-03-09 11:33                         ` Anshuman Khandual
2022-03-02 11:19                     ` Russell King (Oracle)
2022-03-02 11:19                       ` Russell King (Oracle)
2022-03-02 11:19                       ` Russell King (Oracle)
2022-03-02 11:19                       ` [OpenRISC] " Russell King
2022-03-02 11:19                       ` Russell King (Oracle)
2022-03-02 11:19                       ` Russell King (Oracle)
2022-03-02 11:19                       ` Russell King (Oracle)
2022-03-02 11:19                       ` Russell King (Oracle)
2022-03-02  3:15         ` Anshuman Khandual
2022-03-02  3:27           ` Anshuman Khandual
2022-03-02  3:15           ` [OpenRISC] " Anshuman Khandual
2022-03-02  3:15           ` Anshuman Khandual
2022-03-02  3:15           ` Anshuman Khandual
2022-03-02  3:15           ` Anshuman Khandual
2022-03-02  3:15           ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 10/30] x86/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 11/30] mm/mmap: Drop protection_map[] Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 12/30] mm/mmap: Drop arch_filter_pgprot() Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 13/30] mm/mmap: Drop arch_vm_get_page_pgprot() Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 14/30] s390/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 15/30] riscv/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 16/30] alpha/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 17/30] sh/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 18/30] arc/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 19/30] csky/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-03-01 14:00   ` Guo Ren
2022-03-01 14:00     ` Guo Ren
2022-03-01 14:00     ` Guo Ren
2022-03-01 14:00     ` [OpenRISC] " Guo Ren
2022-03-01 14:00     ` Guo Ren
2022-03-01 14:00     ` Guo Ren
2022-03-01 14:00     ` Guo Ren
2022-03-01 14:00     ` Guo Ren
2022-02-28 10:47 ` [PATCH V3 20/30] xtensa/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 21/30] parisc/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 22/30] openrisc/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 23/30] um/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 24/30] microblaze/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 25/30] nios2/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 26/30] hexagon/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 27/30] nds32/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 28/30] ia64/mm: " Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 29/30] mm/mmap: Drop generic vm_get_page_prot() Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47 ` [PATCH V3 30/30] mm/mmap: Drop ARCH_HAS_VM_GET_PAGE_PROT Anshuman Khandual
2022-02-28 10:59   ` Anshuman Khandual
2022-02-28 10:47   ` [OpenRISC] " Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-02-28 10:47   ` Anshuman Khandual
2022-03-09 10:59 ` [PATCH V3 00/30] mm/mmap: Drop protection_map[] and platform's __SXXX/__PXXX requirements Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=70a99f28-ea69-58e3-f919-85551943c0a3@arm.com \
    --to=anshuman.khandual@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=christophe.leroy@csgroup.eu \
    --cc=geert@linux-m68k.org \
    --cc=hch@infradead.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-hexagon@vger.kernel.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linux-um@lists.infradead.org \
    --cc=linux-xtensa@linux-xtensa.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=openrisc@lists.librecores.org \
    --cc=sparclinux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.