* [RFC 0/3] arm: support get_user_pages_fast
From: Minchan Kim
Date: 2018-09-06 10:22 UTC
To: Andrew Morton, linux
Cc: steve.capper, will.deacon, catalin.marinas, linux-kernel,
    linux-arm-kernel, kernel-team, android-treble-mediatek-ext,
    Minchan Kim

Recently, I got a report that get_user_pages_fast improves app launch
time. It reduces uninterruptible sleep time because it reduces mmap_sem
lock contention while the app is launching.

To support get_user_pages_fast on non-LPAE ARM, the first patch reorders
the memory type table so that bit 5 of the page table entry can be used.
It seems we don't use that bit on ARMv6+, but this needs a double check
from the maintainers.

The second patch introduces L_PTE_SPECIAL for arm so that the last patch
can support get_user_pages_fast.

I would greatly appreciate a review of anything I may have screwed up,
especially the architecture-specific parts.

Thanks.

Minchan Kim (3):
  arm: mm: reordering memory type table
  arm: mm: introduce L_PTE_SPECIAL
  arm: mm: support get_user_pages_fast

 arch/arm/Kconfig                      |   2 +-
 arch/arm/include/asm/pgtable-2level.h |  16 +-
 arch/arm/include/asm/pgtable-3level.h |   6 -
 arch/arm/include/asm/pgtable.h        |  13 ++
 arch/arm/mm/Makefile                  |   6 +
 arch/arm/mm/gup.c                     | 221 ++++++++++++++++++++++++++
 arch/arm/mm/proc-macros.S             |  16 +-
 7 files changed, 261 insertions(+), 19 deletions(-)
 create mode 100644 arch/arm/mm/gup.c

-- 
2.19.0.rc1.350.ge57e33dbd1-goog
* [RFC 1/3] arm: mm: reordering memory type table
From: Minchan Kim
Date: 2018-09-06 10:22 UTC
To: Andrew Morton, linux
Cc: steve.capper, will.deacon, catalin.marinas, linux-kernel,
    linux-arm-kernel, kernel-team, android-treble-mediatek-ext,
    Minchan Kim

To use bit 5 of the page table entry, we need to make room for it, and
it seems we don't need 4 bits for the memory type on ARMv6+. If so,
let's reorder the memory types to free bit 5. We will use the bit for
L_PTE_SPECIAL in the next patch.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 arch/arm/include/asm/pgtable-2level.h | 13 +++++++++++--
 arch/arm/mm/proc-macros.S             | 16 ++++++++--------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 92fd2c8a9af0..91b99fadcba1 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -164,14 +164,23 @@
 #define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 0x01) << 2)	/* 0001 */
 #define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 0x02) << 2)	/* 0010 */
 #define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 0x03) << 2)	/* 0011 */
+#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
 #define L_PTE_MT_MINICACHE	(_AT(pteval_t, 0x06) << 2)	/* 0110 (sa1100, xscale) */
 #define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 0x07) << 2)	/* 0111 */
-#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
-#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 0x0c) << 2)	/* 1100 */
+#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7M) || \
+	defined (CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K)
+#define L_PTE_MT_DEV_WC		L_PTE_MT_BUFFERABLE
+#define L_PTE_MT_DEV_CACHED	L_PTE_MT_WRITEBACK
+#define L_PTE_MT_DEV_NONSHARED	L_PTE_MT_MINICACHE
+#define L_PTE_MT_VECTORS	(_AT(pteval_t, 0x05) << 2)	/* 0101 */
+#define L_PTE_MT_MASK		(_AT(pteval_t, 0x07) << 2)
+#else
 #define L_PTE_MT_DEV_WC		(_AT(pteval_t, 0x09) << 2)	/* 1001 */
 #define L_PTE_MT_DEV_CACHED	(_AT(pteval_t, 0x0b) << 2)	/* 1011 */
+#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 0x0c) << 2)	/* 1100 */
 #define L_PTE_MT_VECTORS	(_AT(pteval_t, 0x0f) << 2)	/* 1111 */
 #define L_PTE_MT_MASK		(_AT(pteval_t, 0x0f) << 2)
+#endif

 #ifndef __ASSEMBLY__

diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 81d0efb055c6..f896a30653fa 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -134,21 +134,21 @@
 	.macro armv6_mt_table pfx
 \pfx\()_mt_table:
 	.long	0x00						@ L_PTE_MT_UNCACHED
-	.long	PTE_EXT_TEX(1)					@ L_PTE_MT_BUFFERABLE
+	.long	PTE_EXT_TEX(1)					@ L_PTE_MT_BUFFERABLE(L_PTE_MT_DEV_WC)
 	.long	PTE_CACHEABLE					@ L_PTE_MT_WRITETHROUGH
-	.long	PTE_CACHEABLE | PTE_BUFFERABLE			@ L_PTE_MT_WRITEBACK
+	.long	PTE_CACHEABLE | PTE_BUFFERABLE			@ L_PTE_MT_WRITEBACK(L_PTE_MT_DEV_CACHED)
 	.long	PTE_BUFFERABLE					@ L_PTE_MT_DEV_SHARED
-	.long	0x00						@ unused
-	.long	0x00						@ L_PTE_MT_MINICACHE (not present)
+	.long	PTE_CACHEABLE | PTE_BUFFERABLE | PTE_EXT_APX	@ L_PTE_MT_VECTORS
+	.long	PTE_EXT_TEX(2)					@ L_PTE_MT_DEV_NONSHARED
 	.long	PTE_EXT_TEX(1) | PTE_CACHEABLE | PTE_BUFFERABLE	@ L_PTE_MT_WRITEALLOC
 	.long	0x00						@ unused
-	.long	PTE_EXT_TEX(1)					@ L_PTE_MT_DEV_WC
 	.long	0x00						@ unused
-	.long	PTE_CACHEABLE | PTE_BUFFERABLE			@ L_PTE_MT_DEV_CACHED
-	.long	PTE_EXT_TEX(2)					@ L_PTE_MT_DEV_NONSHARED
 	.long	0x00						@ unused
 	.long	0x00						@ unused
-	.long	PTE_CACHEABLE | PTE_BUFFERABLE | PTE_EXT_APX	@ L_PTE_MT_VECTORS
+	.long	0x00						@ unused
+	.long	0x00						@ unused
+	.long	0x00						@ unused
+	.long	0x00						@ unused
 	.endm

 	.macro armv6_set_pte_ext pfx
-- 
2.19.0.rc1.350.ge57e33dbd1-goog
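The bit arithmetic behind the reordering in patch 1/3 can be checked in a
small user-space sketch. It only reuses the shift and mask values from the
patch; pteval_t is simplified to a 32-bit integer here, and the names
MT_SHIFT, MT_MASK_4BIT, MT_MASK_3BIT, BIT5, mt_number() and uses_bit5() are
made up for illustration and are not kernel identifiers.

```c
#include <assert.h>   /* only needed for the usage checks mentioned below */
#include <stdint.h>

typedef uint32_t pteval_t;	/* assumption: simplified stand-in */

/* The memory type field starts at bit 2 of the Linux PTE. */
#define MT_SHIFT	2

/* L_PTE_MT_MASK before the patch (4 bits) and after it on ARMv6+ (3 bits). */
#define MT_MASK_4BIT	((pteval_t)0x0f << MT_SHIFT)	/* covers bits 2..5 */
#define MT_MASK_3BIT	((pteval_t)0x07 << MT_SHIFT)	/* covers bits 2..4 */

/* Bit 5, which patch 2/3 assigns to L_PTE_SPECIAL. */
#define BIT5		((pteval_t)1 << 5)

/* Hypothetical helper: extract the memory type number under a given mask. */
static unsigned mt_number(pteval_t pte, pteval_t mask)
{
	return (pte & mask) >> MT_SHIFT;
}

/* Hypothetical helper: does this memory type encoding occupy bit 5? */
static int uses_bit5(unsigned mt)
{
	return (((pteval_t)mt << MT_SHIFT) & BIT5) != 0;
}
```

With these definitions, uses_bit5() is true for the old encodings 0x09,
0x0b, 0x0c and 0x0f and false for every value from 0x00 to 0x07, which is
why moving L_PTE_MT_VECTORS to 0x05 and aliasing the device types onto
lower slots leaves bit 5 unused by the memory type field.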
* Re: [RFC 1/3] arm: mm: reordering memory type table
From: Catalin Marinas
Date: 2018-09-10 16:50 UTC
To: Minchan Kim
Cc: Andrew Morton, linux, steve.capper, will.deacon, linux-kernel,
    android-treble-mediatek-ext, kernel-team, linux-arm-kernel,
    Simon Horman

On Thu, Sep 06, 2018 at 07:22:10PM +0900, Minchan Kim wrote:
> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
> index 92fd2c8a9af0..91b99fadcba1 100644
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> @@ -164,14 +164,23 @@
>  #define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 0x01) << 2)	/* 0001 */
>  #define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 0x02) << 2)	/* 0010 */
>  #define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 0x03) << 2)	/* 0011 */
> +#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
>  #define L_PTE_MT_MINICACHE	(_AT(pteval_t, 0x06) << 2)	/* 0110 (sa1100, xscale) */
>  #define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 0x07) << 2)	/* 0111 */
> -#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
> -#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 0x0c) << 2)	/* 1100 */
> +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7M) || \
> +	defined (CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K)
> +#define L_PTE_MT_DEV_WC		L_PTE_MT_BUFFERABLE
> +#define L_PTE_MT_DEV_CACHED	L_PTE_MT_WRITEBACK
> +#define L_PTE_MT_DEV_NONSHARED	L_PTE_MT_MINICACHE

I think you can just ignore v7M here, it doesn't have an MMU.

You are defining L_PTE_MT_DEV_NONSHARED to L_PTE_MT_MINICACHE but what I
think you just meant is index 6 in the cpu_v6_mt_table which I would use
explicitly to avoid confusion.

Anyway, on ARMv7 or ARMv6+LPAE, the non-shared device gets mapped to
shared device in hardware. Looking through the arm32 code, it seems that
MT_DEVICE_NONSHARED is used by arch/arm/mach-shmobile/setup-r8a7779.c
and IIUC that's a v7 platform (R-Car H1, Cortex-A9). I think the above
should be defined to L_PTE_MT_DEV_SHARED, unless I miss any place where
DEV_NONSHARED is relevant on ARMv6 (adding Simon to confirm on
shmobile).

> diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
> index 81d0efb055c6..f896a30653fa 100644
> --- a/arch/arm/mm/proc-macros.S
> +++ b/arch/arm/mm/proc-macros.S
> @@ -134,21 +134,21 @@
>  	.macro armv6_mt_table pfx
>  \pfx\()_mt_table:

Since you changed the MT index, you'd have to fix proc-v7-*levels.S as
well. If you define DEV_NONSHARED to SHARED, I think you only need to
update the index for L_PTE_MT_VECTORS.

-- 
Catalin
* Re: [RFC 1/3] arm: mm: reordering memory type table
From: Minchan Kim
Date: 2018-09-14 6:26 UTC
To: Catalin Marinas
Cc: Andrew Morton, linux, steve.capper, will.deacon, linux-kernel,
    android-treble-mediatek-ext, kernel-team, linux-arm-kernel,
    Simon Horman

On Mon, Sep 10, 2018 at 05:50:11PM +0100, Catalin Marinas wrote:
> On Thu, Sep 06, 2018 at 07:22:10PM +0900, Minchan Kim wrote:
> > diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
> > index 92fd2c8a9af0..91b99fadcba1 100644
> > --- a/arch/arm/include/asm/pgtable-2level.h
> > +++ b/arch/arm/include/asm/pgtable-2level.h
> > @@ -164,14 +164,23 @@
> >  #define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 0x01) << 2)	/* 0001 */
> >  #define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 0x02) << 2)	/* 0010 */
> >  #define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 0x03) << 2)	/* 0011 */
> > +#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
> >  #define L_PTE_MT_MINICACHE	(_AT(pteval_t, 0x06) << 2)	/* 0110 (sa1100, xscale) */
> >  #define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 0x07) << 2)	/* 0111 */
> > -#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
> > -#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 0x0c) << 2)	/* 1100 */
> > +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7M) || \
> > +	defined (CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K)
> > +#define L_PTE_MT_DEV_WC		L_PTE_MT_BUFFERABLE
> > +#define L_PTE_MT_DEV_CACHED	L_PTE_MT_WRITEBACK
> > +#define L_PTE_MT_DEV_NONSHARED	L_PTE_MT_MINICACHE
>
> I think you can just ignore v7M here, it doesn't have an MMU.

I didn't know that. Will fix.

>
> You are defining L_PTE_MT_DEV_NONSHARED to L_PTE_MT_MINICACHE but what I
> think you just meant is index 6 in the cpu_v6_mt_table which I would use
> explicitly to avoid confusion.
>
> Anyway, on ARMv7 or ARMv6+LPAE, the non-shared device gets mapped to
> shared device in hardware. Looking through the arm32 code, it seems that

Thanks for the information. I didn't know that.

> MT_DEVICE_NONSHARED is used by arch/arm/mach-shmobile/setup-r8a7779.c
> and IIUC that's a v7 platform (R-Car H1, Cortex-A9). I think the above
> should be defined to L_PTE_MT_DEV_SHARED, unless I miss any place where
> DEV_NONSHARED is relevant on ARMv6 (adding Simon to confirm on
> shmobile).

Simon, could you confirm this?

>
> > diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
> > index 81d0efb055c6..f896a30653fa 100644
> > --- a/arch/arm/mm/proc-macros.S
> > +++ b/arch/arm/mm/proc-macros.S
> > @@ -134,21 +134,21 @@
> >  	.macro armv6_mt_table pfx
> >  \pfx\()_mt_table:
>
> Since you changed the MT index, you'd have to fix proc-v7-*levels.S as
> well. If you define DEV_NONSHARED to SHARED, I think you only need to
> update the index for L_PTE_MT_VECTORS.

Good idea. I will try it on.

Thanks for the review, Catalin.
* [RFC 2/3] arm: mm: introduce L_PTE_SPECIAL
From: Minchan Kim
Date: 2018-09-06 10:22 UTC
To: Andrew Morton, linux
Cc: steve.capper, will.deacon, catalin.marinas, linux-kernel,
    linux-arm-kernel, kernel-team, android-treble-mediatek-ext,
    Minchan Kim

This patch introduces L_PTE_SPECIAL and the pte helper functions needed
to support get_user_pages_fast.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 arch/arm/Kconfig                      |  2 +-
 arch/arm/include/asm/pgtable-2level.h |  3 +--
 arch/arm/include/asm/pgtable-3level.h |  6 ------
 arch/arm/include/asm/pgtable.h        | 13 +++++++++++++
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e8cd55a5b04c..5d4489a019c4 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -10,7 +10,7 @@ config ARM
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
-	select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
+	select ARCH_HAS_PTE_SPECIAL if (ARM_LPAE || CPU_V7 || CPU_V7M || CPU_V6 || CPU_V6K)
 	select ARCH_HAS_PHYS_TO_DMA
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 91b99fadcba1..82386a2b84e7 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -120,6 +120,7 @@
 #define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
 #define L_PTE_PRESENT		(_AT(pteval_t, 1) << 0)
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 1)
+#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 5)
 #define L_PTE_DIRTY		(_AT(pteval_t, 1) << 6)
 #define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)
 #define L_PTE_USER		(_AT(pteval_t, 1) << 8)
@@ -222,8 +223,6 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 #define pmd_addr_end(addr,end) (end)

 #define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
-#define pte_special(pte)	(0)
-static inline pte_t pte_mkspecial(pte_t pte) { return pte; }

 /*
  * We don't have huge page support for short descriptors, for the moment
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 6d50a11d7793..b6f52e16b478 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -213,12 +213,6 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)

 #define pmd_present(pmd)	(pmd_isset((pmd), L_PMD_SECT_VALID))
 #define pmd_young(pmd)		(pmd_isset((pmd), PMD_SECT_AF))
-#define pte_special(pte)	(pte_isset((pte), L_PTE_SPECIAL))
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	pte_val(pte) |= L_PTE_SPECIAL;
-	return pte;
-}

 #define pmd_write(pmd)		(pmd_isclear((pmd), L_PMD_SECT_RDONLY))
 #define pmd_dirty(pmd)		(pmd_isset((pmd), L_PMD_SECT_DIRTY))
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index a757401129f9..6cc7ce0e423e 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -228,6 +228,11 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pte_dirty(pte)		(pte_isset((pte), L_PTE_DIRTY))
 #define pte_young(pte)		(pte_isset((pte), L_PTE_YOUNG))
 #define pte_exec(pte)		(pte_isclear((pte), L_PTE_XN))
+#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
+#define pte_special(pte)	(pte_isset((pte), L_PTE_SPECIAL))
+#else
+#define pte_special(pte)	(0)
+#endif

 #define pte_valid_user(pte)	\
 	(pte_valid(pte) && pte_isset((pte), L_PTE_USER) && pte_young(pte))
@@ -318,6 +323,14 @@ static inline pte_t pte_mknexec(pte_t pte)
 	return set_pte_bit(pte, __pgprot(L_PTE_XN));
 }

+#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return set_pte_bit(pte, __pgprot(L_PTE_SPECIAL));
+}
+#else
+static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
+#endif
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
 	const pteval_t mask = L_PTE_XN | L_PTE_RDONLY | L_PTE_USER |
-- 
2.19.0.rc1.350.ge57e33dbd1-goog
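How the new bit interacts with the 3-bit memory type field can be sketched
in user space as follows. This is a simplified model, not the kernel code:
pte_t is reduced to a struct around a 32-bit value, the helpers are plain
functions rather than the kernel's set_pte_bit()/pte_isset() wrappers, and
pte_memtype() is a made-up accessor for illustration. Only the values of
L_PTE_SPECIAL and the ARMv6+ L_PTE_MT_MASK are taken from the patches.

```c
#include <assert.h>   /* only needed for the usage checks mentioned below */
#include <stdint.h>

typedef uint32_t pteval_t;			/* assumption: simplified */
typedef struct { pteval_t pte; } pte_t;		/* assumption: simplified */

#define L_PTE_SPECIAL	((pteval_t)1 << 5)	/* from patch 2/3 */
#define L_PTE_MT_MASK	((pteval_t)0x07 << 2)	/* ARMv6+ value from patch 1/3 */

/* Mark a PTE as special by setting bit 5. */
static pte_t pte_mkspecial(pte_t pte)
{
	pte.pte |= L_PTE_SPECIAL;
	return pte;
}

/* Test whether a PTE is marked special. */
static int pte_special(pte_t pte)
{
	return (pte.pte & L_PTE_SPECIAL) != 0;
}

/* Hypothetical accessor: memory type number stored in bits 2..4. */
static unsigned pte_memtype(pte_t pte)
{
	return (pte.pte & L_PTE_MT_MASK) >> 2;
}
```

In this model, calling pte_mkspecial() on an entry whose memory type is
0x07 makes pte_special() return true while pte_memtype() still returns 7,
which is the property the series relies on when it reuses bit 5.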
* [RFC 3/3] arm: mm: support get_user_pages_fast
  2018-09-06 10:22 ` Minchan Kim
@ 2018-09-06 10:22 ` Minchan Kim
  -1 siblings, 0 replies; 12+ messages in thread
From: Minchan Kim @ 2018-09-06 10:22 UTC
To: Andrew Morton, linux
Cc: steve.capper, will.deacon, catalin.marinas, linux-kernel,
	linux-arm-kernel, kernel-team, android-treble-mediatek-ext,
	Minchan Kim

Recently, there was a report that get_user_pages_fast improves app
launch speed by reducing uninterruptible sleep time, I believe because
we don't need to contend for mmap_sem.

In testing, get_user_pages_fast reduced that uninterruptible sleep
time by about 5~10%.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 arch/arm/mm/Makefile |   6 ++
 arch/arm/mm/gup.c    | 221 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 227 insertions(+)
 create mode 100644 arch/arm/mm/gup.c

diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 7cb1699fbfc4..f55f96d56843 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -13,6 +13,12 @@ obj-y += nommu.o
 obj-$(CONFIG_ARM_MPU)		+= pmsa-v7.o pmsa-v8.o
 endif
 
+ifneq ($(CONFIG_ARM_LPAE),y)
+ifeq ($(CONFIG_ARCH_HAS_PTE_SPECIAL),y)
+obj-$(CONFIG_MMU)		+= gup.o
+endif
+endif
+
 obj-$(CONFIG_ARM_PTDUMP_CORE)	+= dump.o
 obj-$(CONFIG_ARM_PTDUMP_DEBUGFS)	+= ptdump_debugfs.o
 obj-$(CONFIG_MODULES)		+= proc-syms.o
diff --git a/arch/arm/mm/gup.c b/arch/arm/mm/gup.c
new file mode 100644
index 000000000000..44e12fb7430e
--- /dev/null
+++ b/arch/arm/mm/gup.c
@@ -0,0 +1,221 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/mm.h>
+#include <linux/uaccess.h>
+#include <linux/pagemap.h>
+#include <asm/pgtable.h>
+
+static inline pte_t gup_get_pte(pte_t *ptep)
+{
+	return READ_ONCE(*ptep);
+}
+
+static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	int ret = 0;
+	pte_t *ptep, *ptem;
+
+	ptem = ptep = pte_offset_map(&pmd, addr);
+	do {
+		pte_t pte = gup_get_pte(ptep);
+		struct page *page;
+
+		if (!pte_access_permitted(pte, write))
+			goto pte_unmap;
+
+		if (pte_special(pte))
+			goto pte_unmap;
+
+		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+		page = pte_page(pte);
+
+		if (!page_cache_get_speculative(page))
+			goto pte_unmap;
+
+		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
+			put_page(page);
+			goto pte_unmap;
+		}
+
+		SetPageReferenced(page);
+		pages[*nr] = page;
+		(*nr)++;
+
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
+
+	ret = 1;
+
+pte_unmap:
+	pte_unmap(ptem);
+	return ret;
+}
+
+static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pmd_t *pmdp;
+
+	pmdp = pmd_offset(&pud, addr);
+	do {
+		pmd_t pmd = READ_ONCE(*pmdp);
+
+		next = pmd_addr_end(addr, end);
+		if (!pmd_present(pmd))
+			return 0;
+		else if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+			return 0;
+	} while (pmdp++, addr = next, addr != end);
+
+	return 1;
+}
+
+static int gup_pud_range(p4d_t *p4dp, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pud_t *pudp;
+
+	pudp = pud_offset(p4dp, addr);
+	do {
+		pud_t pud = READ_ONCE(*pudp);
+
+		next = pud_addr_end(addr, end);
+		if (pud_none(pud))
+			return 0;
+		else if (!gup_pmd_range(pud, addr, next, write, pages, nr))
+			return 0;
+	} while (pudp++, addr = next, addr != end);
+
+	return 1;
+}
+
+static int gup_p4d_range(pgd_t *pgdp, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	p4d_t *p4dp;
+
+	p4dp = p4d_offset(pgdp, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+		if (p4d_none(*p4dp)) {
+			return 0;
+		} else if (!gup_pud_range(p4dp, addr, next, write, pages, nr))
+			return 0;
+	} while (p4dp++, addr = next, addr != end);
+
+	return 1;
+}
+
+
+static void gup_pgd_range(unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pgd_t *pgdp;
+
+	pgdp = pgd_offset(current->mm, addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		if (pgd_none(*pgdp))
+			return;
+		else if (!gup_p4d_range(pgdp, addr, next, write, pages, nr))
+			break;
+	} while (pgdp++, addr = next, addr != end);
+}
+
+bool gup_fast_permitted(unsigned long start, int nr_pages, int write)
+{
+	unsigned long len, end;
+
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+	return end >= start;
+}
+
+/*
+ * Like get_user_pages_fast() except its IRQ-safe in that it won't fall
+ * back to the regular GUP.
+ */
+int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			  struct page **pages)
+{
+	unsigned long addr, len, end;
+	unsigned long flags;
+	int nr = 0;
+
+	start &= PAGE_MASK;
+	addr = start;
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+
+	if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
+					(void __user *)start, len)))
+		return 0;
+
+	/*
+	 * Disable interrupts. We use the nested form as we can already have
+	 * interrupts disabled by get_futex_key.
+	 *
+	 * With interrupts disabled, we block page table pages from being
+	 * freed from under us. See mmu_gather_tlb in asm-generic/tlb.h
+	 * for more details.
+	 *
+	 * We do not adopt an rcu_read_lock(.) here as we also want to
+	 * block IPIs that come from THPs splitting.
+	 */
+
+	if (gup_fast_permitted(start, nr_pages, write)) {
+		local_irq_save(flags);
+		gup_pgd_range(addr, end, write, pages, &nr);
+		local_irq_restore(flags);
+	}
+
+	return nr;
+}
+
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	unsigned long addr, len, end;
+	int nr = 0, ret = 0;
+
+	start &= PAGE_MASK;
+	addr = start;
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+
+	if (nr_pages <= 0)
+		return 0;
+
+	if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
+					(void __user *)start, len)))
+		return -EFAULT;
+
+	if (gup_fast_permitted(start, nr_pages, write)) {
+		local_irq_disable();
+		gup_pgd_range(addr, end, write, pages, &nr);
+		local_irq_enable();
+		ret = nr;
+	}
+
+	if (nr < nr_pages) {
+		/* Try to get the remaining pages with get_user_pages */
+		start += nr << PAGE_SHIFT;
+		pages += nr;
+
+		ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
+				write ? FOLL_WRITE : 0);
+
+		/* Have to be a bit careful with return values */
+		if (nr > 0) {
+			if (ret < 0)
+				ret = nr;
+			else
+				ret += nr;
+		}
+	}
+
+	return ret;
+}
-- 
2.19.0.rc1.350.ge57e33dbd1-goog
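A note on two small pieces of arithmetic in the patch above, since they are easy to misread in diff form: the wraparound check in gup_fast_permitted() and the "be a bit careful with return values" merge at the end of get_user_pages_fast(). The following is only a userspace sketch of that logic, not kernel code. The names model_range_ok(), model_merge_result() and MODEL_PAGE_SHIFT are hypothetical and do not appear in the patch, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12

/*
 * Models gup_fast_permitted(): the range is rejected when start + len
 * wraps past the top of the address space. Unsigned overflow is
 * well-defined in C, so a wrapped "end" simply compares below "start".
 */
bool model_range_ok(unsigned long start, int nr_pages)
{
	unsigned long len = (unsigned long)nr_pages << MODEL_PAGE_SHIFT;
	unsigned long end = start + len;

	return end >= start;
}

/*
 * Models the tail of get_user_pages_fast():
 *   fast_nr  - pages pinned by the lockless walk (0..nr_pages)
 *   slow_ret - what the slow path returned for the remaining pages
 *              (a page count, or a negative errno)
 * The slow path is only consulted when the fast walk came up short.
 */
int model_merge_result(int fast_nr, int nr_pages, int slow_ret)
{
	int ret = fast_nr;

	assert(fast_nr >= 0 && fast_nr <= nr_pages);

	if (fast_nr < nr_pages) {
		ret = slow_ret;
		if (fast_nr > 0) {
			/* Pages are already pinned: never report an error. */
			if (slow_ret < 0)
				ret = fast_nr;
			else
				ret = fast_nr + slow_ret;
		}
	}

	return ret;
}
```

In this model a slow-path error is reported to the caller only when the fast walk pinned nothing. Once at least one page is pinned, the caller gets a short count instead, so it knows which pages it has to release.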
end of thread, other threads:[~2018-09-14 6:27 UTC | newest]

Thread overview: 12+ messages
2018-09-06 10:22 [RFC 0/3] arm: support get_user_pages_fast Minchan Kim
2018-09-06 10:22 ` Minchan Kim
2018-09-06 10:22 ` [RFC 1/3] arm: mm: reordering memory type table Minchan Kim
2018-09-06 10:22 ` Minchan Kim
2018-09-10 16:50 ` Catalin Marinas
2018-09-10 16:50 ` Catalin Marinas
2018-09-14 6:26 ` Minchan Kim
2018-09-14 6:26 ` Minchan Kim
2018-09-06 10:22 ` [RFC 2/3] arm: mm: introduce L_PTE_SPECIAL Minchan Kim
2018-09-06 10:22 ` Minchan Kim
2018-09-06 10:22 ` [RFC 3/3] arm: mm: support get_user_pages_fast Minchan Kim
2018-09-06 10:22 ` Minchan Kim