All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnd Bergmann <arnd@arndb.de>
To: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Nishanth Menon <nm@ti.com>,
	Santosh Shilimkar <santosh.shilimkar@oracle.com>,
	Tero Kristo <t-kristo@ti.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Michal Hocko <mhocko@suse.com>, Rik van Riel <riel@surriel.com>,
	Santosh Shilimkar <ssantosh@kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Yafang Shao <laoar.shao@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	kernel-team@fb.com, Kishon Vijay Abraham I <kishon@ti.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU
Date: Mon, 9 Mar 2020 20:46:18 +0100	[thread overview]
Message-ID: <CAK8P3a2yyJLmkifpSabMwtUiAvumMPwLEzT5RpsBA=LYn=ZXUw@mail.gmail.com> (raw)
In-Reply-To: <20200309160919.GM25745@shell.armlinux.org.uk>

On Mon, Mar 9, 2020 at 5:09 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
> On Mon, Mar 09, 2020 at 03:59:45PM +0000, Catalin Marinas wrote:
> > On Sun, Mar 08, 2020 at 11:58:52AM +0100, Arnd Bergmann wrote:
> > > - revisit CONFIG_VMSPLIT_4G_4G for arm32 (and maybe mips32)
> > >   to see if it can be done, and what the overhead is. This is probably
> > >   more work than the others combined, but also the most promising
> > >   as it allows the most user address space and physical ram to be used.
> >
> > A rough outline of such support (and likely to miss some corner cases):
> >
> > 1. Kernel runs with its own ASID and non-global page tables.
> >
> > 2. Trampoline code on exception entry/exit to handle the TTBR0 switching
> >    between user and kernel.
> >
> > 3. uaccess routines need to be reworked to pin the user pages in memory
> >    (get_user_pages()) and access them via the kernel address space.
> >
> > Point 3 is probably the ugliest and it would introduce a noticeable
> > slowdown in certain syscalls.

There are probably a number of ways to do the basic design. The idea
I had (again, probably missing more corner cases than either of you
two that actually understand the details of the mmu):

- Assuming we have LPAE, run the kernel vmlinux and modules inside
  the vmalloc space, in the top 256MB or 512MB on TTBR1

- Map all the physical RAM (up to 3.75GB) into a reserved ASID
  with TTBR0

- Flip TTBR0 on kernel entry/exit, and again during user access.

This is probably more work to implement than your idea, but
I would hope this has a lower overhead on most microarchitectures
as it doesn't require pinning the pages. Depending on the
microarchitecture, I'd hope the overhead would be comparable
to that of ARM64_SW_TTBR0_PAN.

> We also need to consider that it has implications for the single-kernel
> support; a kernel doing this kind of switching would likely be horrid
> for a kernel supporting v6+ with VIPT aliasing caches.  Would we be
> adding a new red line between kernels supporting VIPT-aliasing caches
> (present in earlier v6 implementations) and kernels using this system?

I would initially do it for LPAE only, given that this is already an
incompatible config option. I don't think there are any v6 machines with
more than 1GB of RAM (the maximum for AST2500), and the only distro
that ships a v6+ multiplatform kernel is Raspbian, which in turn needs
a separate LPAE kernel for the large-memory machines anyway.

Only doing it for LPAE would still cover the vast majority of systems that
actually shipped with more than 2GB. There are a couple of exceptions,
i.e. early  Cubox i4x4, the Calxeda Highbank developer system and the
Novena Laptop, which I would guess have a limited life expectancy
(before users stop updating kernels) no longer than the 8GB
Keystone-2.

Based on that, I would hope that the ARMv7 distros can keep shipping
the two kernel images they already ship:

- The non-LPAE kernel modified to VMSPLIT_2G_OPT, not using highmem
  on anything up to 2GB, but still supporting the handful of remaining
  Cortex-A9s with 4GB using highmem until they are completely obsolete.

- The LPAE kernel modified to use a newly added VMSPLIT_4G_4G,
   with details to be worked out.

Most new systems tend to be based on Cortex-A7 with no more than 2GB,
so those could run either configuration well.  If we find the 2GB of user
address space too limiting for the non-LPAE config, or I missed some
important pre-LPAE systems with 4GB that need to be supported for longer
than other highmem systems, that can probably be added later.

    Arnd

WARNING: multiple messages have this Message-ID (diff)
From: Arnd Bergmann <arnd@arndb.de>
To: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Nishanth Menon <nm@ti.com>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@surriel.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Roman Gushchin <guro@fb.com>,
	Santosh Shilimkar <santosh.shilimkar@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kishon Vijay Abraham I <kishon@ti.com>,
	Tero Kristo <t-kristo@ti.com>, Linux-MM <linux-mm@kvack.org>,
	Yafang Shao <laoar.shao@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Santosh Shilimkar <ssantosh@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	kernel-team@fb.com,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU
Date: Mon, 9 Mar 2020 20:46:18 +0100	[thread overview]
Message-ID: <CAK8P3a2yyJLmkifpSabMwtUiAvumMPwLEzT5RpsBA=LYn=ZXUw@mail.gmail.com> (raw)
In-Reply-To: <20200309160919.GM25745@shell.armlinux.org.uk>

On Mon, Mar 9, 2020 at 5:09 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
> On Mon, Mar 09, 2020 at 03:59:45PM +0000, Catalin Marinas wrote:
> > On Sun, Mar 08, 2020 at 11:58:52AM +0100, Arnd Bergmann wrote:
> > > - revisit CONFIG_VMSPLIT_4G_4G for arm32 (and maybe mips32)
> > >   to see if it can be done, and what the overhead is. This is probably
> > >   more work than the others combined, but also the most promising
> > >   as it allows the most user address space and physical ram to be used.
> >
> > A rough outline of such support (and likely to miss some corner cases):
> >
> > 1. Kernel runs with its own ASID and non-global page tables.
> >
> > 2. Trampoline code on exception entry/exit to handle the TTBR0 switching
> >    between user and kernel.
> >
> > 3. uaccess routines need to be reworked to pin the user pages in memory
> >    (get_user_pages()) and access them via the kernel address space.
> >
> > Point 3 is probably the ugliest and it would introduce a noticeable
> > slowdown in certain syscalls.

There are probably a number of ways to do the basic design. The idea
I had (again, probably missing more corner cases than either of you
two that actually understand the details of the mmu):

- Assuming we have LPAE, run the kernel vmlinux and modules inside
  the vmalloc space, in the top 256MB or 512MB on TTBR1

- Map all the physical RAM (up to 3.75GB) into a reserved ASID
  with TTBR0

- Flip TTBR0 on kernel entry/exit, and again during user access.

This is probably more work to implement than your idea, but
I would hope this has a lower overhead on most microarchitectures
as it doesn't require pinning the pages. Depending on the
microarchitecture, I'd hope the overhead would be comparable
to that of ARM64_SW_TTBR0_PAN.

> We also need to consider that it has implications for the single-kernel
> support; a kernel doing this kind of switching would likely be horrid
> for a kernel supporting v6+ with VIPT aliasing caches.  Would we be
> adding a new red line between kernels supporting VIPT-aliasing caches
> (present in earlier v6 implementations) and kernels using this system?

I would initially do it for LPAE only, given that this is already an
incompatible config option. I don't think there are any v6 machines with
more than 1GB of RAM (the maximum for AST2500), and the only distro
that ships a v6+ multiplatform kernel is Raspbian, which in turn needs
a separate LPAE kernel for the large-memory machines anyway.

Only doing it for LPAE would still cover the vast majority of systems that
actually shipped with more than 2GB. There are a couple of exceptions,
i.e. early  Cubox i4x4, the Calxeda Highbank developer system and the
Novena Laptop, which I would guess have a limited life expectancy
(before users stop updating kernels) no longer than the 8GB
Keystone-2.

Based on that, I would hope that the ARMv7 distros can keep shipping
the two kernel images they already ship:

- The non-LPAE kernel modified to VMSPLIT_2G_OPT, not using highmem
  on anything up to 2GB, but still supporting the handful of remaining
  Cortex-A9s with 4GB using highmem until they are completely obsolete.

- The LPAE kernel modified to use a newly added VMSPLIT_4G_4G,
   with details to be worked out.

Most new systems tend to be based on Cortex-A7 with no more than 2GB,
so those could run either configuration well.  If we find the 2GB of user
address space too limiting for the non-LPAE config, or I missed some
important pre-LPAE systems with 4GB that need to be supported for longer
than other highmem systems, that can probably be added later.

    Arnd

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2020-03-09 19:46 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-11 17:55 [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU Johannes Weiner
2020-02-11 18:20 ` Johannes Weiner
2020-02-11 19:05 ` Rik van Riel
2020-02-11 19:05   ` Rik van Riel
2020-02-11 19:31   ` Johannes Weiner
2020-02-11 23:44     ` Andrew Morton
2020-02-12  0:28       ` Linus Torvalds
2020-02-12  0:28         ` Linus Torvalds
2020-02-12  0:47         ` Andrew Morton
2020-02-12  1:03           ` Linus Torvalds
2020-02-12  1:03             ` Linus Torvalds
2020-02-12  1:03             ` Linus Torvalds
2020-02-12  8:50             ` Russell King - ARM Linux admin
2020-02-12  8:50               ` Russell King - ARM Linux admin
2020-02-13  9:50               ` Lucas Stach
2020-02-13  9:50                 ` Lucas Stach
2020-02-13  9:50                 ` Lucas Stach
2020-02-13 16:52               ` Arnd Bergmann
2020-02-13 16:52                 ` Arnd Bergmann
2020-02-13 16:52                 ` Arnd Bergmann
2020-02-15 11:25                 ` Geert Uytterhoeven
2020-02-15 11:25                   ` [cip-dev] " Geert Uytterhoeven
2020-02-15 11:25                   ` Geert Uytterhoeven
2020-02-15 11:25                   ` Geert Uytterhoeven
2020-02-15 16:59                   ` Arnd Bergmann
2020-02-15 16:59                     ` [cip-dev] " Arnd Bergmann
2020-02-15 16:59                     ` Arnd Bergmann
2020-02-15 16:59                     ` Arnd Bergmann
2020-02-16  9:44                     ` Geert Uytterhoeven
2020-02-16  9:44                       ` [cip-dev] " Geert Uytterhoeven
2020-02-16  9:44                       ` Geert Uytterhoeven
2020-02-16  9:44                       ` Geert Uytterhoeven
2020-02-16 19:54                       ` Chris Paterson
2020-02-16 19:54                         ` [cip-dev] " Chris Paterson
2020-02-16 19:54                         ` Chris Paterson
2020-02-16 20:38                         ` Arnd Bergmann
2020-02-16 20:38                           ` [cip-dev] " Arnd Bergmann
2020-02-16 20:38                           ` Arnd Bergmann
2020-02-16 20:38                           ` Arnd Bergmann
2020-02-20 14:35                           ` Chris Paterson
2020-02-20 14:35                             ` [cip-dev] " Chris Paterson
2020-02-20 14:35                             ` Chris Paterson
2020-02-26 18:04                 ` santosh.shilimkar
2020-02-26 18:04                   ` santosh.shilimkar
2020-02-26 21:01                   ` Arnd Bergmann
2020-02-26 21:01                     ` Arnd Bergmann
2020-02-26 21:01                     ` Arnd Bergmann
2020-02-26 21:11                     ` santosh.shilimkar
2020-02-26 21:11                       ` santosh.shilimkar
2020-03-06 20:34                       ` Nishanth Menon
2020-03-06 20:34                         ` Nishanth Menon
2020-03-07  1:08                         ` santosh.shilimkar
2020-03-07  1:08                           ` santosh.shilimkar
2020-03-08 10:58                         ` Arnd Bergmann
2020-03-08 10:58                           ` Arnd Bergmann
2020-03-08 10:58                           ` Arnd Bergmann
2020-03-08 14:19                           ` Russell King - ARM Linux admin
2020-03-08 14:19                             ` Russell King - ARM Linux admin
2020-03-09 13:33                             ` Arnd Bergmann
2020-03-09 13:33                               ` Arnd Bergmann
2020-03-09 13:33                               ` Arnd Bergmann
2020-03-09 14:04                               ` Russell King - ARM Linux admin
2020-03-09 14:04                                 ` Russell King - ARM Linux admin
2020-03-09 15:04                                 ` Arnd Bergmann
2020-03-09 15:04                                   ` Arnd Bergmann
2020-03-09 15:04                                   ` Arnd Bergmann
2020-03-10  9:16                                   ` Michal Hocko
2020-03-10  9:16                                     ` Michal Hocko
2020-03-09 15:59                           ` Catalin Marinas
2020-03-09 15:59                             ` Catalin Marinas
2020-03-09 16:09                             ` Russell King - ARM Linux admin
2020-03-09 16:09                               ` Russell King - ARM Linux admin
2020-03-09 16:57                               ` Catalin Marinas
2020-03-09 16:57                                 ` Catalin Marinas
2020-03-09 19:46                               ` Arnd Bergmann [this message]
2020-03-09 19:46                                 ` Arnd Bergmann
2020-03-09 19:46                                 ` Arnd Bergmann
2020-03-11 14:29                                 ` Catalin Marinas
2020-03-11 14:29                                   ` Catalin Marinas
2020-03-11 16:59                                   ` Arnd Bergmann
2020-03-11 16:59                                     ` Arnd Bergmann
2020-03-11 16:59                                     ` Arnd Bergmann
2020-03-11 17:26                                     ` Catalin Marinas
2020-03-11 17:26                                       ` Catalin Marinas
2020-03-11 22:21                                       ` Arnd Bergmann
2020-03-11 22:21                                         ` Arnd Bergmann
2020-03-11 22:21                                         ` Arnd Bergmann
2020-02-12  3:58         ` Matthew Wilcox
2020-02-12  8:09         ` Michal Hocko
2020-02-17 13:31         ` Pavel Machek
2020-02-12 16:35       ` Johannes Weiner
2020-02-12 18:26         ` Andrew Morton
2020-02-12 18:52           ` Johannes Weiner
2020-02-12 12:25 ` Yafang Shao
2020-02-12 12:25   ` Yafang Shao
2020-02-12 16:42   ` Johannes Weiner
2020-02-13  1:47     ` Yafang Shao
2020-02-13  1:47       ` Yafang Shao
2020-02-13 13:46       ` Johannes Weiner
2020-02-14  2:02         ` Yafang Shao
2020-02-14  2:02           ` Yafang Shao
2020-02-13 18:34 ` [PATCH v2] " Johannes Weiner
2020-02-14 16:53 ` [PATCH] " kbuild test robot
2020-02-14 16:53   ` kbuild test robot
2020-02-14 21:30 ` kbuild test robot
2020-02-14 21:30   ` kbuild test robot
2020-02-14 21:30 ` [PATCH] vfs: fix boolreturn.cocci warnings kbuild test robot
2020-02-14 21:30   ` kbuild test robot
2020-05-12 21:29 ` [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU Johannes Weiner
2020-05-13  1:32   ` Yafang Shao
2020-05-13  1:32     ` Yafang Shao
2020-05-13 13:00     ` Johannes Weiner
2020-05-13 21:15   ` Andrew Morton
2020-05-14 11:27     ` Johannes Weiner
2020-05-14  2:24   ` Andrew Morton
2020-05-14 10:37     ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAK8P3a2yyJLmkifpSabMwtUiAvumMPwLEzT5RpsBA=LYn=ZXUw@mail.gmail.com' \
    --to=arnd@arndb.de \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=david@fromorbit.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=kishon@ti.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=mhocko@suse.com \
    --cc=nm@ti.com \
    --cc=riel@surriel.com \
    --cc=santosh.shilimkar@oracle.com \
    --cc=ssantosh@kernel.org \
    --cc=t-kristo@ti.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.