From: Ryan Roberts <ryan.roberts@arm.com>
To: David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Yin Fengwei <fengwei.yin@intel.com>, Yu Zhao <yuzhao@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Itaru Kitayama <itaru.kitayama@gmail.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	John Hubbard <jhubbard@nvidia.com>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>, Hugh Dickins <hughd@google.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Barry Song <21cnbao@gmail.com>,
	Alistair Popple <apopple@nvidia.com>
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH v9 03/10] mm: thp: Introduce multi-size THP sysfs interface
Date: Tue, 12 Dec 2023 15:32:29 +0000
Message-ID: <e424982c-8a2f-4c98-83aa-fdb0ee765776@arm.com>
In-Reply-To: <ff7a3e9c-53cb-4283-9298-781d4fb7c7f8@redhat.com>

On 12/12/2023 14:54, David Hildenbrand wrote:
> On 07.12.23 17:12, Ryan Roberts wrote:
>> In preparation for adding support for anonymous multi-size THP,
>> introduce new sysfs structure that will be used to control the new
>> behaviours. A new directory is added under transparent_hugepage for each
>> supported THP size, and contains an `enabled` file, which can be set to
>> "inherit" (to inherit the global setting), "always", "madvise" or
>> "never". For now, the kernel still only supports PMD-sized anonymous
>> THP, so only 1 directory is populated.
>>
>> The first half of the change converts transhuge_vma_suitable() and
>> hugepage_vma_check() so that they take a bitfield of orders for which
>> the user wants to determine support, and the functions filter out all
>> the orders that can't be supported, given the current sysfs
>> configuration and the VMA dimensions. The resulting functions are
>> renamed to thp_vma_suitable_orders() and thp_vma_allowable_orders()
>> respectively. Convenience functions that take a single, unencoded order
>> and return a boolean are also defined as thp_vma_suitable_order() and
>> thp_vma_allowable_order().
>>
>> The second half of the change implements the new sysfs interface. It has
>> been done so that each supported THP size has a `struct thpsize`, which
>> describes the relevant metadata and is itself a kobject. This is pretty
>> minimal for now, but should make it easy to add new per-thpsize files to
>> the interface if needed in future (e.g. per-size defrag). Rather than
>> keep the `enabled` state directly in the struct thpsize, I've elected to
>> directly encode it into huge_anon_orders_[always|madvise|inherit]
>> bitfields since this reduces the amount of work required in
>> thp_vma_allowable_orders() which is called for every page fault.
>>
>> See Documentation/admin-guide/mm/transhuge.rst, as modified by this
>> commit, for details of how the new sysfs interface works.
>>
>> Reviewed-by: Barry Song <v-songbaohua@oppo.com>
>> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> Tested-by: John Hubbard <jhubbard@nvidia.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
> 
> [...]
> 
>> +
>> +static ssize_t thpsize_enabled_store(struct kobject *kobj,
>> +                     struct kobj_attribute *attr,
>> +                     const char *buf, size_t count)
>> +{
>> +    int order = to_thpsize(kobj)->order;
>> +    ssize_t ret = count;
>> +
>> +    if (sysfs_streq(buf, "always")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_always);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "inherit")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_inherit);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "madvise")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        set_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "never")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
> 
> Why not perform lock/unlock only once in surrounding code? :)

I was nervous that sysfs_streq() may be unhappy in atomic context... Unfounded?
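
For illustration, one way to take the lock only once without having to answer the atomic-context question is to parse the string first and only then enter the critical section. The sketch below is a userspace analogue, not the kernel patch: streq_nl() is a stand-in that mirrors the trailing-newline behaviour of sysfs_streq(), the atomic_flag spinlock stands in for huge_anon_orders_lock, and enum thp_mode, parse_mode() and enabled_store() are hypothetical names. For what it is worth, sysfs_streq() itself only walks the two strings and does not sleep, so calling it under a spinlock should also be fine.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical parsed form of the string written to `enabled`. */
enum thp_mode { MODE_NEVER, MODE_ALWAYS, MODE_INHERIT, MODE_MADVISE, MODE_INVALID };

/* Userspace stand-ins for huge_anon_orders_lock and the three bitfields. */
static atomic_flag orders_lock = ATOMIC_FLAG_INIT;
static unsigned long orders_always, orders_inherit, orders_madvise;

/* Stand-in for sysfs_streq(): equal, ignoring one trailing newline. */
static bool streq_nl(const char *s1, const char *s2)
{
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}
	if (*s1 == *s2)
		return true;
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}

/* All string handling happens here, before any lock is held. */
static enum thp_mode parse_mode(const char *buf)
{
	if (streq_nl(buf, "always"))
		return MODE_ALWAYS;
	if (streq_nl(buf, "inherit"))
		return MODE_INHERIT;
	if (streq_nl(buf, "madvise"))
		return MODE_MADVISE;
	if (streq_nl(buf, "never"))
		return MODE_NEVER;
	return MODE_INVALID;
}

static ssize_t enabled_store(int order, const char *buf, size_t count)
{
	enum thp_mode mode = parse_mode(buf);
	unsigned long bit;

	assert(order >= 0 && order < (int)(8 * sizeof(unsigned long)));
	if (mode == MODE_INVALID)
		return -EINVAL;
	bit = 1UL << order;

	/* Single lock/unlock pair around the whole state transition. */
	while (atomic_flag_test_and_set_explicit(&orders_lock, memory_order_acquire))
		;
	orders_always &= ~bit;
	orders_inherit &= ~bit;
	orders_madvise &= ~bit;
	if (mode == MODE_ALWAYS)
		orders_always |= bit;
	else if (mode == MODE_INHERIT)
		orders_inherit |= bit;
	else if (mode == MODE_MADVISE)
		orders_madvise |= bit;
	atomic_flag_clear_explicit(&orders_lock, memory_order_release);

	return (ssize_t)count;
}
```

Clearing all three bits and then setting at most one keeps each order in exactly one state ("never" being the state with no bit set), which is the same invariant the per-branch version maintains.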

> 
> 
> Much better
> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 
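
As a side note on the quoted commit message: keeping the per-size state in three bitfields means the fault path can resolve the sysfs configuration with a few mask operations. The following is a minimal userspace sketch of that filtering step only, under the assumption that "inherit" defers to a global always/madvise/never setting and that the VMA is either madvised or not. struct thp_config, enum global_mode, allowable_orders() and allowable_order() are hypothetical names for illustration; the real thp_vma_allowable_orders() also considers the VMA itself, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the top-level `enabled` setting. */
enum global_mode { GLOBAL_NEVER, GLOBAL_MADVISE, GLOBAL_ALWAYS };

/* Hypothetical snapshot of the per-size bitfields; bit N means order N. */
struct thp_config {
	unsigned long always;   /* sizes set to "always" */
	unsigned long madvise;  /* sizes set to "madvise" */
	unsigned long inherit;  /* sizes set to "inherit" */
	enum global_mode global;
};

/*
 * Filter a bitfield of candidate orders down to those the configuration
 * permits for a VMA that is (or is not) marked with MADV_HUGEPAGE.
 */
static unsigned long allowable_orders(const struct thp_config *cfg,
				      bool vma_madvised, unsigned long orders)
{
	unsigned long mask = cfg->always;

	if (vma_madvised)
		mask |= cfg->madvise;
	if (cfg->global == GLOBAL_ALWAYS ||
	    (vma_madvised && cfg->global == GLOBAL_MADVISE))
		mask |= cfg->inherit;

	return orders & mask;
}

/* Convenience wrapper taking a single, unencoded order. */
static bool allowable_order(const struct thp_config *cfg,
			    bool vma_madvised, int order)
{
	assert(order >= 0 && order < (int)(8 * sizeof(unsigned long)));
	return allowable_orders(cfg, vma_madvised, 1UL << order) != 0;
}
```

With this layout a size set to "never" simply has no bit in any of the three fields, so it drops out of the mask without a separate check.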

