From: shuah <shuah@kernel.org>
To: Mina Almasry <almasrymina@google.com>, mike.kravetz@oracle.com
Cc: rientjes@google.com, shakeelb@google.com, gthelen@google.com,
	akpm@linux-foundation.org, khalid.aziz@oracle.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org, cgroups@vger.kernel.org,
	aneesh.kumar@linux.vnet.ibm.com, mkoutny@suse.com,
	Hillf Danton <hdanton@sina.com>, shuah <shuah@kernel.org>
Subject: Re: [PATCH v4 9/9] hugetlb_cgroup: Add hugetlb_cgroup reservation docs
Date: Mon, 16 Sep 2019 19:58:20 -0600
Message-ID: <9fdff535-5f36-ca91-3905-630c18858170@kernel.org>
In-Reply-To: <20190910233146.206080-10-almasrymina@google.com>

On 9/10/19 5:31 PM, Mina Almasry wrote:
> Add docs for how to use hugetlb_cgroup reservations, and their behavior.
> 
> Signed-off-by: Mina Almasry <almasrymina@google.com>
> Acked-by: Hillf Danton <hdanton@sina.com>
> ---
>   .../admin-guide/cgroup-v1/hugetlb.rst         | 84 ++++++++++++++++---
>   1 file changed, 73 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v1/hugetlb.rst b/Documentation/admin-guide/cgroup-v1/hugetlb.rst
> index a3902aa253a96..cc6eb859fc722 100644
> --- a/Documentation/admin-guide/cgroup-v1/hugetlb.rst
> +++ b/Documentation/admin-guide/cgroup-v1/hugetlb.rst
> @@ -2,13 +2,6 @@
>   HugeTLB Controller
>   ==================
> 
> -The HugeTLB controller allows to limit the HugeTLB usage per control group and
> -enforces the controller limit during page fault. Since HugeTLB doesn't
> -support page reclaim, enforcing the limit at page fault time implies that,
> -the application will get SIGBUS signal if it tries to access HugeTLB pages
> -beyond its limit. This requires the application to know beforehand how much
> -HugeTLB pages it would require for its use.
> -
>   HugeTLB controller can be created by first mounting the cgroup filesystem.
> 
>   # mount -t cgroup -o hugetlb none /sys/fs/cgroup
> @@ -28,10 +21,14 @@ process (bash) into it.
> 
>   Brief summary of control files::
> 
> - hugetlb.<hugepagesize>.limit_in_bytes     # set/show limit of "hugepagesize" hugetlb usage
> - hugetlb.<hugepagesize>.max_usage_in_bytes # show max "hugepagesize" hugetlb  usage recorded
> - hugetlb.<hugepagesize>.usage_in_bytes     # show current usage for "hugepagesize" hugetlb
> - hugetlb.<hugepagesize>.failcnt		   # show the number of allocation failure due to HugeTLB limit
> + hugetlb.<hugepagesize>.reservation_limit_in_bytes     # set/show limit of "hugepagesize" hugetlb reservations
> + hugetlb.<hugepagesize>.reservation_max_usage_in_bytes # show max "hugepagesize" hugetlb reservations recorded
> + hugetlb.<hugepagesize>.reservation_usage_in_bytes     # show current reservations for "hugepagesize" hugetlb
> + hugetlb.<hugepagesize>.reservation_failcnt            # show the number of allocation failure due to HugeTLB reservation limit
> + hugetlb.<hugepagesize>.limit_in_bytes                 # set/show limit of "hugepagesize" hugetlb faults
> + hugetlb.<hugepagesize>.max_usage_in_bytes             # show max "hugepagesize" hugetlb  usage recorded
> + hugetlb.<hugepagesize>.usage_in_bytes                 # show current usage for "hugepagesize" hugetlb
> + hugetlb.<hugepagesize>.failcnt                        # show the number of allocation failure due to HugeTLB usage limit
> 
>   For a system supporting three hugepage sizes (64k, 32M and 1G), the control
>   files include::
> @@ -40,11 +37,76 @@ files include::
>     hugetlb.1GB.max_usage_in_bytes
>     hugetlb.1GB.usage_in_bytes
>     hugetlb.1GB.failcnt
> +  hugetlb.1GB.reservation_limit_in_bytes
> +  hugetlb.1GB.reservation_max_usage_in_bytes
> +  hugetlb.1GB.reservation_usage_in_bytes
> +  hugetlb.1GB.reservation_failcnt
>     hugetlb.64KB.limit_in_bytes
>     hugetlb.64KB.max_usage_in_bytes
>     hugetlb.64KB.usage_in_bytes
>     hugetlb.64KB.failcnt
> +  hugetlb.64KB.reservation_limit_in_bytes
> +  hugetlb.64KB.reservation_max_usage_in_bytes
> +  hugetlb.64KB.reservation_usage_in_bytes
> +  hugetlb.64KB.reservation_failcnt
>     hugetlb.32MB.limit_in_bytes
>     hugetlb.32MB.max_usage_in_bytes
>     hugetlb.32MB.usage_in_bytes
>     hugetlb.32MB.failcnt
> +  hugetlb.32MB.reservation_limit_in_bytes
> +  hugetlb.32MB.reservation_max_usage_in_bytes
> +  hugetlb.32MB.reservation_usage_in_bytes
> +  hugetlb.32MB.reservation_failcnt
> +
> +
> +1. Reservation limits
> +
> +The HugeTLB controller allows to limit the HugeTLB reservations per control
> +group and enforces the controller limit at reservation time. Reservation limits
> +are superior to Page fault limits (see section 2), since Reservation limits are
> +enforced at reservation time, and never causes the application to get SIGBUS
> +signal. Instead, if the application is violating its limits, then it gets an
> +error on reservation time, i.e. the mmap or shmget return an error.
> +
> +
> +2. Page fault limits
> +
> +The HugeTLB controller allows to limit the HugeTLB usage (page fault) per
> +control group and enforces the controller limit during page fault. Since HugeTLB
> +doesn't support page reclaim, enforcing the limit at page fault time implies
> +that, the application will get SIGBUS signal if it tries to access HugeTLB
> +pages beyond its limit. This requires the application to know beforehand how
> +much HugeTLB pages it would require for its use.
> +
> +
> +3. Caveats with shared memory
> +
> +a. Charging and uncharging:
> +
> +For shared hugetlb memory, both hugetlb reservation and usage (page faults) are
> +charged to the first task that causes the memory to be reserved or faulted,
> +and all subsequent uses of this reserved or faulted memory is done without
> +charging.
> +
> +Shared hugetlb memory is only uncharged when it is unreseved or deallocated.

Spelling: "unreseved" should be "unreserved".

> +This is usually when the hugetlbfs file is deleted, and not when the task that
> +caused the reservation or fault has exited.
> +
> +b. Interaction between reservation limit and fault limit.
> +
> +Generally, it's not recommended to set both of the reservation limit and fault
> +limit in a cgroup. For private memory, the fault usage cannot exceed the
> +reservation usage, so if you set both, one of those limits will be useless.
> +

Is this enforced? What happens when an attempt is made to set a fault limit
on a cgroup that already has a reservation limit, and vice versa?
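
For private mappings the quoted claim comes down to a single inequality.
Here is a rough sketch of it, taking the statement "fault usage cannot
exceed the reservation usage" at face value. The function name and the
page-count units are made up for illustration; this is not kernel code.

```python
def fault_limit_reachable(reservation_limit: int, fault_limit: int) -> bool:
    """Return True if a private mapping could ever hit the fault limit.

    Assumption (from the quoted text): fault usage <= reservation usage,
    and reservations beyond reservation_limit are refused up front.
    Both limits are expressed in pages for simplicity.
    """
    # The most pages that can ever be faulted is the most that can be reserved.
    max_faultable = reservation_limit
    return max_faultable > fault_limit


print(fault_limit_reachable(4, 8))  # False: the fault limit can never trigger
print(fault_limit_reachable(8, 4))  # True: faults can fail before reservations do
```

If that reading is right, only the smaller of the two limits ever matters
for private memory, which is presumably what "useless" means in the text.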

> +For shared memory, a cgroup's fault usage may be greater than its reservation
> +usage, so some care needs to be taken. Consider this example:
> +
> +- Task A reserves 4 pages in a shared hugetlbfs file. Cgroup A will get
> +  4 reservations charged to it and no faults charged to it.
> +- Task B reserves and faults the same 4 pages as Task A. Cgroup B will get no
> +  reservation charge, but will get charged 4 faulted pages. If Cgroup B's limit
> +  is less than 4, then Task B will get a SIGBUS.
> +
> +For the above scenario, it's not recommended for the userspace to set both
> +reservation limits and fault limits, but it is still allowed to in case it sees
> +some use for it.

In what scenarios would setting both limits be useful? Please explain.
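
To check my understanding of the Task A / Task B example, here is a toy
model of the accounting as the doc describes it: each shared page is
charged once, to whichever cgroup first reserves it or first faults it.
The class names, the per-page bookkeeping, and limits counted in pages
are all hypothetical stand-ins, not how the kernel implements it.

```python
from dataclasses import dataclass


@dataclass
class Cgroup:
    name: str
    fault_limit: int   # in pages; hypothetical stand-in for limit_in_bytes
    reserved: int = 0  # pages charged to this cgroup as reservations
    faulted: int = 0   # pages charged to this cgroup as faults


class SharedFile:
    """Toy shared hugetlbfs file: each page is charged to its first user."""

    def __init__(self) -> None:
        self.reserved_by = {}  # page index -> cgroup charged for the reservation
        self.faulted_by = {}   # page index -> cgroup charged for the fault

    def reserve(self, cg: Cgroup, pages) -> None:
        for p in pages:
            if p not in self.reserved_by:  # only the first reserver is charged
                self.reserved_by[p] = cg
                cg.reserved += 1

    def fault(self, cg: Cgroup, pages) -> bool:
        """Return False at the point where the real task would get SIGBUS."""
        for p in pages:
            if p in self.faulted_by:
                continue  # already faulted in by someone: no new charge
            if cg.faulted + 1 > cg.fault_limit:
                return False
            self.faulted_by[p] = cg
            cg.faulted += 1
        return True


cgroup_a = Cgroup("A", fault_limit=4)
cgroup_b = Cgroup("B", fault_limit=2)  # less than the 4 pages in the example
shared = SharedFile()

shared.reserve(cgroup_a, range(4))     # Task A reserves 4 pages
shared.reserve(cgroup_b, range(4))     # Task B reserves the same 4 pages
ok = shared.fault(cgroup_b, range(4))  # Task B faults them

print(cgroup_a.reserved, cgroup_a.faulted)      # 4 0
print(cgroup_b.reserved, cgroup_b.faulted, ok)  # 0 2 False
```

In this model Cgroup B carries no reservation charge yet still runs into
its fault limit, which matches the SIGBUS case in the example.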
> --
> 2.23.0.162.g0b9fbb3734-goog
> 

thanks,
-- Shuah
