From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-api@vger.kernel.org, oleksandr@redhat.com,
	Suren Baghdasaryan <surenb@google.com>,
	Tim Murray <timmurray@google.com>,
	Daniel Colascione <dancol@google.com>,
	Sandeep Patil <sspatil@google.com>,
	Sonny Rao <sonnyrao@google.com>,
	Brian Geffon <bgeffon@google.com>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	John Dias <joaodias@google.com>,
	christian.brauner@ubuntu.com, sjpark@amazon.de,
	Minchan Kim <minchan@google.com>
Subject: Re: [PATCH v2 4/5] mm/madvise: allow KSM hints for remote API
Date: Fri, 17 Jan 2020 13:13:14 +0300
Message-ID: <37338e14-5a55-1926-b6c1-5f98b6a6fdb5@virtuozzo.com>
In-Reply-To: <20200116235953.163318-5-minchan@kernel.org>

On 17.01.2020 02:59, Minchan Kim wrote:
> From: Oleksandr Natalenko <oleksandr@redhat.com>
> 
> It all began with the fact that KSM works only on memory that is marked
> by madvise(). And the only way to get around that is to either:
> 
>   * use LD_PRELOAD; or
>   * patch the kernel with something like UKSM or PKSM.
> 
> (I intentionally skip the ptrace can of worms here.)
> 
> To overcome this restriction, let's employ a new remote madvise API. It
> can be used by a small userspace helper daemon that does the auto-KSM
> job for us.
> 
> I think of two major consumers of remote KSM hints:
> 
>   * hosts that run containers, especially similar ones in a trusted
>     environment that share the same runtime, such as Node.js;
> 
>   * heavy applications that can be run in multiple instances: not only
>     open-source ones like Firefox, but also those that cannot be
>     modified because they are binary-only and possibly statically linked.
> 
> Speaking of statistics, more numbers can be found in the very first
> submission related to this one [1]. In my current setup with two
> Firefox instances, I save 100 to 200 MiB for the second instance,
> depending on the number of tabs.
> 
> 1 FF instance with 15 tabs:
> 
>    $ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
>    410
> 
> 2 FF instances, second one has 12 tabs (all the tabs are different):
> 
>    $ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
>    592
> 
> At the moment I do not have specific numbers for containerised
> workloads, but they should be comparable when the containers share
> a similar or identical runtime.
> 
> [1] https://lore.kernel.org/patchwork/patch/1012142/
> 
> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> Signed-off-by: Minchan Kim <minchan@google.com>
> ---
>  mm/madvise.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 84cffd0900f1..89557998d287 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1000,6 +1000,8 @@ process_madvise_behavior_valid(int behavior)
>  	switch (behavior) {
>  	case MADV_COLD:
>  	case MADV_PAGEOUT:
> +	case MADV_MERGEABLE:
> +	case MADV_UNMERGEABLE:
>  		return true;
>  	default:
>  		return false;

Remote madvise with KSM hints should be OK.

One thing: madvise_behavior_valid() places MADV_MERGEABLE/UNMERGEABLE
inside #ifdef brackets, so the madvise() syscall returns -EINVAL if KSM
is not enabled. We should do the same here for symmetry.
