linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
@ 2019-05-14 13:16 Oleksandr Natalenko
  2019-05-14 13:16 ` [PATCH RFC v2 1/4] mm/ksm: introduce ksm_enter() helper Oleksandr Natalenko
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-14 13:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kirill Tkhai, Vlastimil Babka, Michal Hocko, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm

By default, KSM works only on memory that is marked by madvise(). And the
only way to get around that is to either:

  * use LD_PRELOAD; or
  * patch the kernel with something like UKSM or PKSM.

Instead, lets implement a sysfs knob, which allows marking VMAs as
mergeable. This can be used manually on some task in question or by some
small userspace helper daemon.

The knob is named "force_madvise", and it is write-only. It accepts a PID
to act on. To mark the VMAs as mergeable, use:

   # echo PID > /sys/kernel/mm/ksm/force_madvise

To unmerge all the VMAs, use the same approach, prepending the PID with
the "minus" sign:

   # echo -PID > /sys/kernel/mm/ksm/force_madvise

This patchset is based on earlier Timofey's submission [1], but it doesn't
use dedicated kthread to walk through the list of tasks/VMAs. Instead,
it is up to userspace to traverse all the tasks in /proc if needed.

The previous suggestion [2] was based on amending do_anonymous_page()
handler to implement fully automatic mode, but this approach was
incorrect due to improper locking and not desired due to excessive
complexity.

The current approach just implements minimal interface and leaves the
decision on how and when to act to userspace.

Thanks.

[1] https://lore.kernel.org/patchwork/patch/1012142/
[2] http://lkml.iu.edu/hypermail/linux/kernel/1905.1/02417.html

Oleksandr Natalenko (4):
  mm/ksm: introduce ksm_enter() helper
  mm/ksm: introduce ksm_leave() helper
  mm/ksm: introduce force_madvise knob
  mm/ksm: add force merging/unmerging documentation

 Documentation/admin-guide/mm/ksm.rst |  11 ++
 mm/ksm.c                             | 160 +++++++++++++++++++++------
 2 files changed, 137 insertions(+), 34 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC v2 1/4] mm/ksm: introduce ksm_enter() helper
  2019-05-14 13:16 [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Oleksandr Natalenko
@ 2019-05-14 13:16 ` Oleksandr Natalenko
  2019-05-14 13:16 ` [PATCH RFC v2 2/4] mm/ksm: introduce ksm_leave() helper Oleksandr Natalenko
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-14 13:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kirill Tkhai, Vlastimil Babka, Michal Hocko, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm

Move MADV_MERGEABLE part of ksm_madvise() into a dedicated helper since
it will be further used for marking VMAs to be merged forcibly.

This does not bring any functional changes.

Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
---
 mm/ksm.c | 60 +++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index fc64874dc6f4..02fdbee394cc 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2442,41 +2442,53 @@ static int ksm_scan_thread(void *nothing)
 	return 0;
 }
 
-int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
-		unsigned long end, int advice, unsigned long *vm_flags)
+static int ksm_enter(struct mm_struct *mm, struct vm_area_struct *vma,
+		unsigned long *vm_flags)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int err;
 
-	switch (advice) {
-	case MADV_MERGEABLE:
-		/*
-		 * Be somewhat over-protective for now!
-		 */
-		if (*vm_flags & (VM_MERGEABLE | VM_SHARED  | VM_MAYSHARE   |
-				 VM_PFNMAP    | VM_IO      | VM_DONTEXPAND |
-				 VM_HUGETLB | VM_MIXEDMAP))
-			return 0;		/* just ignore the advice */
+	/*
+	 * Be somewhat over-protective for now!
+	 */
+	if (*vm_flags & (VM_MERGEABLE | VM_SHARED  | VM_MAYSHARE   |
+			 VM_PFNMAP    | VM_IO      | VM_DONTEXPAND |
+			 VM_HUGETLB | VM_MIXEDMAP))
+		return 0;		/* just ignore the advice */
 
-		if (vma_is_dax(vma))
-			return 0;
+	if (vma_is_dax(vma))
+		return 0;
 
 #ifdef VM_SAO
-		if (*vm_flags & VM_SAO)
-			return 0;
+	if (*vm_flags & VM_SAO)
+		return 0;
 #endif
 #ifdef VM_SPARC_ADI
-		if (*vm_flags & VM_SPARC_ADI)
-			return 0;
+	if (*vm_flags & VM_SPARC_ADI)
+		return 0;
 #endif
 
-		if (!test_bit(MMF_VM_MERGEABLE, &mm->flags)) {
-			err = __ksm_enter(mm);
-			if (err)
-				return err;
-		}
+	if (!test_bit(MMF_VM_MERGEABLE, &mm->flags)) {
+		err = __ksm_enter(mm);
+		if (err)
+			return err;
+	}
+
+	*vm_flags |= VM_MERGEABLE;
+
+	return 0;
+}
 
-		*vm_flags |= VM_MERGEABLE;
+int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
+		unsigned long end, int advice, unsigned long *vm_flags)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	int err;
+
+	switch (advice) {
+	case MADV_MERGEABLE:
+		err = ksm_enter(mm, vma, vm_flags);
+		if (err)
+			return err;
 		break;
 
 	case MADV_UNMERGEABLE:
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC v2 2/4] mm/ksm: introduce ksm_leave() helper
  2019-05-14 13:16 [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Oleksandr Natalenko
  2019-05-14 13:16 ` [PATCH RFC v2 1/4] mm/ksm: introduce ksm_enter() helper Oleksandr Natalenko
@ 2019-05-14 13:16 ` Oleksandr Natalenko
  2019-05-14 13:16 ` [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob Oleksandr Natalenko
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-14 13:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kirill Tkhai, Vlastimil Babka, Michal Hocko, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm

Move MADV_UNMERGEABLE part of ksm_madvise() into a dedicated helper
since it will be further used for unmerging VMAs forcibly.

This does not bring any functional changes.

Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
---
 mm/ksm.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 02fdbee394cc..e9f3901168bb 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2478,6 +2478,25 @@ static int ksm_enter(struct mm_struct *mm, struct vm_area_struct *vma,
 	return 0;
 }
 
+static int ksm_leave(struct vm_area_struct *vma, unsigned long start,
+		unsigned long end, unsigned long *vm_flags)
+{
+	int err;
+
+	if (!(*vm_flags & VM_MERGEABLE))
+		return 0;		/* just ignore the advice */
+
+	if (vma->anon_vma) {
+		err = unmerge_ksm_pages(vma, start, end);
+		if (err)
+			return err;
+	}
+
+	*vm_flags &= ~VM_MERGEABLE;
+
+	return 0;
+}
+
 int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 		unsigned long end, int advice, unsigned long *vm_flags)
 {
@@ -2492,16 +2511,9 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 		break;
 
 	case MADV_UNMERGEABLE:
-		if (!(*vm_flags & VM_MERGEABLE))
-			return 0;		/* just ignore the advice */
-
-		if (vma->anon_vma) {
-			err = unmerge_ksm_pages(vma, start, end);
-			if (err)
-				return err;
-		}
-
-		*vm_flags &= ~VM_MERGEABLE;
+		err = ksm_leave(vma, start, end, vm_flags);
+		if (err)
+			return err;
 		break;
 	}
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob
  2019-05-14 13:16 [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Oleksandr Natalenko
  2019-05-14 13:16 ` [PATCH RFC v2 1/4] mm/ksm: introduce ksm_enter() helper Oleksandr Natalenko
  2019-05-14 13:16 ` [PATCH RFC v2 2/4] mm/ksm: introduce ksm_leave() helper Oleksandr Natalenko
@ 2019-05-14 13:16 ` Oleksandr Natalenko
  2019-05-14 13:22   ` Aaron Tomlin
  2019-05-14 13:16 ` [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation Oleksandr Natalenko
  2019-05-14 14:41 ` [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Michal Hocko
  4 siblings, 1 reply; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-14 13:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kirill Tkhai, Vlastimil Babka, Michal Hocko, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm

Present a new sysfs knob to mark task's anonymous memory as mergeable.

To force merging task's VMAs, its PID is echoed in a write-only file:

   # echo PID > /sys/kernel/mm/ksm/force_madvise

Force unmerging is done similarly, but with "minus" sign:

   # echo -PID > /sys/kernel/mm/ksm/force_madvise

"0" or "-0" can be used to control the current task.

To achieve this, previously introduced ksm_enter()/ksm_leave() helpers
are used in the "store" handler.

Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
---
 mm/ksm.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index e9f3901168bb..22c59fb03d3a 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2879,10 +2879,77 @@ static void wait_while_offlining(void)
 
 #define KSM_ATTR_RO(_name) \
 	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+#define KSM_ATTR_WO(_name) \
+	static struct kobj_attribute _name##_attr = __ATTR_WO(_name)
 #define KSM_ATTR(_name) \
 	static struct kobj_attribute _name##_attr = \
 		__ATTR(_name, 0644, _name##_show, _name##_store)
 
+static ssize_t force_madvise_store(struct kobject *kobj,
+				     struct kobj_attribute *attr,
+				     const char *buf, size_t count)
+{
+	int err;
+	pid_t pid;
+	bool merge = true;
+	struct task_struct *tsk;
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+
+	err = kstrtoint(buf, 10, &pid);
+	if (err)
+		return -EINVAL;
+
+	if (pid < 0) {
+		pid = abs(pid);
+		merge = false;
+	}
+
+	if (!pid && *buf == '-')
+		merge = false;
+
+	rcu_read_lock();
+	if (pid) {
+		tsk = find_task_by_vpid(pid);
+		if (!tsk) {
+			err = -ESRCH;
+			rcu_read_unlock();
+			goto out;
+		}
+	} else {
+		tsk = current;
+	}
+
+	tsk = tsk->group_leader;
+
+	get_task_struct(tsk);
+	rcu_read_unlock();
+
+	mm = get_task_mm(tsk);
+	if (!mm) {
+		err = -EINVAL;
+		goto out_put_task_struct;
+	}
+	down_write(&mm->mmap_sem);
+	vma = mm->mmap;
+	while (vma) {
+		if (merge)
+			ksm_enter(vma->vm_mm, vma, &vma->vm_flags);
+		else
+			ksm_leave(vma, vma->vm_start, vma->vm_end, &vma->vm_flags);
+		vma = vma->vm_next;
+	}
+	up_write(&mm->mmap_sem);
+	mmput(mm);
+
+out_put_task_struct:
+	put_task_struct(tsk);
+
+out:
+	return err ? err : count;
+}
+KSM_ATTR_WO(force_madvise);
+
 static ssize_t sleep_millisecs_show(struct kobject *kobj,
 				    struct kobj_attribute *attr, char *buf)
 {
@@ -3185,6 +3252,7 @@ static ssize_t full_scans_show(struct kobject *kobj,
 KSM_ATTR_RO(full_scans);
 
 static struct attribute *ksm_attrs[] = {
+	&force_madvise_attr.attr,
 	&sleep_millisecs_attr.attr,
 	&pages_to_scan_attr.attr,
 	&run_attr.attr,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation
  2019-05-14 13:16 [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Oleksandr Natalenko
                   ` (2 preceding siblings ...)
  2019-05-14 13:16 ` [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob Oleksandr Natalenko
@ 2019-05-14 13:16 ` Oleksandr Natalenko
  2019-05-15  0:53   ` Timofey Titovets
  2019-05-14 14:41 ` [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Michal Hocko
  4 siblings, 1 reply; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-14 13:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kirill Tkhai, Vlastimil Babka, Michal Hocko, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm

Document respective sysfs knob.

Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
---
 Documentation/admin-guide/mm/ksm.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
index 9303786632d1..4302b92910ec 100644
--- a/Documentation/admin-guide/mm/ksm.rst
+++ b/Documentation/admin-guide/mm/ksm.rst
@@ -78,6 +78,17 @@ KSM daemon sysfs interface
 The KSM daemon is controlled by sysfs files in ``/sys/kernel/mm/ksm/``,
 readable by all but writable only by root:
 
+force_madvise
+        write-only control to force merging/unmerging for specific
+        task.
+
+        To mark the VMAs as mergeable, use:
+        ``echo PID > /sys/kernel/mm/ksm/force_madvise``
+
+        To unmerge all the VMAs, use:
+        ``echo -PID > /sys/kernel/mm/ksm/force_madvise``
+        (note the prepending "minus")
+
 pages_to_scan
         how many pages to scan before ksmd goes to sleep
         e.g. ``echo 100 > /sys/kernel/mm/ksm/pages_to_scan``.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob
  2019-05-14 13:16 ` [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob Oleksandr Natalenko
@ 2019-05-14 13:22   ` Aaron Tomlin
  2019-05-15  0:48     ` Timofey Titovets
  0 siblings, 1 reply; 21+ messages in thread
From: Aaron Tomlin @ 2019-05-14 13:22 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Michal Hocko,
	Matthew Wilcox, Pavel Tatashin, Timofey Titovets, Grzegorz Halat,
	linux-mm

On Tue 2019-05-14 15:16 +0200, Oleksandr Natalenko wrote:
> Present a new sysfs knob to mark task's anonymous memory as mergeable.
> 
> To force merging task's VMAs, its PID is echoed in a write-only file:
> 
>    # echo PID > /sys/kernel/mm/ksm/force_madvise
> 
> Force unmerging is done similarly, but with "minus" sign:
> 
>    # echo -PID > /sys/kernel/mm/ksm/force_madvise
> 
> "0" or "-0" can be used to control the current task.
> 
> To achieve this, previously introduced ksm_enter()/ksm_leave() helpers
> are used in the "store" handler.
> 
> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> ---
>  mm/ksm.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index e9f3901168bb..22c59fb03d3a 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -2879,10 +2879,77 @@ static void wait_while_offlining(void)
>  
>  #define KSM_ATTR_RO(_name) \
>  	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
> +#define KSM_ATTR_WO(_name) \
> +	static struct kobj_attribute _name##_attr = __ATTR_WO(_name)
>  #define KSM_ATTR(_name) \
>  	static struct kobj_attribute _name##_attr = \
>  		__ATTR(_name, 0644, _name##_show, _name##_store)
>  
> +static ssize_t force_madvise_store(struct kobject *kobj,
> +				     struct kobj_attribute *attr,
> +				     const char *buf, size_t count)
> +{
> +	int err;
> +	pid_t pid;
> +	bool merge = true;
> +	struct task_struct *tsk;
> +	struct mm_struct *mm;
> +	struct vm_area_struct *vma;
> +
> +	err = kstrtoint(buf, 10, &pid);
> +	if (err)
> +		return -EINVAL;
> +
> +	if (pid < 0) {
> +		pid = abs(pid);
> +		merge = false;
> +	}
> +
> +	if (!pid && *buf == '-')
> +		merge = false;
> +
> +	rcu_read_lock();
> +	if (pid) {
> +		tsk = find_task_by_vpid(pid);
> +		if (!tsk) {
> +			err = -ESRCH;
> +			rcu_read_unlock();
> +			goto out;
> +		}
> +	} else {
> +		tsk = current;
> +	}
> +
> +	tsk = tsk->group_leader;
> +
> +	get_task_struct(tsk);
> +	rcu_read_unlock();
> +
> +	mm = get_task_mm(tsk);
> +	if (!mm) {
> +		err = -EINVAL;
> +		goto out_put_task_struct;
> +	}
> +	down_write(&mm->mmap_sem);
> +	vma = mm->mmap;
> +	while (vma) {
> +		if (merge)
> +			ksm_enter(vma->vm_mm, vma, &vma->vm_flags);
> +		else
> +			ksm_leave(vma, vma->vm_start, vma->vm_end, &vma->vm_flags);
> +		vma = vma->vm_next;
> +	}
> +	up_write(&mm->mmap_sem);
> +	mmput(mm);
> +
> +out_put_task_struct:
> +	put_task_struct(tsk);
> +
> +out:
> +	return err ? err : count;
> +}
> +KSM_ATTR_WO(force_madvise);
> +
>  static ssize_t sleep_millisecs_show(struct kobject *kobj,
>  				    struct kobj_attribute *attr, char *buf)
>  {
> @@ -3185,6 +3252,7 @@ static ssize_t full_scans_show(struct kobject *kobj,
>  KSM_ATTR_RO(full_scans);
>  
>  static struct attribute *ksm_attrs[] = {
> +	&force_madvise_attr.attr,
>  	&sleep_millisecs_attr.attr,
>  	&pages_to_scan_attr.attr,
>  	&run_attr.attr,

Looks fine to me.

Reviewed-by: Aaron Tomlin <atomlin@redhat.com>

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-14 13:16 [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Oleksandr Natalenko
                   ` (3 preceding siblings ...)
  2019-05-14 13:16 ` [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation Oleksandr Natalenko
@ 2019-05-14 14:41 ` Michal Hocko
  2019-05-14 14:51   ` Michal Hocko
  4 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2019-05-14 14:41 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api

[This is adding a new user visible interface so you should be CCing
linux-api mailing list. Also CC Hugh for KSM in general. Done now]

On Tue 14-05-19 15:16:50, Oleksandr Natalenko wrote:
> By default, KSM works only on memory that is marked by madvise(). And the
> only way to get around that is to either:
> 
>   * use LD_PRELOAD; or
>   * patch the kernel with something like UKSM or PKSM.
> 
> Instead, lets implement a sysfs knob, which allows marking VMAs as
> mergeable. This can be used manually on some task in question or by some
> small userspace helper daemon.
> 
> The knob is named "force_madvise", and it is write-only. It accepts a PID
> to act on. To mark the VMAs as mergeable, use:
> 
>    # echo PID > /sys/kernel/mm/ksm/force_madvise
> 
> To unmerge all the VMAs, use the same approach, prepending the PID with
> the "minus" sign:
> 
>    # echo -PID > /sys/kernel/mm/ksm/force_madvise
> 
> This patchset is based on earlier Timofey's submission [1], but it doesn't
> use dedicated kthread to walk through the list of tasks/VMAs. Instead,
> it is up to userspace to traverse all the tasks in /proc if needed.
> 
> The previous suggestion [2] was based on amending do_anonymous_page()
> handler to implement fully automatic mode, but this approach was
> incorrect due to improper locking and not desired due to excessive
> complexity.
> 
> The current approach just implements minimal interface and leaves the
> decision on how and when to act to userspace.

Please make sure to describe a usecase that warrants adding a new
interface we have to maintain for ever.

> 
> Thanks.
> 
> [1] https://lore.kernel.org/patchwork/patch/1012142/
> [2] http://lkml.iu.edu/hypermail/linux/kernel/1905.1/02417.html
> 
> Oleksandr Natalenko (4):
>   mm/ksm: introduce ksm_enter() helper
>   mm/ksm: introduce ksm_leave() helper
>   mm/ksm: introduce force_madvise knob
>   mm/ksm: add force merging/unmerging documentation
> 
>  Documentation/admin-guide/mm/ksm.rst |  11 ++
>  mm/ksm.c                             | 160 +++++++++++++++++++++------
>  2 files changed, 137 insertions(+), 34 deletions(-)
> 
> -- 
> 2.21.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-14 14:41 ` [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Michal Hocko
@ 2019-05-14 14:51   ` Michal Hocko
  2019-05-15  6:25     ` Oleksandr Natalenko
  0 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2019-05-14 14:51 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

[Forgot Hugh]

On Tue 14-05-19 16:41:05, Michal Hocko wrote:
> [This is adding a new user visible interface so you should be CCing
> linux-api mailing list. Also CC Hugh for KSM in general. Done now]
> 
> On Tue 14-05-19 15:16:50, Oleksandr Natalenko wrote:
> > By default, KSM works only on memory that is marked by madvise(). And the
> > only way to get around that is to either:
> > 
> >   * use LD_PRELOAD; or
> >   * patch the kernel with something like UKSM or PKSM.
> > 
> > Instead, lets implement a sysfs knob, which allows marking VMAs as
> > mergeable. This can be used manually on some task in question or by some
> > small userspace helper daemon.
> > 
> > The knob is named "force_madvise", and it is write-only. It accepts a PID
> > to act on. To mark the VMAs as mergeable, use:
> > 
> >    # echo PID > /sys/kernel/mm/ksm/force_madvise
> > 
> > To unmerge all the VMAs, use the same approach, prepending the PID with
> > the "minus" sign:
> > 
> >    # echo -PID > /sys/kernel/mm/ksm/force_madvise
> > 
> > This patchset is based on earlier Timofey's submission [1], but it doesn't
> > use dedicated kthread to walk through the list of tasks/VMAs. Instead,
> > it is up to userspace to traverse all the tasks in /proc if needed.
> > 
> > The previous suggestion [2] was based on amending do_anonymous_page()
> > handler to implement fully automatic mode, but this approach was
> > incorrect due to improper locking and not desired due to excessive
> > complexity.
> > 
> > The current approach just implements minimal interface and leaves the
> > decision on how and when to act to userspace.
> 
> Please make sure to describe a usecase that warrants adding a new
> interface we have to maintain for ever.
> 
> > 
> > Thanks.
> > 
> > [1] https://lore.kernel.org/patchwork/patch/1012142/
> > [2] http://lkml.iu.edu/hypermail/linux/kernel/1905.1/02417.html
> > 
> > Oleksandr Natalenko (4):
> >   mm/ksm: introduce ksm_enter() helper
> >   mm/ksm: introduce ksm_leave() helper
> >   mm/ksm: introduce force_madvise knob
> >   mm/ksm: add force merging/unmerging documentation
> > 
> >  Documentation/admin-guide/mm/ksm.rst |  11 ++
> >  mm/ksm.c                             | 160 +++++++++++++++++++++------
> >  2 files changed, 137 insertions(+), 34 deletions(-)
> > 
> > -- 
> > 2.21.0
> > 
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob
  2019-05-14 13:22   ` Aaron Tomlin
@ 2019-05-15  0:48     ` Timofey Titovets
  0 siblings, 0 replies; 21+ messages in thread
From: Timofey Titovets @ 2019-05-15  0:48 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: Oleksandr Natalenko, Linux Kernel, Kirill Tkhai, Vlastimil Babka,
	Michal Hocko, Matthew Wilcox, Pavel Tatashin, Grzegorz Halat,
	linux-mm

LGTM

Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>

вт, 14 мая 2019 г. в 16:22, Aaron Tomlin <atomlin@redhat.com>:
>
> On Tue 2019-05-14 15:16 +0200, Oleksandr Natalenko wrote:
> > Present a new sysfs knob to mark task's anonymous memory as mergeable.
> >
> > To force merging task's VMAs, its PID is echoed in a write-only file:
> >
> >    # echo PID > /sys/kernel/mm/ksm/force_madvise
> >
> > Force unmerging is done similarly, but with "minus" sign:
> >
> >    # echo -PID > /sys/kernel/mm/ksm/force_madvise
> >
> > "0" or "-0" can be used to control the current task.
> >
> > To achieve this, previously introduced ksm_enter()/ksm_leave() helpers
> > are used in the "store" handler.
> >
> > Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> > ---
> >  mm/ksm.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 68 insertions(+)
> >
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index e9f3901168bb..22c59fb03d3a 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -2879,10 +2879,77 @@ static void wait_while_offlining(void)
> >
> >  #define KSM_ATTR_RO(_name) \
> >       static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
> > +#define KSM_ATTR_WO(_name) \
> > +     static struct kobj_attribute _name##_attr = __ATTR_WO(_name)
> >  #define KSM_ATTR(_name) \
> >       static struct kobj_attribute _name##_attr = \
> >               __ATTR(_name, 0644, _name##_show, _name##_store)
> >
> > +static ssize_t force_madvise_store(struct kobject *kobj,
> > +                                  struct kobj_attribute *attr,
> > +                                  const char *buf, size_t count)
> > +{
> > +     int err;
> > +     pid_t pid;
> > +     bool merge = true;
> > +     struct task_struct *tsk;
> > +     struct mm_struct *mm;
> > +     struct vm_area_struct *vma;
> > +
> > +     err = kstrtoint(buf, 10, &pid);
> > +     if (err)
> > +             return -EINVAL;
> > +
> > +     if (pid < 0) {
> > +             pid = abs(pid);
> > +             merge = false;
> > +     }
> > +
> > +     if (!pid && *buf == '-')
> > +             merge = false;
> > +
> > +     rcu_read_lock();
> > +     if (pid) {
> > +             tsk = find_task_by_vpid(pid);
> > +             if (!tsk) {
> > +                     err = -ESRCH;
> > +                     rcu_read_unlock();
> > +                     goto out;
> > +             }
> > +     } else {
> > +             tsk = current;
> > +     }
> > +
> > +     tsk = tsk->group_leader;
> > +
> > +     get_task_struct(tsk);
> > +     rcu_read_unlock();
> > +
> > +     mm = get_task_mm(tsk);
> > +     if (!mm) {
> > +             err = -EINVAL;
> > +             goto out_put_task_struct;
> > +     }
> > +     down_write(&mm->mmap_sem);
> > +     vma = mm->mmap;
> > +     while (vma) {
> > +             if (merge)
> > +                     ksm_enter(vma->vm_mm, vma, &vma->vm_flags);
> > +             else
> > +                     ksm_leave(vma, vma->vm_start, vma->vm_end, &vma->vm_flags);
> > +             vma = vma->vm_next;
> > +     }
> > +     up_write(&mm->mmap_sem);
> > +     mmput(mm);
> > +
> > +out_put_task_struct:
> > +     put_task_struct(tsk);
> > +
> > +out:
> > +     return err ? err : count;
> > +}
> > +KSM_ATTR_WO(force_madvise);
> > +
> >  static ssize_t sleep_millisecs_show(struct kobject *kobj,
> >                                   struct kobj_attribute *attr, char *buf)
> >  {
> > @@ -3185,6 +3252,7 @@ static ssize_t full_scans_show(struct kobject *kobj,
> >  KSM_ATTR_RO(full_scans);
> >
> >  static struct attribute *ksm_attrs[] = {
> > +     &force_madvise_attr.attr,
> >       &sleep_millisecs_attr.attr,
> >       &pages_to_scan_attr.attr,
> >       &run_attr.attr,
>
> Looks fine to me.
>
> Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
>
> --
> Aaron Tomlin



-- 
Have a nice day,
Timofey.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation
  2019-05-14 13:16 ` [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation Oleksandr Natalenko
@ 2019-05-15  0:53   ` Timofey Titovets
  2019-05-15  6:26     ` Oleksandr Natalenko
  0 siblings, 1 reply; 21+ messages in thread
From: Timofey Titovets @ 2019-05-15  0:53 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: Linux Kernel, Kirill Tkhai, Vlastimil Babka, Michal Hocko,
	Matthew Wilcox, Pavel Tatashin, Aaron Tomlin, Grzegorz Halat,
	linux-mm

LGTM for whole series

Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>

вт, 14 мая 2019 г. в 16:17, Oleksandr Natalenko <oleksandr@redhat.com>:
>
> Document respective sysfs knob.
>
> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> ---
>  Documentation/admin-guide/mm/ksm.rst | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
> index 9303786632d1..4302b92910ec 100644
> --- a/Documentation/admin-guide/mm/ksm.rst
> +++ b/Documentation/admin-guide/mm/ksm.rst
> @@ -78,6 +78,17 @@ KSM daemon sysfs interface
>  The KSM daemon is controlled by sysfs files in ``/sys/kernel/mm/ksm/``,
>  readable by all but writable only by root:
>
> +force_madvise
> +        write-only control to force merging/unmerging for specific
> +        task.
> +
> +        To mark the VMAs as mergeable, use:
> +        ``echo PID > /sys/kernel/mm/ksm/force_madvise``
> +
> +        To unmerge all the VMAs, use:
> +        ``echo -PID > /sys/kernel/mm/ksm/force_madvise``
> +        (note the prepending "minus")
> +
In patch 3/4 you have special case with PID 0,
may be that also must be documented here?

>  pages_to_scan
>          how many pages to scan before ksmd goes to sleep
>          e.g. ``echo 100 > /sys/kernel/mm/ksm/pages_to_scan``.
> --
> 2.21.0
>


--
Have a nice day,
Timofey.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-14 14:51   ` Michal Hocko
@ 2019-05-15  6:25     ` Oleksandr Natalenko
  2019-05-15  6:53       ` Michal Hocko
  0 siblings, 1 reply; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-15  6:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

Hi.

On Tue, May 14, 2019 at 04:51:22PM +0200, Michal Hocko wrote:
> [Forgot Hugh]
> 
> On Tue 14-05-19 16:41:05, Michal Hocko wrote:
> > [This is adding a new user visible interface so you should be CCing
> > linux-api mailing list. Also CC Hugh for KSM in general. Done now]

Right, thanks for taking care of this.

> > On Tue 14-05-19 15:16:50, Oleksandr Natalenko wrote:
> > > By default, KSM works only on memory that is marked by madvise(). And the
> > > only way to get around that is to either:
> > > 
> > >   * use LD_PRELOAD; or
> > >   * patch the kernel with something like UKSM or PKSM.
> > > 
> > > Instead, lets implement a sysfs knob, which allows marking VMAs as
> > > mergeable. This can be used manually on some task in question or by some
> > > small userspace helper daemon.
> > > 
> > > The knob is named "force_madvise", and it is write-only. It accepts a PID
> > > to act on. To mark the VMAs as mergeable, use:
> > > 
> > >    # echo PID > /sys/kernel/mm/ksm/force_madvise
> > > 
> > > To unmerge all the VMAs, use the same approach, prepending the PID with
> > > the "minus" sign:
> > > 
> > >    # echo -PID > /sys/kernel/mm/ksm/force_madvise
> > > 
> > > This patchset is based on earlier Timofey's submission [1], but it doesn't
> > > use dedicated kthread to walk through the list of tasks/VMAs. Instead,
> > > it is up to userspace to traverse all the tasks in /proc if needed.
> > > 
> > > The previous suggestion [2] was based on amending do_anonymous_page()
> > > handler to implement fully automatic mode, but this approach was
> > > incorrect due to improper locking and not desired due to excessive
> > > complexity.
> > > 
> > > The current approach just implements minimal interface and leaves the
> > > decision on how and when to act to userspace.
> > 
> > Please make sure to describe a usecase that warrants adding a new
> > interface we have to maintain for ever.

I think of two major consumers of this interface:

1) hosts, that run containers, especially similar ones and especially in
a trusted environment;

2) heavy applications, that can be run in multiple instances, not
limited to opensource ones like Firefox, but also those that cannot be
modified.

I'll add this justification to the cover letter once I send another
iteration of this submission if necessary.

Thank you.

> > 
> > > 
> > > Thanks.
> > > 
> > > [1] https://lore.kernel.org/patchwork/patch/1012142/
> > > [2] http://lkml.iu.edu/hypermail/linux/kernel/1905.1/02417.html
> > > 
> > > Oleksandr Natalenko (4):
> > >   mm/ksm: introduce ksm_enter() helper
> > >   mm/ksm: introduce ksm_leave() helper
> > >   mm/ksm: introduce force_madvise knob
> > >   mm/ksm: add force merging/unmerging documentation
> > > 
> > >  Documentation/admin-guide/mm/ksm.rst |  11 ++
> > >  mm/ksm.c                             | 160 +++++++++++++++++++++------
> > >  2 files changed, 137 insertions(+), 34 deletions(-)
> > > 
> > > -- 
> > > 2.21.0
> > > 
> > 
> > -- 
> > Michal Hocko
> > SUSE Labs
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation
  2019-05-15  0:53   ` Timofey Titovets
@ 2019-05-15  6:26     ` Oleksandr Natalenko
  0 siblings, 0 replies; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-15  6:26 UTC (permalink / raw)
  To: Timofey Titovets
  Cc: Linux Kernel, Kirill Tkhai, Vlastimil Babka, Michal Hocko,
	Matthew Wilcox, Pavel Tatashin, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

Hi.

On Wed, May 15, 2019 at 03:53:55AM +0300, Timofey Titovets wrote:
> LGTM for whole series
> 
> Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>
> 
> вт, 14 мая 2019 г. в 16:17, Oleksandr Natalenko <oleksandr@redhat.com>:
> >
> > Document respective sysfs knob.
> >
> > Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> > ---
> >  Documentation/admin-guide/mm/ksm.rst | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
> > index 9303786632d1..4302b92910ec 100644
> > --- a/Documentation/admin-guide/mm/ksm.rst
> > +++ b/Documentation/admin-guide/mm/ksm.rst
> > @@ -78,6 +78,17 @@ KSM daemon sysfs interface
> >  The KSM daemon is controlled by sysfs files in ``/sys/kernel/mm/ksm/``,
> >  readable by all but writable only by root:
> >
> > +force_madvise
> > +        write-only control to force merging/unmerging for specific
> > +        task.
> > +
> > +        To mark the VMAs as mergeable, use:
> > +        ``echo PID > /sys/kernel/mm/ksm/force_madvise``
> > +
> > +        To unmerge all the VMAs, use:
> > +        ``echo -PID > /sys/kernel/mm/ksm/force_madvise``
> > +        (note the prepending "minus")
> > +
> In patch 3/4 you have special case with PID 0,
> may be that also must be documented here?

Thanks for the review. Yes, this is a valid point, I'll document it too.

> 
> >  pages_to_scan
> >          how many pages to scan before ksmd goes to sleep
> >          e.g. ``echo 100 > /sys/kernel/mm/ksm/pages_to_scan``.
> > --
> > 2.21.0
> >
> 
> 
> --
> Have a nice day,
> Timofey.

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15  6:25     ` Oleksandr Natalenko
@ 2019-05-15  6:53       ` Michal Hocko
  2019-05-15  7:37         ` Oleksandr Natalenko
  2019-05-15 14:51         ` Michal Hocko
  0 siblings, 2 replies; 21+ messages in thread
From: Michal Hocko @ 2019-05-15  6:53 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

On Wed 15-05-19 08:25:23, Oleksandr Natalenko wrote:
[...]
> > > Please make sure to describe a usecase that warrants adding a new
> > > interface we have to maintain for ever.
> 
> I think of two major consumers of this interface:
> 
> 1) hosts, that run containers, especially similar ones and especially in
> a trusted environment;
> 
> 2) heavy applications, that can be run in multiple instances, not
> limited to opensource ones like Firefox, but also those that cannot be
> modified.

This is way too generic. Please provide something more specific. Ideally
with numbers. Why those usecases cannot use an existing interfaces.
Remember you are trying to add a new user interface which we will have
to maintain for ever.

I will try to comment on the interface itself later. But I have to say
that I am not impressed. Abusing sysfs for per process features is quite
gross to be honest.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15  6:53       ` Michal Hocko
@ 2019-05-15  7:37         ` Oleksandr Natalenko
  2019-05-15  8:33           ` Michal Hocko
  2019-05-15 14:51         ` Michal Hocko
  1 sibling, 1 reply; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-15  7:37 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

Hi.

On Wed, May 15, 2019 at 08:53:11AM +0200, Michal Hocko wrote:
> On Wed 15-05-19 08:25:23, Oleksandr Natalenko wrote:
> [...]
> > > > Please make sure to describe a usecase that warrants adding a new
> > > > interface we have to maintain for ever.
> > 
> > I think of two major consumers of this interface:
> > 
> > 1) hosts, that run containers, especially similar ones and especially in
> > a trusted environment;
> > 
> > 2) heavy applications, that can be run in multiple instances, not
> > limited to opensource ones like Firefox, but also those that cannot be
> > modified.
> 
> This is way too generic. Please provide something more specific. Ideally
> with numbers. Why those usecases cannot use an existing interfaces.
> Remember you are trying to add a new user interface which we will have
> to maintain for ever.

For my current setup with 2 Firefox instances I get 100 to 200 MiB saved
for the second instance depending on the amount of tabs.

1 FF instance with 15 tabs:

$ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
410

2 FF instances, second one has 12 tabs (all the tabs are different):

$ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
592

At the very moment I do not have specific numbers for containerised
workload, but those should be similar in case the containers share
similar/same runtime (like multiple Node.js containers etc).

Answering your question regarding using existing interfaces, since
there's only one, madvise(2), this requires modifying all the
applications one wants to de-duplicate. In case of containers with
arbitrary content or in case of binary-only apps this is pretty hard if
not impossible to do properly.

> I will try to comment on the interface itself later. But I have to say
> that I am not impressed. Abusing sysfs for per process features is quite
> gross to be honest.

Sure, please do.

Thanks for your time and inputs.

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15  7:37         ` Oleksandr Natalenko
@ 2019-05-15  8:33           ` Michal Hocko
  2019-05-15  8:51             ` Oleksandr Natalenko
  0 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2019-05-15  8:33 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

On Wed 15-05-19 09:37:23, Oleksandr Natalenko wrote:
[...]
> > This is way too generic. Please provide something more specific. Ideally
> > with numbers. Why those usecases cannot use an existing interfaces.
> > Remember you are trying to add a new user interface which we will have
> > to maintain for ever.
> 
> For my current setup with 2 Firefox instances I get 100 to 200 MiB saved
> for the second instance depending on the amount of tabs.

What does prevent Firefox (an opensource project) to be updated to use
the explicit merging?

[...]

> Answering your question regarding using existing interfaces, since
> there's only one, madvise(2), this requires modifying all the
> applications one wants to de-duplicate. In case of containers with
> arbitrary content or in case of binary-only apps this is pretty hard if
> not impossible to do properly.

OK, this makes more sense. Please note that there are other people who
would like to see certain madvise operations to be done on a remote
process - e.g. to allow external memory management (Android would like
to control memory aging so something like MADV_DONTNEED without loosing
content and more probably) and potentially other madvise operations.
Or maybe we need a completely new interface other than madvise.

In general, having a more generic API that would cover more usecases is
definitely much more preferable than one ad-hoc API that handles a very
specific usecase. So please try to think about a more generic
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15  8:33           ` Michal Hocko
@ 2019-05-15  8:51             ` Oleksandr Natalenko
  2019-05-15 14:24               ` Michal Hocko
  0 siblings, 1 reply; 21+ messages in thread
From: Oleksandr Natalenko @ 2019-05-15  8:51 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

On Wed, May 15, 2019 at 10:33:21AM +0200, Michal Hocko wrote:
> > For my current setup with 2 Firefox instances I get 100 to 200 MiB saved
> > for the second instance depending on the amount of tabs.
> 
> What does prevent Firefox (an opensource project) to be updated to use
> the explicit merging?

This was rather an example of a big project. Other big projects may be
closed source, of course.

And yes, with regard to FF specifically I think nothing prevents it from
being modified appropriately.

> > Answering your question regarding using existing interfaces, since
> > there's only one, madvise(2), this requires modifying all the
> > applications one wants to de-duplicate. In case of containers with
> > arbitrary content or in case of binary-only apps this is pretty hard if
> > not impossible to do properly.
> 
> OK, this makes more sense. Please note that there are other people who
> would like to see certain madvise operations to be done on a remote
> process - e.g. to allow external memory management (Android would like
> to control memory aging so something like MADV_DONTNEED without loosing
> content and more probably) and potentially other madvise operations.
> Or maybe we need a completely new interface other than madvise.

I didn't know about those intentions. Could you please point me to a
relevant discussion so that I can check the details?

> In general, having a more generic API that would cover more usecases is
> definitely much more preferable than one ad-hoc API that handles a very
> specific usecase. So please try to think about a more generic

Yup, I see now. Since you are aware of ongoing intentions, please do Cc
those people then and/or let me know about previous discussions please.
That way thinking of how a new API should be implemented (be it a sysfs
file or something else) should be easier and more visible.

Thanks.

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15  8:51             ` Oleksandr Natalenko
@ 2019-05-15 14:24               ` Michal Hocko
  0 siblings, 0 replies; 21+ messages in thread
From: Michal Hocko @ 2019-05-15 14:24 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins

On Wed 15-05-19 10:51:58, Oleksandr Natalenko wrote:
> On Wed, May 15, 2019 at 10:33:21AM +0200, Michal Hocko wrote:
> > > For my current setup with 2 Firefox instances I get 100 to 200 MiB saved
> > > for the second instance depending on the amount of tabs.
> > 
> > What does prevent Firefox (an opensource project) to be updated to use
> > the explicit merging?
> 
> This was rather an example of a big project. Other big projects may be
> closed source, of course.

Again, specific examples are usually considered a much better
justification than "something might use the feature".

[...]

> > OK, this makes more sense. Please note that there are other people who
> > would like to see certain madvise operations to be done on a remote
> > process - e.g. to allow external memory management (Android would like
> > to control memory aging so something like MADV_DONTNEED without loosing
> > content and more probably) and potentially other madvise operations.
> > Or maybe we need a completely new interface other than madvise.
> 
> I didn't know about those intentions. Could you please point me to a
> relevant discussion so that I can check the details?

I am sorry I do not have any specific links to patches under discussion.
We have discussed that topic at LSFMM this year
(https://lwn.net/Articles/787217/) and Google guys should be sending
something soon.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15  6:53       ` Michal Hocko
  2019-05-15  7:37         ` Oleksandr Natalenko
@ 2019-05-15 14:51         ` Michal Hocko
  2019-05-15 15:15           ` Greg KH
  1 sibling, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2019-05-15 14:51 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: linux-kernel, Kirill Tkhai, Vlastimil Babka, Matthew Wilcox,
	Pavel Tatashin, Timofey Titovets, Aaron Tomlin, Grzegorz Halat,
	linux-mm, linux-api, Hugh Dickins, Suren Baghdasaryan,
	Minchan Kim

[Cc Suren and Minchan - the email thread starts here 20190514131654.25463-1-oleksandr@redhat.com]

On Wed 15-05-19 08:53:11, Michal Hocko wrote:
[...]
> I will try to comment on the interface itself later. But I have to say
> that I am not impressed. Abusing sysfs for per process features is quite
> gross to be honest.

I have already commented on this in other email. I consider sysfs an
unsuitable interface for per-process API. Not to mention this particular
one is very KSM specific while the question about setting different
hints on memory of a remote process is a more generic question. As
already mentioned there are usecases where people would like to say
that a certain memory is cold from outside of the process context (e.g.
monitor application). So essentially a form of a user space memory
management. And this usecase sounds a bit similar to me and having a
common api sounds more sensible to me.

One thing we were discussing at LSFMM this year was a way to either
provide madvise_remote(pid, addr, length, advice) or a fadvise
alternative over /proc/<pid>/map_vmas/<range> file descriptors
(essentially resembling the existing map_files api) to achieve such a
functionality. This is still a very rough idea but the api would sound
much more generic to me and it would allow much wider range of usecases.

But maybe I am completely wrong and this is just opens a can of worms
that we do not want.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15 14:51         ` Michal Hocko
@ 2019-05-15 15:15           ` Greg KH
  2019-05-16  7:47             ` Michal Hocko
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2019-05-15 15:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Oleksandr Natalenko, linux-kernel, Kirill Tkhai, Vlastimil Babka,
	Matthew Wilcox, Pavel Tatashin, Timofey Titovets, Aaron Tomlin,
	Grzegorz Halat, linux-mm, linux-api, Hugh Dickins,
	Suren Baghdasaryan, Minchan Kim

On Wed, May 15, 2019 at 04:51:51PM +0200, Michal Hocko wrote:
> [Cc Suren and Minchan - the email thread starts here 20190514131654.25463-1-oleksandr@redhat.com]
> 
> On Wed 15-05-19 08:53:11, Michal Hocko wrote:
> [...]
> > I will try to comment on the interface itself later. But I have to say
> > that I am not impressed. Abusing sysfs for per process features is quite
> > gross to be honest.
> 
> I have already commented on this in other email. I consider sysfs an
> unsuitable interface for per-process API.

Wait, what?  A new sysfs file/directory per process?  That's crazy, no
one must have benchmarked it :)

And I agree, sysfs is not for that, please don't abuse it.  Proc is for
process apis.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-15 15:15           ` Greg KH
@ 2019-05-16  7:47             ` Michal Hocko
  2019-05-16  7:53               ` Greg KH
  0 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2019-05-16  7:47 UTC (permalink / raw)
  To: Greg KH
  Cc: Oleksandr Natalenko, linux-kernel, Kirill Tkhai, Vlastimil Babka,
	Matthew Wilcox, Pavel Tatashin, Timofey Titovets, Aaron Tomlin,
	Grzegorz Halat, linux-mm, linux-api, Hugh Dickins,
	Suren Baghdasaryan, Minchan Kim

On Wed 15-05-19 17:15:57, Greg KH wrote:
> On Wed, May 15, 2019 at 04:51:51PM +0200, Michal Hocko wrote:
> > [Cc Suren and Minchan - the email thread starts here 20190514131654.25463-1-oleksandr@redhat.com]
> > 
> > On Wed 15-05-19 08:53:11, Michal Hocko wrote:
> > [...]
> > > I will try to comment on the interface itself later. But I have to say
> > > that I am not impressed. Abusing sysfs for per process features is quite
> > > gross to be honest.
> > 
> > I have already commented on this in other email. I consider sysfs an
> > unsuitable interface for per-process API.
> 
> Wait, what?  A new sysfs file/directory per process?  That's crazy, no
> one must have benchmarked it :)

Just to clarify, that was not a per process file but rather per process API.
Essentially echo $PID > $SYSFS_SPECIAL_FILE

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs
  2019-05-16  7:47             ` Michal Hocko
@ 2019-05-16  7:53               ` Greg KH
  0 siblings, 0 replies; 21+ messages in thread
From: Greg KH @ 2019-05-16  7:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Oleksandr Natalenko, linux-kernel, Kirill Tkhai, Vlastimil Babka,
	Matthew Wilcox, Pavel Tatashin, Timofey Titovets, Aaron Tomlin,
	Grzegorz Halat, linux-mm, linux-api, Hugh Dickins,
	Suren Baghdasaryan, Minchan Kim

On Thu, May 16, 2019 at 09:47:13AM +0200, Michal Hocko wrote:
> On Wed 15-05-19 17:15:57, Greg KH wrote:
> > On Wed, May 15, 2019 at 04:51:51PM +0200, Michal Hocko wrote:
> > > [Cc Suren and Minchan - the email thread starts here 20190514131654.25463-1-oleksandr@redhat.com]
> > > 
> > > On Wed 15-05-19 08:53:11, Michal Hocko wrote:
> > > [...]
> > > > I will try to comment on the interface itself later. But I have to say
> > > > that I am not impressed. Abusing sysfs for per process features is quite
> > > > gross to be honest.
> > > 
> > > I have already commented on this in other email. I consider sysfs an
> > > unsuitable interface for per-process API.
> > 
> > Wait, what?  A new sysfs file/directory per process?  That's crazy, no
> > one must have benchmarked it :)
> 
> Just to clarify, that was not a per process file but rather per process API.
> Essentially echo $PID > $SYSFS_SPECIAL_FILE

Ick, no, that's not ok either.  sysfs files are not a replacement for
syscalls :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2019-05-16  7:53 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-14 13:16 [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Oleksandr Natalenko
2019-05-14 13:16 ` [PATCH RFC v2 1/4] mm/ksm: introduce ksm_enter() helper Oleksandr Natalenko
2019-05-14 13:16 ` [PATCH RFC v2 2/4] mm/ksm: introduce ksm_leave() helper Oleksandr Natalenko
2019-05-14 13:16 ` [PATCH RFC v2 3/4] mm/ksm: introduce force_madvise knob Oleksandr Natalenko
2019-05-14 13:22   ` Aaron Tomlin
2019-05-15  0:48     ` Timofey Titovets
2019-05-14 13:16 ` [PATCH RFC v2 4/4] mm/ksm: add force merging/unmerging documentation Oleksandr Natalenko
2019-05-15  0:53   ` Timofey Titovets
2019-05-15  6:26     ` Oleksandr Natalenko
2019-05-14 14:41 ` [PATCH RFC v2 0/4] mm/ksm: add option to automerge VMAs Michal Hocko
2019-05-14 14:51   ` Michal Hocko
2019-05-15  6:25     ` Oleksandr Natalenko
2019-05-15  6:53       ` Michal Hocko
2019-05-15  7:37         ` Oleksandr Natalenko
2019-05-15  8:33           ` Michal Hocko
2019-05-15  8:51             ` Oleksandr Natalenko
2019-05-15 14:24               ` Michal Hocko
2019-05-15 14:51         ` Michal Hocko
2019-05-15 15:15           ` Greg KH
2019-05-16  7:47             ` Michal Hocko
2019-05-16  7:53               ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).