linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc
@ 2019-06-09 10:08 Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 1/6] proc: use down_read_killable mmap_sem for /proc/pid/maps Konstantin Khlebnikov
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:08 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

v1:
https://lore.kernel.org/lkml/155790967258.1319.11531787078240675602.stgit@buzz/

v1 "mm: use down_read_killable for locking mmap_sem in access_remote_vm"
https://lore.kernel.org/lkml/155790847881.2798.7160461383704600177.stgit@buzz/

changes since v1:
* update comments and collect acks/reviews.

---

Konstantin Khlebnikov (6):
      proc: use down_read_killable mmap_sem for /proc/pid/maps
      proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
      proc: use down_read_killable mmap_sem for /proc/pid/pagemap
      proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
      proc: use down_read_killable mmap_sem for /proc/pid/map_files
      mm: use down_read_killable for locking mmap_sem in access_remote_vm


 fs/proc/base.c       |   27 +++++++++++++++++++++------
 fs/proc/task_mmu.c   |   23 ++++++++++++++++++-----
 fs/proc/task_nommu.c |    6 +++++-
 mm/memory.c          |    4 +++-
 mm/nommu.c           |    3 ++-
 5 files changed, 49 insertions(+), 14 deletions(-)

--
Signature

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/6] proc: use down_read_killable mmap_sem for /proc/pid/maps
  2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
@ 2019-06-09 10:08 ` Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 2/6] proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup Konstantin Khlebnikov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:08 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

Do not stuck forever if something wrong.
Killable lock allows to cleanup stuck tasks and simplifies investigation.

This function also used for /proc/pid/smaps.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 fs/proc/task_mmu.c   |    6 +++++-
 fs/proc/task_nommu.c |    6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 01d4eb0e6bd1..2bf210229daf 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -166,7 +166,11 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 	if (!mm || !mmget_not_zero(mm))
 		return NULL;
 
-	down_read(&mm->mmap_sem);
+	if (down_read_killable(&mm->mmap_sem)) {
+		mmput(mm);
+		return ERR_PTR(-EINTR);
+	}
+
 	hold_task_mempolicy(priv);
 	priv->tail_vma = get_gate_vma(mm);
 
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 36bf0f2e102e..7907e6419e57 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -211,7 +211,11 @@ static void *m_start(struct seq_file *m, loff_t *pos)
 	if (!mm || !mmget_not_zero(mm))
 		return NULL;
 
-	down_read(&mm->mmap_sem);
+	if (down_read_killable(&mm->mmap_sem)) {
+		mmput(mm);
+		return ERR_PTR(-EINTR);
+	}
+
 	/* start from the Nth VMA */
 	for (p = rb_first(&mm->mm_rb); p; p = rb_next(p))
 		if (n-- == 0)


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/6] proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
  2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 1/6] proc: use down_read_killable mmap_sem for /proc/pid/maps Konstantin Khlebnikov
@ 2019-06-09 10:08 ` Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 3/6] proc: use down_read_killable mmap_sem for /proc/pid/pagemap Konstantin Khlebnikov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:08 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

Do not stuck forever if something wrong.
Killable lock allows to cleanup stuck tasks and simplifies investigation.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 fs/proc/task_mmu.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2bf210229daf..781879a91e3b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -832,7 +832,10 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 
 	memset(&mss, 0, sizeof(mss));
 
-	down_read(&mm->mmap_sem);
+	ret = down_read_killable(&mm->mmap_sem);
+	if (ret)
+		goto out_put_mm;
+
 	hold_task_mempolicy(priv);
 
 	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
@@ -849,8 +852,9 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 
 	release_task_mempolicy(priv);
 	up_read(&mm->mmap_sem);
-	mmput(mm);
 
+out_put_mm:
+	mmput(mm);
 out_put_task:
 	put_task_struct(priv->task);
 	priv->task = NULL;


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/6] proc: use down_read_killable mmap_sem for /proc/pid/pagemap
  2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 1/6] proc: use down_read_killable mmap_sem for /proc/pid/maps Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 2/6] proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup Konstantin Khlebnikov
@ 2019-06-09 10:08 ` Konstantin Khlebnikov
  2019-06-09 10:08 ` [PATCH v2 4/6] proc: use down_read_killable mmap_sem for /proc/pid/clear_refs Konstantin Khlebnikov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:08 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

Do not stuck forever if something wrong.
Killable lock allows to cleanup stuck tasks and simplifies investigation.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 fs/proc/task_mmu.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 781879a91e3b..78bed6adc62d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1547,7 +1547,9 @@ static ssize_t pagemap_read(struct file *file, char __user *buf,
 		/* overflow ? */
 		if (end < start_vaddr || end > end_vaddr)
 			end = end_vaddr;
-		down_read(&mm->mmap_sem);
+		ret = down_read_killable(&mm->mmap_sem);
+		if (ret)
+			goto out_free;
 		ret = walk_page_range(start_vaddr, end, &pagemap_walk);
 		up_read(&mm->mmap_sem);
 		start_vaddr = end;


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 4/6] proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
  2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2019-06-09 10:08 ` [PATCH v2 3/6] proc: use down_read_killable mmap_sem for /proc/pid/pagemap Konstantin Khlebnikov
@ 2019-06-09 10:08 ` Konstantin Khlebnikov
  2019-06-09 10:09 ` [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files Konstantin Khlebnikov
  2019-06-09 10:09 ` [PATCH v2 6/6] mm: use down_read_killable for locking mmap_sem in access_remote_vm Konstantin Khlebnikov
  5 siblings, 0 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:08 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

Do not stuck forever if something wrong.
Killable lock allows to cleanup stuck tasks and simplifies investigation.

Replace the only unkillable mmap_sem lock in clear_refs_write.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 fs/proc/task_mmu.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 78bed6adc62d..7f84d1477b5b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1140,7 +1140,10 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 			goto out_mm;
 		}
 
-		down_read(&mm->mmap_sem);
+		if (down_read_killable(&mm->mmap_sem)) {
+			count = -EINTR;
+			goto out_mm;
+		}
 		tlb_gather_mmu(&tlb, mm, 0, -1);
 		if (type == CLEAR_REFS_SOFT_DIRTY) {
 			for (vma = mm->mmap; vma; vma = vma->vm_next) {


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files
  2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
                   ` (3 preceding siblings ...)
  2019-06-09 10:08 ` [PATCH v2 4/6] proc: use down_read_killable mmap_sem for /proc/pid/clear_refs Konstantin Khlebnikov
@ 2019-06-09 10:09 ` Konstantin Khlebnikov
  2019-06-12 23:14   ` Andrei Vagin
  2019-06-09 10:09 ` [PATCH v2 6/6] mm: use down_read_killable for locking mmap_sem in access_remote_vm Konstantin Khlebnikov
  5 siblings, 1 reply; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:09 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

Do not stuck forever if something wrong.
Killable lock allows to cleanup stuck tasks and simplifies investigation.

It seems ->d_revalidate() could return any error (except ECHILD) to
abort validation and pass error as result of lookup sequence.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 fs/proc/base.c |   27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9c8ca6cd3ce4..515ab29c2adf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1962,9 +1962,12 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags)
 		goto out;
 
 	if (!dname_to_vma_addr(dentry, &vm_start, &vm_end)) {
-		down_read(&mm->mmap_sem);
-		exact_vma_exists = !!find_exact_vma(mm, vm_start, vm_end);
-		up_read(&mm->mmap_sem);
+		status = down_read_killable(&mm->mmap_sem);
+		if (!status) {
+			exact_vma_exists = !!find_exact_vma(mm, vm_start,
+							    vm_end);
+			up_read(&mm->mmap_sem);
+		}
 	}
 
 	mmput(mm);
@@ -2010,8 +2013,11 @@ static int map_files_get_link(struct dentry *dentry, struct path *path)
 	if (rc)
 		goto out_mmput;
 
+	rc = down_read_killable(&mm->mmap_sem);
+	if (rc)
+		goto out_mmput;
+
 	rc = -ENOENT;
-	down_read(&mm->mmap_sem);
 	vma = find_exact_vma(mm, vm_start, vm_end);
 	if (vma && vma->vm_file) {
 		*path = vma->vm_file->f_path;
@@ -2107,7 +2113,10 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
 	if (!mm)
 		goto out_put_task;
 
-	down_read(&mm->mmap_sem);
+	result = ERR_PTR(-EINTR);
+	if (down_read_killable(&mm->mmap_sem))
+		goto out_put_mm;
+
 	vma = find_exact_vma(mm, vm_start, vm_end);
 	if (!vma)
 		goto out_no_vma;
@@ -2118,6 +2127,7 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
 
 out_no_vma:
 	up_read(&mm->mmap_sem);
+out_put_mm:
 	mmput(mm);
 out_put_task:
 	put_task_struct(task);
@@ -2160,7 +2170,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
 	mm = get_task_mm(task);
 	if (!mm)
 		goto out_put_task;
-	down_read(&mm->mmap_sem);
+
+	ret = down_read_killable(&mm->mmap_sem);
+	if (ret) {
+		mmput(mm);
+		goto out_put_task;
+	}
 
 	nr_files = 0;
 


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 6/6] mm: use down_read_killable for locking mmap_sem in access_remote_vm
  2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
                   ` (4 preceding siblings ...)
  2019-06-09 10:09 ` [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files Konstantin Khlebnikov
@ 2019-06-09 10:09 ` Konstantin Khlebnikov
  5 siblings, 0 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-09 10:09 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Matthew Wilcox, Michal Hocko, Cyrill Gorcunov,
	Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin

This function is used by ptrace and proc files like /proc/pid/cmdline and
/proc/pid/environ.

Access_remote_vm never returns error codes, all errors are ignored and
only size of successfully read data is returned. So, if current task was
killed we'll simply return 0 (bytes read).

Mmap_sem could be locked for a long time or forever if something wrong.
Killable lock allows to cleanup stuck tasks and simplifies investigation.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory.c |    4 +++-
 mm/nommu.c  |    3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ddf20bd0c317..9a4401d21e94 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4349,7 +4349,9 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 	void *old_buf = buf;
 	int write = gup_flags & FOLL_WRITE;
 
-	down_read(&mm->mmap_sem);
+	if (down_read_killable(&mm->mmap_sem))
+		return 0;
+
 	/* ignore errors, just check how much was successfully transferred */
 	while (len) {
 		int bytes, ret, offset;
diff --git a/mm/nommu.c b/mm/nommu.c
index d8c02fbe03b5..b2823519f8cd 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1792,7 +1792,8 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 	struct vm_area_struct *vma;
 	int write = gup_flags & FOLL_WRITE;
 
-	down_read(&mm->mmap_sem);
+	if (down_read_killable(&mm->mmap_sem))
+		return 0;
 
 	/* the access must start within one of the target process's mappings */
 	vma = find_vma(mm, addr);


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files
  2019-06-09 10:09 ` [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files Konstantin Khlebnikov
@ 2019-06-12 23:14   ` Andrei Vagin
  2019-06-13  0:04     ` Andrew Morton
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Andrei Vagin @ 2019-06-12 23:14 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Oleg Nesterov,
	Matthew Wilcox, Michal Hocko, Cyrill Gorcunov, Kirill Tkhai,
	Michal Koutný,
	Al Viro, Roman Gushchin, Dmitry Safonov

On Sun, Jun 09, 2019 at 01:09:00PM +0300, Konstantin Khlebnikov wrote:
> Do not stuck forever if something wrong.
> Killable lock allows to cleanup stuck tasks and simplifies investigation.

This patch breaks the CRIU project, because stat() returns EINTR instead
of ENOENT:

[root@fc24 criu]# stat /proc/self/map_files/0-0
stat: cannot stat '/proc/self/map_files/0-0': Interrupted system call

Here is one inline comment with the fix for this issue.

> 
> It seems ->d_revalidate() could return any error (except ECHILD) to
> abort validation and pass error as result of lookup sequence.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Reviewed-by: Roman Gushchin <guro@fb.com>
> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

It was nice to see all four of you in one place :).

> Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  fs/proc/base.c |   27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 9c8ca6cd3ce4..515ab29c2adf 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1962,9 +1962,12 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags)
>  		goto out;
>  
>  	if (!dname_to_vma_addr(dentry, &vm_start, &vm_end)) {
> -		down_read(&mm->mmap_sem);
> -		exact_vma_exists = !!find_exact_vma(mm, vm_start, vm_end);
> -		up_read(&mm->mmap_sem);
> +		status = down_read_killable(&mm->mmap_sem);
> +		if (!status) {
> +			exact_vma_exists = !!find_exact_vma(mm, vm_start,
> +							    vm_end);
> +			up_read(&mm->mmap_sem);
> +		}
>  	}
>  
>  	mmput(mm);
> @@ -2010,8 +2013,11 @@ static int map_files_get_link(struct dentry *dentry, struct path *path)
>  	if (rc)
>  		goto out_mmput;
>  
> +	rc = down_read_killable(&mm->mmap_sem);
> +	if (rc)
> +		goto out_mmput;
> +
>  	rc = -ENOENT;
> -	down_read(&mm->mmap_sem);
>  	vma = find_exact_vma(mm, vm_start, vm_end);
>  	if (vma && vma->vm_file) {
>  		*path = vma->vm_file->f_path;
> @@ -2107,7 +2113,10 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
>  	if (!mm)
>  		goto out_put_task;
>  
> -	down_read(&mm->mmap_sem);
> +	result = ERR_PTR(-EINTR);
> +	if (down_read_killable(&mm->mmap_sem))
> +		goto out_put_mm;
> +

	result = ERR_PTR(-ENOENT);

>  	vma = find_exact_vma(mm, vm_start, vm_end);
>  	if (!vma)
>  		goto out_no_vma;
> @@ -2118,6 +2127,7 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
>  
>  out_no_vma:
>  	up_read(&mm->mmap_sem);
> +out_put_mm:
>  	mmput(mm);
>  out_put_task:
>  	put_task_struct(task);
> @@ -2160,7 +2170,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
>  	mm = get_task_mm(task);
>  	if (!mm)
>  		goto out_put_task;
> -	down_read(&mm->mmap_sem);
> +
> +	ret = down_read_killable(&mm->mmap_sem);
> +	if (ret) {
> +		mmput(mm);
> +		goto out_put_task;
> +	}
>  
>  	nr_files = 0;
>  

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files
  2019-06-12 23:14   ` Andrei Vagin
@ 2019-06-13  0:04     ` Andrew Morton
  2019-06-13  6:48     ` Cyrill Gorcunov
  2019-06-13  8:15     ` Konstantin Khlebnikov
  2 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2019-06-13  0:04 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Konstantin Khlebnikov, linux-mm, linux-kernel, Oleg Nesterov,
	Matthew Wilcox, Michal Hocko, Cyrill Gorcunov, Kirill Tkhai,
	Michal Koutný,
	Al Viro, Roman Gushchin, Dmitry Safonov

On Wed, 12 Jun 2019 16:14:28 -0700 Andrei Vagin <avagin@gmail.com> wrote:

> On Sun, Jun 09, 2019 at 01:09:00PM +0300, Konstantin Khlebnikov wrote:
> > Do not stuck forever if something wrong.
> > Killable lock allows to cleanup stuck tasks and simplifies investigation.
> 
> This patch breaks the CRIU project, because stat() returns EINTR instead
> of ENOENT:
> 
> [root@fc24 criu]# stat /proc/self/map_files/0-0
> stat: cannot stat '/proc/self/map_files/0-0': Interrupted system call
> 
> Here is one inline comment with the fix for this issue.
> 
> > @@ -2107,7 +2113,10 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
> >  	if (!mm)
> >  		goto out_put_task;
> >  
> > -	down_read(&mm->mmap_sem);
> > +	result = ERR_PTR(-EINTR);
> > +	if (down_read_killable(&mm->mmap_sem))
> > +		goto out_put_mm;
> > +
> 
> 	result = ERR_PTR(-ENOENT);
> 

yes, thanks.

--- a/fs/proc/base.c~proc-use-down_read_killable-mmap_sem-for-proc-pid-map_files-fix
+++ a/fs/proc/base.c
@@ -2117,6 +2117,7 @@ static struct dentry *proc_map_files_loo
 	if (down_read_killable(&mm->mmap_sem))
 		goto out_put_mm;
 
+	result = ERR_PTR(-ENOENT);
 	vma = find_exact_vma(mm, vm_start, vm_end);
 	if (!vma)
 		goto out_no_vma;



We started doing this trick of using

	ret = -EFOO;
	if (something)
		goto out;

decades ago because it generated slightly better code.  I rather doubt
whether that's still the case.

In fact this:

--- a/fs/proc/base.c~a
+++ a/fs/proc/base.c
@@ -2096,35 +2096,45 @@ static struct dentry *proc_map_files_loo
 	struct dentry *result;
 	struct mm_struct *mm;
 
-	result = ERR_PTR(-ENOENT);
 	task = get_proc_task(dir);
-	if (!task)
+	if (!task) {
+		result = ERR_PTR(-ENOENT);
 		goto out;
+	}
 
-	result = ERR_PTR(-EACCES);
-	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
+	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) {
+		result = ERR_PTR(-EACCES);
 		goto out_put_task;
+	}
 
-	result = ERR_PTR(-ENOENT);
-	if (dname_to_vma_addr(dentry, &vm_start, &vm_end))
+	if (dname_to_vma_addr(dentry, &vm_start, &vm_end)) {
+		result = ERR_PTR(-ENOENT);
 		goto out_put_task;
+	}
 
 	mm = get_task_mm(task);
-	if (!mm)
+	if (!mm) {
+		result = ERR_PTR(-ENOENT);
 		goto out_put_task;
+	}
 
-	result = ERR_PTR(-EINTR);
-	if (down_read_killable(&mm->mmap_sem))
+	if (down_read_killable(&mm->mmap_sem)) {
+		result = ERR_PTR(-EINTR);
 		goto out_put_mm;
+	}
 
-	result = ERR_PTR(-ENOENT);
 	vma = find_exact_vma(mm, vm_start, vm_end);
-	if (!vma)
+	if (!vma) {
+		result = ERR_PTR(-ENOENT);
 		goto out_no_vma;
+	}
 
-	if (vma->vm_file)
+	if (vma->vm_file) {
 		result = proc_map_files_instantiate(dentry, task,
 				(void *)(unsigned long)vma->vm_file->f_mode);
+	} else {
+		result = ERR_PTR(-ENOENT);
+	}
 
 out_no_vma:
 	up_read(&mm->mmap_sem);

makes no change to the generated assembly with gcc-7.2.0.

And I do think that the above style is clearer and more reliable, as
this bug demonstrates.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files
  2019-06-12 23:14   ` Andrei Vagin
  2019-06-13  0:04     ` Andrew Morton
@ 2019-06-13  6:48     ` Cyrill Gorcunov
  2019-06-13  8:15     ` Konstantin Khlebnikov
  2 siblings, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2019-06-13  6:48 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Oleg Nesterov, Matthew Wilcox, Michal Hocko, Kirill Tkhai,
	Michal Koutný,
	Al Viro, Roman Gushchin, Dmitry Safonov

On Wed, Jun 12, 2019 at 04:14:28PM -0700, Andrei Vagin wrote:
> On Sun, Jun 09, 2019 at 01:09:00PM +0300, Konstantin Khlebnikov wrote:
> > Do not stuck forever if something wrong.
> > Killable lock allows to cleanup stuck tasks and simplifies investigation.
> 
> This patch breaks the CRIU project, because stat() returns EINTR instead
> of ENOENT:
> 
> [root@fc24 criu]# stat /proc/self/map_files/0-0
> stat: cannot stat '/proc/self/map_files/0-0': Interrupted system call
> 
> Here is one inline comment with the fix for this issue.
> 
> > 
> > It seems ->d_revalidate() could return any error (except ECHILD) to
> > abort validation and pass error as result of lookup sequence.
> > 
> > Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > Reviewed-by: Roman Gushchin <guro@fb.com>
> > Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
> > Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> 
> It was nice to see all four of you in one place :).

Holymoly ;) And we all managed to miss this error code.
Thanks, Andrew!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files
  2019-06-12 23:14   ` Andrei Vagin
  2019-06-13  0:04     ` Andrew Morton
  2019-06-13  6:48     ` Cyrill Gorcunov
@ 2019-06-13  8:15     ` Konstantin Khlebnikov
  2019-06-13 20:32       ` Andrei Vagin
  2 siblings, 1 reply; 12+ messages in thread
From: Konstantin Khlebnikov @ 2019-06-13  8:15 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: linux-mm, Andrew Morton, linux-kernel, Oleg Nesterov,
	Matthew Wilcox, Michal Hocko, Cyrill Gorcunov, Kirill Tkhai,
	Michal Koutný,
	Al Viro, Roman Gushchin, Dmitry Safonov

On 13.06.2019 2:14, Andrei Vagin wrote:
> On Sun, Jun 09, 2019 at 01:09:00PM +0300, Konstantin Khlebnikov wrote:
>> Do not stuck forever if something wrong.
>> Killable lock allows to cleanup stuck tasks and simplifies investigation.
> 
> This patch breaks the CRIU project, because stat() returns EINTR instead
> of ENOENT:
> 
> [root@fc24 criu]# stat /proc/self/map_files/0-0
> stat: cannot stat '/proc/self/map_files/0-0': Interrupted system call

Good catch.

It seems CRIU tests has good coverage for darkest corners of kernel API.
Kernel CI projects should use it. I suppose you know how to promote this. =)

> 
> Here is one inline comment with the fix for this issue.
> 
>>
>> It seems ->d_revalidate() could return any error (except ECHILD) to
>> abort validation and pass error as result of lookup sequence.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Reviewed-by: Roman Gushchin <guro@fb.com>
>> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
>> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> 
> It was nice to see all four of you in one place :).
> 
>> Acked-by: Michal Hocko <mhocko@suse.com>
>> ---
>>   fs/proc/base.c |   27 +++++++++++++++++++++------
>>   1 file changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index 9c8ca6cd3ce4..515ab29c2adf 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -1962,9 +1962,12 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags)
>>   		goto out;
>>   
>>   	if (!dname_to_vma_addr(dentry, &vm_start, &vm_end)) {
>> -		down_read(&mm->mmap_sem);
>> -		exact_vma_exists = !!find_exact_vma(mm, vm_start, vm_end);
>> -		up_read(&mm->mmap_sem);
>> +		status = down_read_killable(&mm->mmap_sem);
>> +		if (!status) {
>> +			exact_vma_exists = !!find_exact_vma(mm, vm_start,
>> +							    vm_end);
>> +			up_read(&mm->mmap_sem);
>> +		}
>>   	}
>>   
>>   	mmput(mm);
>> @@ -2010,8 +2013,11 @@ static int map_files_get_link(struct dentry *dentry, struct path *path)
>>   	if (rc)
>>   		goto out_mmput;
>>   
>> +	rc = down_read_killable(&mm->mmap_sem);
>> +	if (rc)
>> +		goto out_mmput;
>> +
>>   	rc = -ENOENT;
>> -	down_read(&mm->mmap_sem);
>>   	vma = find_exact_vma(mm, vm_start, vm_end);
>>   	if (vma && vma->vm_file) {
>>   		*path = vma->vm_file->f_path;
>> @@ -2107,7 +2113,10 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
>>   	if (!mm)
>>   		goto out_put_task;
>>   
>> -	down_read(&mm->mmap_sem);
>> +	result = ERR_PTR(-EINTR);
>> +	if (down_read_killable(&mm->mmap_sem))
>> +		goto out_put_mm;
>> +
> 
> 	result = ERR_PTR(-ENOENT);
> 
>>   	vma = find_exact_vma(mm, vm_start, vm_end);
>>   	if (!vma)
>>   		goto out_no_vma;
>> @@ -2118,6 +2127,7 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
>>   
>>   out_no_vma:
>>   	up_read(&mm->mmap_sem);
>> +out_put_mm:
>>   	mmput(mm);
>>   out_put_task:
>>   	put_task_struct(task);
>> @@ -2160,7 +2170,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
>>   	mm = get_task_mm(task);
>>   	if (!mm)
>>   		goto out_put_task;
>> -	down_read(&mm->mmap_sem);
>> +
>> +	ret = down_read_killable(&mm->mmap_sem);
>> +	if (ret) {
>> +		mmput(mm);
>> +		goto out_put_task;
>> +	}
>>   
>>   	nr_files = 0;
>>   

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files
  2019-06-13  8:15     ` Konstantin Khlebnikov
@ 2019-06-13 20:32       ` Andrei Vagin
  0 siblings, 0 replies; 12+ messages in thread
From: Andrei Vagin @ 2019-06-13 20:32 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, LKML, Oleg Nesterov, Matthew Wilcox,
	Michal Hocko, Cyrill Gorcunov, Kirill Tkhai, Michal Koutný,
	Al Viro, Roman Gushchin, Dmitry Safonov, Mike Rapoport

On Thu, Jun 13, 2019 at 1:15 AM Konstantin Khlebnikov
<khlebnikov@yandex-team.ru> wrote:
>
> On 13.06.2019 2:14, Andrei Vagin wrote:
> > On Sun, Jun 09, 2019 at 01:09:00PM +0300, Konstantin Khlebnikov wrote:
> >> Do not stuck forever if something wrong.
> >> Killable lock allows to cleanup stuck tasks and simplifies investigation.
> >
> > This patch breaks the CRIU project, because stat() returns EINTR instead
> > of ENOENT:
> >
> > [root@fc24 criu]# stat /proc/self/map_files/0-0
> > stat: cannot stat '/proc/self/map_files/0-0': Interrupted system call
>
> Good catch.
>
> It seems CRIU tests has good coverage for darkest corners of kernel API.
> Kernel CI projects should use it. I suppose you know how to promote this. =)

I remember Mike was trying to add the CRIU test suite into kernel-ci,
but it looks like this ended up with nothing.

The good thing here is that we have our own kernel-ci:
https://travis-ci.org/avagin/linux/builds

Travis-CI doesn't allow to replace the kernel, so we use CRIU to
dump/restore a ssh session and travis doesn't notice when we kexec a
new kernel.

>
> >
> > Here is one inline comment with the fix for this issue.
> >
> >>
> >> It seems ->d_revalidate() could return any error (except ECHILD) to
> >> abort validation and pass error as result of lookup sequence.
> >>
> >> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> >> Reviewed-by: Roman Gushchin <guro@fb.com>
> >> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
> >> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> >
> > It was nice to see all four of you in one place :).
> >
> >> Acked-by: Michal Hocko <mhocko@suse.com>
> >> ---
> >>   fs/proc/base.c |   27 +++++++++++++++++++++------
> >>   1 file changed, 21 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/fs/proc/base.c b/fs/proc/base.c
> >> index 9c8ca6cd3ce4..515ab29c2adf 100644
> >> --- a/fs/proc/base.c
> >> +++ b/fs/proc/base.c
> >> @@ -1962,9 +1962,12 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags)
> >>              goto out;
> >>
> >>      if (!dname_to_vma_addr(dentry, &vm_start, &vm_end)) {
> >> -            down_read(&mm->mmap_sem);
> >> -            exact_vma_exists = !!find_exact_vma(mm, vm_start, vm_end);
> >> -            up_read(&mm->mmap_sem);
> >> +            status = down_read_killable(&mm->mmap_sem);
> >> +            if (!status) {
> >> +                    exact_vma_exists = !!find_exact_vma(mm, vm_start,
> >> +                                                        vm_end);
> >> +                    up_read(&mm->mmap_sem);
> >> +            }
> >>      }
> >>
> >>      mmput(mm);
> >> @@ -2010,8 +2013,11 @@ static int map_files_get_link(struct dentry *dentry, struct path *path)
> >>      if (rc)
> >>              goto out_mmput;
> >>
> >> +    rc = down_read_killable(&mm->mmap_sem);
> >> +    if (rc)
> >> +            goto out_mmput;
> >> +
> >>      rc = -ENOENT;
> >> -    down_read(&mm->mmap_sem);
> >>      vma = find_exact_vma(mm, vm_start, vm_end);
> >>      if (vma && vma->vm_file) {
> >>              *path = vma->vm_file->f_path;
> >> @@ -2107,7 +2113,10 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
> >>      if (!mm)
> >>              goto out_put_task;
> >>
> >> -    down_read(&mm->mmap_sem);
> >> +    result = ERR_PTR(-EINTR);
> >> +    if (down_read_killable(&mm->mmap_sem))
> >> +            goto out_put_mm;
> >> +
> >
> >       result = ERR_PTR(-ENOENT);
> >
> >>      vma = find_exact_vma(mm, vm_start, vm_end);
> >>      if (!vma)
> >>              goto out_no_vma;
> >> @@ -2118,6 +2127,7 @@ static struct dentry *proc_map_files_lookup(struct inode *dir,
> >>
> >>   out_no_vma:
> >>      up_read(&mm->mmap_sem);
> >> +out_put_mm:
> >>      mmput(mm);
> >>   out_put_task:
> >>      put_task_struct(task);
> >> @@ -2160,7 +2170,12 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx)
> >>      mm = get_task_mm(task);
> >>      if (!mm)
> >>              goto out_put_task;
> >> -    down_read(&mm->mmap_sem);
> >> +
> >> +    ret = down_read_killable(&mm->mmap_sem);
> >> +    if (ret) {
> >> +            mmput(mm);
> >> +            goto out_put_task;
> >> +    }
> >>
> >>      nr_files = 0;
> >>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-06-13 20:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-09 10:08 [PATCH v2 0/6] mm: use down_read_killable for locking mmap_sem in proc Konstantin Khlebnikov
2019-06-09 10:08 ` [PATCH v2 1/6] proc: use down_read_killable mmap_sem for /proc/pid/maps Konstantin Khlebnikov
2019-06-09 10:08 ` [PATCH v2 2/6] proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup Konstantin Khlebnikov
2019-06-09 10:08 ` [PATCH v2 3/6] proc: use down_read_killable mmap_sem for /proc/pid/pagemap Konstantin Khlebnikov
2019-06-09 10:08 ` [PATCH v2 4/6] proc: use down_read_killable mmap_sem for /proc/pid/clear_refs Konstantin Khlebnikov
2019-06-09 10:09 ` [PATCH v2 5/6] proc: use down_read_killable mmap_sem for /proc/pid/map_files Konstantin Khlebnikov
2019-06-12 23:14   ` Andrei Vagin
2019-06-13  0:04     ` Andrew Morton
2019-06-13  6:48     ` Cyrill Gorcunov
2019-06-13  8:15     ` Konstantin Khlebnikov
2019-06-13 20:32       ` Andrei Vagin
2019-06-09 10:09 ` [PATCH v2 6/6] mm: use down_read_killable for locking mmap_sem in access_remote_vm Konstantin Khlebnikov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).