linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Davidlohr Bueso <dave@stgolabs.net>
To: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	dave.hansen@linux.intel.com, Hugh Dickins <hughd@google.com>,
	Joe Perches <joe@perches.com>,
	sds@tycho.nsa.gov, Oleg Nesterov <oleg@redhat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Rik van Riel <riel@redhat.com>,
	mhocko@suse.cz, gang.chen.5i5j@gmail.com,
	Peter Feiner <pfeiner@google.com>,
	aarcange@redhat.com, "linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	syzkaller@googlegroups.com, Kostya Serebryany <kcc@google.com>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Sasha Levin <sasha.levin@oracle.com>,
	Manfred Spraul <manfred@colorfullife.com>
Subject: Re: GPF in shm_lock ipc
Date: Mon, 12 Oct 2015 20:18:21 -0700	[thread overview]
Message-ID: <20151013031821.GA3052@linux-uzut.site> (raw)
In-Reply-To: <20151012185533.GD3170@linux-uzut.site>

On Mon, 12 Oct 2015, Bueso wrote:

>On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
>
>>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
>>>diff --git a/ipc/shm.c b/ipc/shm.c
>>>index 4178727..9615f19 100644
>>>--- a/ipc/shm.c
>>>+++ b/ipc/shm.c
>>>@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
>>> static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>>> {
>>>-	struct shm_file_data *sfd = shm_file_data(file);
>>>+	struct file *vma_file = vma->vm_file;
>>>+	struct shm_file_data *sfd = shm_file_data(vma_file);
>>>+	struct ipc_ids *ids = &shm_ids(sfd->ns);
>>>+	struct kern_ipc_perm *shp;
>>> 	int ret;
>>>+	rcu_read_lock();
>>>+	shp = ipc_obtain_object_check(ids, sfd->id);
>>>+	if (IS_ERR(shp)) {
>>>+		ret = -EINVAL;
>>>+		goto err;
>>>+	}
>>>+
>>>+	if (!ipc_valid_object(shp)) {
>>>+		ret = -EIDRM;
>>>+		goto err;
>>>+	}
>>>+	rcu_read_unlock();
>>>+
>>
>>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
>
>Nothing, but that is later caught by shm_open() doing similar checks. We
>basically end up doing a check between ->mmap() calls, which is fair imho.
>Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
>and we try to respect it -- thus you can argue this race anywhere, which is
>why we have EIDRM/EINVL. Ultimately the user should not be doing such hacks
>_anyway_. So I'm not really concerned about it.
>
>Another similar alternative would be perhaps to make shm_lock() return an
>error, and thus propagate that error to mmap return. That way we would have
>a silent way out of the warning scenario (afterward we cannot race as we
>hold the ipc object lock). However, the users would now have to take this
>into account...
>
>     [validity check lockless]
>     ->mmap()
>     [validity check lock]

Something like this, maybe. Although I could easily be missing things...
I've tested it enough to see Dimitry's testcase handled ok, and put it
through ltp. Also adding Manfred to the Cc, who always catches my idiotic
mistakes.

8<---------------------------------------------------------------------
From: Davidlohr Bueso <dave@stgolabs.net>
Date: Mon, 12 Oct 2015 19:38:34 -0700
Subject: [PATCH] ipc/shm: fix handling of (re)attaching to a deleted segment

There are currently two issues when dealing with segments that are
marked for deletion:

(i) With d0edd8528362 (ipc: convert invalid scenarios to use WARN_ON)
we relaxed the system-wide impact of using a deleted segment. However,
we can now perfectly well trigger the warning and then deference a nil
pointer -- where shp does not exist.

(ii) As of a399b29dfbaa (ipc,shm: fix shm_file deletion races) we
forbid attaching/mapping a previously deleted segment; a feature once
unique to Linux, but removed[1] as a side effect of lockless ipc object
lookups and security checks. Similarly, Dmitry Vyukov reported[2] a
simple test case that creates a new vma for a previously deleted
segment, triggering the WARN_ON mentioned in (i).

This patch tries to address (i) by moving the shp error check out
of shm_lock() and handled by the caller instead. The benefit of this
is that it allows better handling out of situations where we end up
returning ERMID or EINVAL. Specifically, there are three callers
of shm_lock which we must look into:

  - open/close -- which we ensure to never do any operations on
                  the pairs, thus becoming no-ops if found a prev
		 IPC_RMID.

  - loosing the reference of nattch upon shmat(2) -- not feasible.

In addition, the common WARN_ON call is technically removed, but
we add a new one for the bogus shmat(2) case, which is definitely
unacceptable to race with RMID if nattch is bumped up.

To address (ii), a new shm_check_vma_validity() helper is added
(for lack of a better name), which attempts to detect early on
any races with RMID, before doing the full ->mmap. There is still
a window between the callback and the shm_open call where we can
race with IPC_RMID. If this is the case, it is handled by the next
shm_lock().

shm_mmap:
     [shm validity checks lockless]
     ->mmap()
     [shm validity checks lock] <-- at this point there after there
                                    is no race as we hold the ipc
                                    object lock.

[1] https://lkml.org/lkml/2015/10/12/483
[2] https://lkml.org/lkml/2015/10/12/284

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
  ipc/shm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
  1 file changed, 73 insertions(+), 5 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 4178727..47a7a67 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -156,11 +156,10 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
  	struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
  
  	/*
-	 * We raced in the idr lookup or with shm_destroy().  Either way, the
-	 * ID is busted.
+	 * Callers of shm_lock() must validate the status of the returned
+	 * ipc object pointer (as returned by ipc_lock()), and error out as
+	 * appropriate.
  	 */
-	WARN_ON(IS_ERR(ipcp));
-
  	return container_of(ipcp, struct shmid_kernel, shm_perm);
  }
  
@@ -194,6 +193,15 @@ static void shm_open(struct vm_area_struct *vma)
  	struct shmid_kernel *shp;
  
  	shp = shm_lock(sfd->ns, sfd->id);
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted. In the same scenario,
+	 * but for the close counter-part, the nattch counter
+	 * is never decreased, thus we can safely return.
+	 */
+	if (IS_ERR(shp))
+		return; /* no-op */
+
  	shp->shm_atim = get_seconds();
  	shp->shm_lprid = task_tgid_vnr(current);
  	shp->shm_nattch++;
@@ -218,6 +226,7 @@ static void shm_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp)
  	ns->shm_tot -= (shp->shm_segsz + PAGE_SIZE - 1) >> PAGE_SHIFT;
  	shm_rmid(ns, shp);
  	shm_unlock(shp);
+
  	if (!is_file_hugepages(shm_file))
  		shmem_lock(shm_file, 0, shp->mlock_user);
  	else if (shp->mlock_user)
@@ -258,8 +267,17 @@ static void shm_close(struct vm_area_struct *vma)
  	struct ipc_namespace *ns = sfd->ns;
  
  	down_write(&shm_ids(ns).rwsem);
-	/* remove from the list of attaches of the shm segment */
  	shp = shm_lock(ns, sfd->id);
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted. In the same scenario,
+	 * but for the open counter-part, the nattch counter
+	 * is never increased, thus we can safely return.
+	 */
+	if (IS_ERR(shp))
+		goto done; /* no-op */
+
+	/* Remove from the list of attaches of the shm segment */
  	shp->shm_lprid = task_tgid_vnr(current);
  	shp->shm_dtim = get_seconds();
  	shp->shm_nattch--;
@@ -267,6 +285,7 @@ static void shm_close(struct vm_area_struct *vma)
  		shm_destroy(ns, shp);
  	else
  		shm_unlock(shp);
+done:
  	up_write(&shm_ids(ns).rwsem);
  }
  
@@ -383,14 +402,50 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
  }
  #endif
  
+static inline int shm_check_vma_validity(struct vm_area_struct *vma)
+{
+	struct file *file = vma->vm_file;
+	struct shm_file_data *sfd = shm_file_data(file);
+	struct ipc_ids *ids = &shm_ids(sfd->ns);
+	struct kern_ipc_perm *shp;
+	int ret = 0;
+
+	rcu_read_lock();
+	shp = ipc_obtain_object_idr(ids, sfd->id);
+	if (IS_ERR(shp)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!ipc_valid_object(shp)) {
+		ret = -EIDRM;
+		goto out;
+	}
+out:
+	rcu_read_unlock();
+	return ret;
+}
+
  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
  {
  	struct shm_file_data *sfd = shm_file_data(file);
  	int ret;
  
+	/*
+	 * Ensure that we have not raced with IPC_RMID, such that
+	 * we avoid doing the ->mmap altogether. This is a preventive
+	 * lockless check, and thus exposed to races during the mmap.
+	 * However, this is later caught in shm_open(), and handled
+	 * accordingly.
+	 */
+	ret = shm_check_vma_validity(vma);
+	if (ret)
+		return ret;
+
  	ret = sfd->file->f_op->mmap(sfd->file, vma);
  	if (ret != 0)
  		return ret;
+
  	sfd->vm_ops = vma->vm_ops;
  #ifdef CONFIG_MMU
  	WARN_ON(!sfd->vm_ops->fault);
@@ -1193,6 +1248,19 @@ out_fput:
  out_nattch:
  	down_write(&shm_ids(ns).rwsem);
  	shp = shm_lock(ns, shmid);
+	if (unlikely(IS_ERR(shp))) {
+		up_write(&shm_ids(ns).rwsem);
+		err = PTR_ERR(shp);
+		/*
+		 * Before dropping the lock, nattch was incremented,
+		 * thus we cannot race with IPC_RMID (ipc object is
+		 * marked IPC_PRIVATE). As such, this scenario should
+		 * _never_ occur.
+		 */
+		 WARN_ON(1);
+		 goto out;
+	}
+
  	shp->shm_nattch--;
  	if (shm_may_destroy(ns, shp))
  		shm_destroy(ns, shp);
-- 
2.1.4


  reply	other threads:[~2015-10-13  3:18 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-12  9:55 GPF in shm_lock ipc Dmitry Vyukov
2015-10-12 11:41 ` Vlastimil Babka
2015-10-12 11:44   ` Dmitry Vyukov
2015-10-12 12:27 ` Kirill A. Shutemov
2015-10-12 17:49   ` Davidlohr Bueso
2015-10-12 18:10     ` Kirill A. Shutemov
2015-10-12 18:55       ` Davidlohr Bueso
2015-10-13  3:18         ` Davidlohr Bueso [this message]
2015-10-13 12:30           ` Kirill A. Shutemov
2015-10-29 15:33             ` Dmitry Vyukov
2015-11-05 14:23               ` Kirill A. Shutemov
2015-12-21 15:44                 ` Dmitry Vyukov
2016-01-02 11:33                   ` Manfred Spraul
2016-01-02 12:19                     ` Dmitry Vyukov
2016-01-02 15:58                       ` Manfred Spraul
2016-02-02  3:25                   ` Andrew Morton
2016-02-02 21:32                     ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151013031821.GA3052@linux-uzut.site \
    --to=dave@stgolabs.net \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@google.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dvyukov@google.com \
    --cc=gang.chen.5i5j@gmail.com \
    --cc=glider@google.com \
    --cc=hughd@google.com \
    --cc=joe@perches.com \
    --cc=kcc@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=manfred@colorfullife.com \
    --cc=mhocko@suse.cz \
    --cc=oleg@redhat.com \
    --cc=pfeiner@google.com \
    --cc=riel@redhat.com \
    --cc=sasha.levin@oracle.com \
    --cc=sds@tycho.nsa.gov \
    --cc=syzkaller@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).