From: "Kirill A. Shutemov"
To: Dmitry Vyukov
Cc: "Kirill A. Shutemov", Andrew Morton, dave.hansen@linux.intel.com,
	Hugh Dickins, Joe Perches, sds@tycho.nsa.gov, Oleg Nesterov,
	"Kirill A. Shutemov", Rik van Riel, mhocko@suse.cz,
	gang.chen.5i5j@gmail.com, Peter Feiner, aarcange@redhat.com,
	"linux-mm@kvack.org", LKML, syzkaller@googlegroups.com,
	Kostya Serebryany, Alexander Potapenko, Andrey Konovalov,
	Sasha Levin, Manfred Spraul
In-Reply-To:
References: <20151012122702.GC2544@node>
	<20151012174945.GC3170@linux-uzut.site>
	<20151012181040.GC6447@node>
	<20151012185533.GD3170@linux-uzut.site>
	<20151013031821.GA3052@linux-uzut.site>
	<20151013123028.GA12934@node>
Subject: Re: GPF in shm_lock ipc
Message-Id: <20151105142336.46D907FD@black.fi.intel.com>
Date: Thu, 5 Nov 2015 16:23:36 +0200 (EET)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Dmitry Vyukov wrote:
> On Tue, Oct 13, 2015 at 8:30 PM, Kirill A. Shutemov wrote:
> > On Mon, Oct 12, 2015 at 08:18:21PM -0700, Davidlohr Bueso wrote:
> >> On Mon, 12 Oct 2015, Bueso wrote:
> >>
> >> >On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
> >> >
> >> >>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
> >> >>>diff --git a/ipc/shm.c b/ipc/shm.c
> >> >>>index 4178727..9615f19 100644
> >> >>>--- a/ipc/shm.c
> >> >>>+++ b/ipc/shm.c
> >> >>>@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
> >> >>> static int shm_mmap(struct file *file, struct vm_area_struct *vma)
> >> >>> {
> >> >>>-	struct shm_file_data *sfd = shm_file_data(file);
> >> >>>+	struct file *vma_file = vma->vm_file;
> >> >>>+	struct shm_file_data *sfd = shm_file_data(vma_file);
> >> >>>+	struct ipc_ids *ids = &shm_ids(sfd->ns);
> >> >>>+	struct kern_ipc_perm *shp;
> >> >>> 	int ret;
> >> >>>
> >> >>>+	rcu_read_lock();
> >> >>>+	shp = ipc_obtain_object_check(ids, sfd->id);
> >> >>>+	if (IS_ERR(shp)) {
> >> >>>+		ret = -EINVAL;
> >> >>>+		goto err;
> >> >>>+	}
> >> >>>+
> >> >>>+	if (!ipc_valid_object(shp)) {
> >> >>>+		ret = -EIDRM;
> >> >>>+		goto err;
> >> >>>+	}
> >> >>>+	rcu_read_unlock();
> >> >>>+
> >> >>
> >> >>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
> >> >
> >> >Nothing, but that is later caught by shm_open() doing similar checks. We
> >> >basically end up doing a check between ->mmap() calls, which is fair imho.
> >> >Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
> >> >and we try to respect it -- thus you can argue this race anywhere, which is
> >> >why we have EIDRM/EINVAL. Ultimately the user should not be doing such hacks
> >> >_anyway_. So I'm not really concerned about it.
> >> >
> >> >Another similar alternative would be perhaps to make shm_lock() return an
> >> >error, and thus propagate that error to the mmap return. That way we would
> >> >have a silent way out of the warning scenario (afterward we cannot race as
> >> >we hold the ipc object lock). However, the users would now have to take this
> >> >into account...
> >> >
> >> >	[validity check lockless]
> >> >	->mmap()
> >> >	[validity check lock]
> >>
> >> Something like this, maybe. Although I could easily be missing things...
> >> I've tested it enough to see Dmitry's testcase handled ok, and put it
> >> through ltp. Also adding Manfred to the Cc, who always catches my idiotic
> >> mistakes.
> >>
> >> 8<---------------------------------------------------------------------
> >> From: Davidlohr Bueso
> >> Date: Mon, 12 Oct 2015 19:38:34 -0700
> >> Subject: [PATCH] ipc/shm: fix handling of (re)attaching to a deleted segment
> >>
> >> There are currently two issues when dealing with segments that are
> >> marked for deletion:
> >>
> >> (i) With d0edd8528362 (ipc: convert invalid scenarios to use WARN_ON)
> >> we relaxed the system-wide impact of using a deleted segment. However,
> >> we can now perfectly well trigger the warning and then dereference a nil
> >> pointer -- where shp does not exist.
> >>
> >> (ii) As of a399b29dfbaa (ipc,shm: fix shm_file deletion races) we
> >> forbid attaching/mapping a previously deleted segment; a feature once
> >> unique to Linux, but removed[1] as a side effect of lockless ipc object
> >> lookups and security checks. Similarly, Dmitry Vyukov reported[2] a
> >> simple test case that creates a new vma for a previously deleted
> >> segment, triggering the WARN_ON mentioned in (i).
> >>
> >> This patch tries to address (i) by moving the shp error check out
> >> of shm_lock() and having it handled by the caller instead. The benefit
> >> of this is that it allows better handling of situations where we end up
> >> returning EIDRM or EINVAL. Specifically, there are three callers
> >> of shm_lock which we must look into:
> >>
> >> - open/close -- where we make sure never to do any operations on
> >>                 the pair, thus becoming no-ops if a previous
> >>                 IPC_RMID is found.
> >>
> >> - losing the reference of nattch upon shmat(2) -- not feasible.
> >>
> >> In addition, the common WARN_ON call is technically removed, but
> >> we add a new one for the bogus shmat(2) case, which is definitely
> >> unacceptable to race with RMID if nattch is bumped up.
> >>
> >> To address (ii), a new shm_check_vma_validity() helper is added
> >> (for lack of a better name), which attempts to detect early on
> >> any races with RMID, before doing the full ->mmap. There is still
> >> a window between the callback and the shm_open call where we can
> >> race with IPC_RMID. If this is the case, it is handled by the next
> >> shm_lock().
> >>
> >> shm_mmap:
> >>   [shm validity checks lockless]
> >>   ->mmap()
> >>   [shm validity checks lock]  <-- at this point and thereafter there
> >>                                   is no race as we hold the ipc
> >>                                   object lock.
> >>
> >> [1] https://lkml.org/lkml/2015/10/12/483
> >> [2] https://lkml.org/lkml/2015/10/12/284
> >>
> >> Signed-off-by: Davidlohr Bueso
> >> ---
> >>  ipc/shm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
> >>  1 file changed, 73 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/ipc/shm.c b/ipc/shm.c
> >> index 4178727..47a7a67 100644
> >> --- a/ipc/shm.c
> >> +++ b/ipc/shm.c
> >> @@ -156,11 +156,10 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
> >>  	struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
> >>
> >>  	/*
> >> -	 * We raced in the idr lookup or with shm_destroy(). Either way, the
> >> -	 * ID is busted.
> >> +	 * Callers of shm_lock() must validate the status of the returned
> >> +	 * ipc object pointer (as returned by ipc_lock()), and error out as
> >> +	 * appropriate.
> >>  	 */
> >> -	WARN_ON(IS_ERR(ipcp));
> >> -
> >>  	return container_of(ipcp, struct shmid_kernel, shm_perm);
> >>  }
> >>
> >> @@ -194,6 +193,15 @@ static void shm_open(struct vm_area_struct *vma)
> >>  	struct shmid_kernel *shp;
> >>
> >>  	shp = shm_lock(sfd->ns, sfd->id);
> >> +	/*
> >> +	 * We raced in the idr lookup or with shm_destroy().
> >> +	 * Either way, the ID is busted. In the same scenario,
> >> +	 * but for the close counter-part, the nattch counter
> >> +	 * is never decreased, thus we can safely return.
> >> +	 */
> >> +	if (IS_ERR(shp))
> >> +		return; /* no-op */
> >> +
> >>  	shp->shm_atim = get_seconds();
> >>  	shp->shm_lprid = task_tgid_vnr(current);
> >>  	shp->shm_nattch++;
> >
> > ...
> >
> >>  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
> >>  {
> >>  	struct shm_file_data *sfd = shm_file_data(file);
> >>  	int ret;
> >>
> >> +	/*
> >> +	 * Ensure that we have not raced with IPC_RMID, such that
> >> +	 * we avoid doing the ->mmap altogether. This is a preventive
> >> +	 * lockless check, and thus exposed to races during the mmap.
> >> +	 * However, this is later caught in shm_open(), and handled
> >> +	 * accordingly.
> >> +	 */
> >> +	ret = shm_check_vma_validity(vma);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >>  	ret = sfd->file->f_op->mmap(sfd->file, vma);
> >>  	if (ret != 0)
> >>  		return ret;
> >> +
> >>  	sfd->vm_ops = vma->vm_ops;
> >>  #ifdef CONFIG_MMU
> >>  	WARN_ON(!sfd->vm_ops->fault);
> >
> > If I read it correctly, with the patch we would ignore locking failure
> > inside shm_open() and mmap will succeed in this case. So the idea is to
> > have shm_close() be a no-op and therefore symmetrical. That looks fragile
> > to me. We would silently miss some other broken open/close pattern.
> >
> > I would rather propagate the error to the shm_mmap() caller and therefore
> > to userspace. I guess it's better to open-code shm_open() in shm_mmap() and
> > return the error this way. shm_open() itself can have WARN_ON_ONCE() for
> > failure or something.
>
> Davidlohr, any updates on this? Is it committed? I don't see it in Linus'
> tree. What do you think about Kirill's comments?

What about this:

From 06b0fc9d62592f6f3ad9f45cccf1f6a5b3113bdc Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov"
Date: Thu, 5 Nov 2015 15:53:04 +0200
Subject: [PATCH] ipc/shm: handle removed segments gracefully in shm_mmap()

remap_file_pages(2) emulation can reach a file which represents a removed
IPC ID as long as a memory segment is mapped. It breaks the expectations
of the IPC subsystem.

Test case (rewritten to be more human readable, originally autogenerated
by syzkaller[1]):

	#define _GNU_SOURCE
	#include <stdlib.h>
	#include <sys/ipc.h>
	#include <sys/shm.h>
	#include <sys/mman.h>

	#define PAGE_SIZE 4096

	int main()
	{
		int id;
		void *p;

		id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
		p = shmat(id, NULL, 0);
		shmctl(id, IPC_RMID, NULL);
		remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

		return 0;
	}

The patch changes shm_mmap() and the code around shm_lock() to propagate
the locking error back to the caller of shm_mmap().

[1] http://github.com/google/syzkaller

Signed-off-by: Kirill A. Shutemov
Reported-by: Dmitry Vyukov
Cc: Davidlohr Bueso
---
 ipc/shm.c | 53 +++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 41787276e141..3174634ca4e5 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -156,11 +156,12 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
 	struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
 
 	/*
-	 * We raced in the idr lookup or with shm_destroy(). Either way, the
-	 * ID is busted.
+	 * Callers of shm_lock() must validate the status of the returned ipc
+	 * object pointer (as returned by ipc_lock()), and error out as
+	 * appropriate.
 	 */
-	WARN_ON(IS_ERR(ipcp));
-
+	if (IS_ERR(ipcp))
+		return (void *)ipcp;
 	return container_of(ipcp, struct shmid_kernel, shm_perm);
 }
 
@@ -186,18 +187,33 @@ static inline void shm_rmid(struct ipc_namespace *ns, struct shmid_kernel *s)
 }
 
-/* This is called by fork, once for every shm attach. */
-static void shm_open(struct vm_area_struct *vma)
+static int __shm_open(struct vm_area_struct *vma)
 {
 	struct file *file = vma->vm_file;
 	struct shm_file_data *sfd = shm_file_data(file);
 	struct shmid_kernel *shp;
 
 	shp = shm_lock(sfd->ns, sfd->id);
+
+	if (IS_ERR(shp))
+		return PTR_ERR(shp);
+
 	shp->shm_atim = get_seconds();
 	shp->shm_lprid = task_tgid_vnr(current);
 	shp->shm_nattch++;
 	shm_unlock(shp);
+	return 0;
+}
+
+/* This is called by fork, once for every shm attach. */
+static void shm_open(struct vm_area_struct *vma)
+{
+	int err = __shm_open(vma);
+
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted.
+	 */
+	WARN_ON_ONCE(err);
 }
 
 /*
@@ -260,6 +276,14 @@ static void shm_close(struct vm_area_struct *vma)
 	down_write(&shm_ids(ns).rwsem);
 	/* remove from the list of attaches of the shm segment */
 	shp = shm_lock(ns, sfd->id);
+
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted.
+	 */
+	if (WARN_ON_ONCE(IS_ERR(shp)))
+		goto done; /* no-op */
+
 	shp->shm_lprid = task_tgid_vnr(current);
 	shp->shm_dtim = get_seconds();
 	shp->shm_nattch--;
@@ -267,6 +291,7 @@ static void shm_close(struct vm_area_struct *vma)
 		shm_destroy(ns, shp);
 	else
 		shm_unlock(shp);
+done:
 	up_write(&shm_ids(ns).rwsem);
 }
 
@@ -388,17 +413,25 @@ static int shm_mmap(struct file *file, struct vm_area_struct *vma)
 	struct shm_file_data *sfd = shm_file_data(file);
 	int ret;
 
+	/*
+	 * In case of remap_file_pages() emulation, the file can represent a
+	 * removed IPC ID: propagate the shm_lock() error to the caller.
+	 */
+	ret = __shm_open(vma);
+	if (ret)
+		return ret;
+
 	ret = sfd->file->f_op->mmap(sfd->file, vma);
-	if (ret != 0)
+	if (ret) {
+		shm_close(vma);
 		return ret;
+	}
 	sfd->vm_ops = vma->vm_ops;
 #ifdef CONFIG_MMU
 	WARN_ON(!sfd->vm_ops->fault);
 #endif
 	vma->vm_ops = &shm_vm_ops;
-	shm_open(vma);
-
-	return ret;
+	return 0;
 }
 
 static int shm_release(struct inode *ino, struct file *file)
--
2.6.1
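
P.S. A minimal userspace sketch of the behaviour the patch aims for, for
illustration only (not part of the patch): once shm_mmap() propagates the
shm_lock() failure, remap_file_pages(2) on a segment that was removed with
IPC_RMID should fail cleanly instead of tripping the kernel warning. The
exact errno (likely EINVAL, possibly EIDRM) and the 0600 mode used below
are assumptions:

	#define _GNU_SOURCE
	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/ipc.h>
	#include <sys/shm.h>
	#include <sys/mman.h>

	#define PAGE_SIZE 4096

	int main(void)
	{
		/* Create and attach a private segment, then mark it removed. */
		int id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0600);
		void *p = shmat(id, NULL, 0);

		shmctl(id, IPC_RMID, NULL);

		/*
		 * With the patch applied this call is expected to fail;
		 * the errno value is an assumption (likely EINVAL).
		 */
		if (remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0) == -1)
			printf("remap_file_pages: %s (expected failure)\n",
			       strerror(errno));
		else
			printf("remap_file_pages unexpectedly succeeded\n");

		return 0;
	}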