linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* GPF in shm_lock ipc
@ 2015-10-12  9:55 Dmitry Vyukov
  2015-10-12 11:41 ` Vlastimil Babka
  2015-10-12 12:27 ` Kirill A. Shutemov
  0 siblings, 2 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2015-10-12  9:55 UTC (permalink / raw)
  To: Andrew Morton, dave, dave.hansen, Hugh Dickins, Joe Perches, sds,
	Oleg Nesterov, Kirill A. Shutemov, Rik van Riel, mhocko,
	gang.chen.5i5j, Peter Feiner, aarcange, linux-mm, LKML
  Cc: syzkaller, Kostya Serebryany, Alexander Potapenko,
	Andrey Konovalov, Sasha Levin

Hello,

The following program crashes kernel:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <syscall.h>
#include <string.h>
#include <stdint.h>

int main()
{
        long r0 = syscall(SYS_shmget, 0x0ul, 0x2ul, 0x8ul);
        long r1 = syscall(SYS_shmat, r0, 0x20000000ul, 0x0ul);
        long r2 = syscall(SYS_mremap, 0x20000000ul, 0x1000ul,
0x3000ul, 0x3ul, 0x207f9000ul);
        long r19 = syscall(SYS_shmctl, r0, 0x0ul, 0);
        long r20 = syscall(SYS_remap_file_pages, 0x207f9000ul,
0x3000ul, 0x0ul, 0x7ul, 0x0ul);
        return 0;
}

On commit dd36d7393d6310b0c1adefb22fba79c3cf8a577c
(git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)

------------[ cut here ]------------
WARNING: CPU: 2 PID: 2636 at ipc/shm.c:162 shm_open+0x74/0x80()
Modules linked in:
CPU: 2 PID: 2636 Comm: a.out Not tainted 4.3.0-rc3+ #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffffffff81bcb43c ffff88081bf0bd70 ffffffff812fe8d6 0000000000000000
 ffff88081bf0bda8 ffffffff81051ff1 ffffffffffffffea ffff88081b896ca8
 ffff880819b81620 ffff8800bbaa6d00 ffff880819b81600 ffff88081bf0bdb8
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff812fe8d6>] dump_stack+0x44/0x5e lib/dump_stack.c:50
 [<ffffffff81051ff1>] warn_slowpath_common+0x81/0xc0 kernel/panic.c:447
 [<ffffffff810520e5>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
 [<     inline     >] shm_lock ipc/shm.c:162
 [<ffffffff81295c64>] shm_open+0x74/0x80 ipc/shm.c:196
 [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
 [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
 [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
 [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
 [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694
 [<ffffffff811434a9>] SyS_remap_file_pages+0x179/0x240 mm/mmap.c:2641
 [<ffffffff81859a97>] entry_SYSCALL_64_fastpath+0x12/0x6a
arch/x86/entry/entry_64.S:185
---[ end trace 0873e743fc645a8c ]---
BUG: unable to handle kernel NULL pointer dereference at 000000000000003a
IP: [<ffffffff81295c25>] shm_open+0x35/0x80 ipc/shm.c:197
PGD 81a08b067 PUD 81a01b067 PMD 0
Oops: 0002 [#1] SMP
Modules linked in:
CPU: 2 PID: 2636 Comm: a.out Tainted: G        W       4.3.0-rc3+ #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880819b26800 ti: ffff88081bf08000 task.ti: ffff88081bf08000
RIP: 0010:[<ffffffff81295c25>]  [<ffffffff81295c25>] shm_open+0x35/0x80
RSP: 0018:ffff88081bf0bdc8  EFLAGS: 00010296
RAX: 00000000560d1d13 RBX: ffffffffffffffea RCX: ffffffff8151e710
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000009
RBP: ffff88081bf0bdd0 R08: 000000000000000a R09: 0000000000000001
R10: 0000000000000000 R11: 00000000000001bc R12: ffff88081b896ca8
R13: ffff880819b81620 R14: ffff8800bbaa6d00 R15: ffff880819b81600
FS:  00007fbfd966f700(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000003a CR3: 0000000819a97000 CR4: 00000000000006e0
Stack:
 ffff880819b81200 ffff88081bf0bdf8 ffffffff81295cbe 0000000019b81620
 ffff880819b81628 00000000207f9000 ffff88081bf0be80 ffffffff81142d14
 ffff8800bbaa6d00 0000000000000007 0000000000000000 0000000000000007
Call Trace:
 [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
 [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
 [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
 [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
 [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694
 [<ffffffff811434a9>] SyS_remap_file_pages+0x179/0x240 mm/mmap.c:2641
 [<ffffffff81859a97>] entry_SYSCALL_64_fastpath+0x12/0x6a
arch/x86/entry/entry_64.S:185
Code: 00 00 48 8b 80 d0 00 00 00 48 8b 50 08 8b 30 48 8d ba c8 00 00
00 e8 6b b1 ff ff 48 3d 00 f0 ff ff 48 89 c3 77 33 e8 5b 6b e1 ff <48>
89 43 50 65 48 8b 04 25 c0 ad 00 00 48 8b 80 40 04 00 00 48
RIP  [<ffffffff81295c25>] shm_open+0x35/0x80 ipc/shm.c:197
 RSP <ffff88081bf0bdc8>
CR2: 000000000000003a
---[ end trace 0873e743fc645a8d ]---

Found with syzkaller fuzzer.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12  9:55 GPF in shm_lock ipc Dmitry Vyukov
@ 2015-10-12 11:41 ` Vlastimil Babka
  2015-10-12 11:44   ` Dmitry Vyukov
  2015-10-12 12:27 ` Kirill A. Shutemov
  1 sibling, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2015-10-12 11:41 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrew Morton, dave, dave.hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner, aarcange,
	linux-mm, LKML
  Cc: syzkaller, Kostya Serebryany, Alexander Potapenko,
	Andrey Konovalov, Sasha Levin

On 10/12/2015 11:55 AM, Dmitry Vyukov wrote:
> Hello,
>
> The following program crashes kernel:
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include <syscall.h>
> #include <string.h>
> #include <stdint.h>
>
> int main()
> {
>          long r0 = syscall(SYS_shmget, 0x0ul, 0x2ul, 0x8ul);
>          long r1 = syscall(SYS_shmat, r0, 0x20000000ul, 0x0ul);
>          long r2 = syscall(SYS_mremap, 0x20000000ul, 0x1000ul,
> 0x3000ul, 0x3ul, 0x207f9000ul);
>          long r19 = syscall(SYS_shmctl, r0, 0x0ul, 0);
>          long r20 = syscall(SYS_remap_file_pages, 0x207f9000ul,
> 0x3000ul, 0x0ul, 0x7ul, 0x0ul);
>          return 0;
> }
>
> On commit dd36d7393d6310b0c1adefb22fba79c3cf8a577c
> (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)
>
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 2636 at ipc/shm.c:162 shm_open+0x74/0x80()
> Modules linked in:
> CPU: 2 PID: 2636 Comm: a.out Not tainted 4.3.0-rc3+ #37
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>   ffffffff81bcb43c ffff88081bf0bd70 ffffffff812fe8d6 0000000000000000
>   ffff88081bf0bda8 ffffffff81051ff1 ffffffffffffffea ffff88081b896ca8
>   ffff880819b81620 ffff8800bbaa6d00 ffff880819b81600 ffff88081bf0bdb8
> Call Trace:
>   [<     inline     >] __dump_stack lib/dump_stack.c:15
>   [<ffffffff812fe8d6>] dump_stack+0x44/0x5e lib/dump_stack.c:50
>   [<ffffffff81051ff1>] warn_slowpath_common+0x81/0xc0 kernel/panic.c:447
>   [<ffffffff810520e5>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
>   [<     inline     >] shm_lock ipc/shm.c:162
>   [<ffffffff81295c64>] shm_open+0x74/0x80 ipc/shm.c:196
>   [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
>   [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
>   [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
>   [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
>   [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694

Hmm what kind of stack unwinder catches inlines? Some external patch, 
based on debuginfo?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12 11:41 ` Vlastimil Babka
@ 2015-10-12 11:44   ` Dmitry Vyukov
  0 siblings, 0 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2015-10-12 11:44 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, dave, dave.hansen, Hugh Dickins, Joe Perches, sds,
	Oleg Nesterov, Kirill A. Shutemov, Rik van Riel, mhocko,
	gang.chen.5i5j, Peter Feiner, aarcange, linux-mm, LKML,
	syzkaller, Kostya Serebryany, Alexander Potapenko,
	Andrey Konovalov, Sasha Levin

On Mon, Oct 12, 2015 at 1:41 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 10/12/2015 11:55 AM, Dmitry Vyukov wrote:
>>
>> Hello,
>>
>> The following program crashes kernel:
>>
>> // autogenerated by syzkaller (http://github.com/google/syzkaller)
>> #include <syscall.h>
>> #include <string.h>
>> #include <stdint.h>
>>
>> int main()
>> {
>>          long r0 = syscall(SYS_shmget, 0x0ul, 0x2ul, 0x8ul);
>>          long r1 = syscall(SYS_shmat, r0, 0x20000000ul, 0x0ul);
>>          long r2 = syscall(SYS_mremap, 0x20000000ul, 0x1000ul,
>> 0x3000ul, 0x3ul, 0x207f9000ul);
>>          long r19 = syscall(SYS_shmctl, r0, 0x0ul, 0);
>>          long r20 = syscall(SYS_remap_file_pages, 0x207f9000ul,
>> 0x3000ul, 0x0ul, 0x7ul, 0x0ul);
>>          return 0;
>> }
>>
>> On commit dd36d7393d6310b0c1adefb22fba79c3cf8a577c
>> (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 2 PID: 2636 at ipc/shm.c:162 shm_open+0x74/0x80()
>> Modules linked in:
>> CPU: 2 PID: 2636 Comm: a.out Not tainted 4.3.0-rc3+ #37
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
>> 01/01/2011
>>   ffffffff81bcb43c ffff88081bf0bd70 ffffffff812fe8d6 0000000000000000
>>   ffff88081bf0bda8 ffffffff81051ff1 ffffffffffffffea ffff88081b896ca8
>>   ffff880819b81620 ffff8800bbaa6d00 ffff880819b81600 ffff88081bf0bdb8
>> Call Trace:
>>   [<     inline     >] __dump_stack lib/dump_stack.c:15
>>   [<ffffffff812fe8d6>] dump_stack+0x44/0x5e lib/dump_stack.c:50
>>   [<ffffffff81051ff1>] warn_slowpath_common+0x81/0xc0 kernel/panic.c:447
>>   [<ffffffff810520e5>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
>>   [<     inline     >] shm_lock ipc/shm.c:162
>>   [<ffffffff81295c64>] shm_open+0x74/0x80 ipc/shm.c:196
>>   [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
>>   [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
>>   [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
>>   [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
>>   [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694
>
>
> Hmm what kind of stack unwinder catches inlines? Some external patch, based
> on debuginfo?

We use the following script to symbolize kernel stack traces. It adds
file:line info and inlined frames.
https://github.com/google/sanitizers/blob/master/address-sanitizer/tools/kasan_symbolize.py

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12  9:55 GPF in shm_lock ipc Dmitry Vyukov
  2015-10-12 11:41 ` Vlastimil Babka
@ 2015-10-12 12:27 ` Kirill A. Shutemov
  2015-10-12 17:49   ` Davidlohr Bueso
  1 sibling, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-10-12 12:27 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrew Morton, dave, dave.hansen, Hugh Dickins, Joe Perches, sds,
	Oleg Nesterov, Kirill A. Shutemov, Rik van Riel, mhocko,
	gang.chen.5i5j, Peter Feiner, aarcange, linux-mm, LKML,
	syzkaller, Kostya Serebryany, Alexander Potapenko,
	Andrey Konovalov, Sasha Levin

On Mon, Oct 12, 2015 at 11:55:44AM +0200, Dmitry Vyukov wrote:
> Hello,
> 
> The following program crashes kernel:
> 
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include <syscall.h>
> #include <string.h>
> #include <stdint.h>
> 
> int main()
> {
>         long r0 = syscall(SYS_shmget, 0x0ul, 0x2ul, 0x8ul);
>         long r1 = syscall(SYS_shmat, r0, 0x20000000ul, 0x0ul);
>         long r2 = syscall(SYS_mremap, 0x20000000ul, 0x1000ul,
> 0x3000ul, 0x3ul, 0x207f9000ul);
>         long r19 = syscall(SYS_shmctl, r0, 0x0ul, 0);
>         long r20 = syscall(SYS_remap_file_pages, 0x207f9000ul,
> 0x3000ul, 0x0ul, 0x7ul, 0x0ul);
>         return 0;
> }

Here's slightly simplified and more human readable reproducer:

#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

#define PAGE_SIZE 4096

int main()
{
	int id;
	void *p;

	id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
	p = shmat(id, NULL, 0);
	shmctl(id, IPC_RMID, NULL);
	remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

        return 0;
}

> 
> On commit dd36d7393d6310b0c1adefb22fba79c3cf8a577c
> (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)
> 
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 2636 at ipc/shm.c:162 shm_open+0x74/0x80()
> Modules linked in:
> CPU: 2 PID: 2636 Comm: a.out Not tainted 4.3.0-rc3+ #37
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffffffff81bcb43c ffff88081bf0bd70 ffffffff812fe8d6 0000000000000000
>  ffff88081bf0bda8 ffffffff81051ff1 ffffffffffffffea ffff88081b896ca8
>  ffff880819b81620 ffff8800bbaa6d00 ffff880819b81600 ffff88081bf0bdb8
> Call Trace:
>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>  [<ffffffff812fe8d6>] dump_stack+0x44/0x5e lib/dump_stack.c:50
>  [<ffffffff81051ff1>] warn_slowpath_common+0x81/0xc0 kernel/panic.c:447
>  [<ffffffff810520e5>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
>  [<     inline     >] shm_lock ipc/shm.c:162
>  [<ffffffff81295c64>] shm_open+0x74/0x80 ipc/shm.c:196
>  [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
>  [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
>  [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
>  [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
>  [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694
>  [<ffffffff811434a9>] SyS_remap_file_pages+0x179/0x240 mm/mmap.c:2641
>  [<ffffffff81859a97>] entry_SYSCALL_64_fastpath+0x12/0x6a
> arch/x86/entry/entry_64.S:185
> ---[ end trace 0873e743fc645a8c ]---

Okay. The problem is that SysV IPC SHM doesn't expect the memory region to
be mmap'ed after IPC_RMID, but remap_file_pages() manages to create new
VMA using existing one.

I'm not sure what the right way to fix it. The SysV SHM VMA is pretty
normal from mm POV (no special flags, etc.) and it meats remap_file_pages
criteria (shared mapping). Every fix I can think of on mm side is ugly.

Probably better to teach shm_mmap() to fall off gracefully in case of
non-existing shmid? I'm not familiar with IPC code.
Could anyone look into it?

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12 12:27 ` Kirill A. Shutemov
@ 2015-10-12 17:49   ` Davidlohr Bueso
  2015-10-12 18:10     ` Kirill A. Shutemov
  0 siblings, 1 reply; 17+ messages in thread
From: Davidlohr Bueso @ 2015-10-12 17:49 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dmitry Vyukov, Andrew Morton, dave.hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner, aarcange,
	linux-mm, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin

On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:

>On Mon, Oct 12, 2015 at 11:55:44AM +0200, Dmitry Vyukov wrote:
>Here's slightly simplified and more human readable reproducer:
>
>#define _GNU_SOURCE
>#include <stdlib.h>
>#include <sys/ipc.h>
>#include <sys/mman.h>
>#include <sys/shm.h>
>
>#define PAGE_SIZE 4096
>
>int main()
>{
>	int id;
>	void *p;
>
>	id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
>	p = shmat(id, NULL, 0);
>	shmctl(id, IPC_RMID, NULL);
>	remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);
>
>        return 0;
>}

Thanks!

>>
>> On commit dd36d7393d6310b0c1adefb22fba79c3cf8a577c
>> (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 2 PID: 2636 at ipc/shm.c:162 shm_open+0x74/0x80()
>> Modules linked in:
>> CPU: 2 PID: 2636 Comm: a.out Not tainted 4.3.0-rc3+ #37
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>  ffffffff81bcb43c ffff88081bf0bd70 ffffffff812fe8d6 0000000000000000
>>  ffff88081bf0bda8 ffffffff81051ff1 ffffffffffffffea ffff88081b896ca8
>>  ffff880819b81620 ffff8800bbaa6d00 ffff880819b81600 ffff88081bf0bdb8
>> Call Trace:
>>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>>  [<ffffffff812fe8d6>] dump_stack+0x44/0x5e lib/dump_stack.c:50
>>  [<ffffffff81051ff1>] warn_slowpath_common+0x81/0xc0 kernel/panic.c:447
>>  [<ffffffff810520e5>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
>>  [<     inline     >] shm_lock ipc/shm.c:162
>>  [<ffffffff81295c64>] shm_open+0x74/0x80 ipc/shm.c:196
>>  [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
>>  [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
>>  [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
>>  [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
>>  [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694
>>  [<ffffffff811434a9>] SyS_remap_file_pages+0x179/0x240 mm/mmap.c:2641
>>  [<ffffffff81859a97>] entry_SYSCALL_64_fastpath+0x12/0x6a
>> arch/x86/entry/entry_64.S:185
>> ---[ end trace 0873e743fc645a8c ]---
>
>Okay. The problem is that SysV IPC SHM doesn't expect the memory region to
>be mmap'ed after IPC_RMID, but remap_file_pages() manages to create new
>VMA using existing one.

Indeed, naughty users should not be mapping/(re)attaching after IPC_RMID.
This is common to all things ipc, not only to shm. And while Linux nowadays
does enforce that nothing touch a segment marked for deletion[1], we have
contradictory scenarios where the resource is only freed once the last attached
process exits. 

[1] https://lkml.org/lkml/2015/10/12/483

So this warning used to in fact be a full BUG_ON, but ultimately the ipc
subsystem acknowledges that this situation is possible but fully blames the
user responsible, and therefore we only warn about bogus usage.

>I'm not sure what the right way to fix it. The SysV SHM VMA is pretty
>normal from mm POV (no special flags, etc.) and it meats remap_file_pages
>criteria (shared mapping). Every fix I can think of on mm side is ugly.
>
>Probably better to teach shm_mmap() to fall off gracefully in case of
>non-existing shmid? I'm not familiar with IPC code.
>Could anyone look into it?

Yeah, this was my approach as well. Very little tested other than it solves
the above warning. Basically we don't want to be doing mmap if the segment
was deleted, thus return a corresponding error instead of triggering the
same error later on after mmaping, via shm_open(). I still need to think
a bit more about this, but seems legit if we don't hurt userspace while
at it (at least the idea, not considering any overhead in doing the idr
lookup). Thoughts?

Thanks,
Davidlohr

diff --git a/ipc/shm.c b/ipc/shm.c
index 4178727..9615f19 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
  
  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
  {
-	struct shm_file_data *sfd = shm_file_data(file);
+	struct file *vma_file = vma->vm_file;
+	struct shm_file_data *sfd = shm_file_data(vma_file);
+	struct ipc_ids *ids = &shm_ids(sfd->ns);
+	struct kern_ipc_perm *shp;
  	int ret;
  
+	rcu_read_lock();
+	shp = ipc_obtain_object_check(ids, sfd->id);
+	if (IS_ERR(shp)) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	if (!ipc_valid_object(shp)) {
+		ret = -EIDRM;
+		goto err;
+	}
+	rcu_read_unlock();
+
  	ret = sfd->file->f_op->mmap(sfd->file, vma);
  	if (ret != 0)
  		return ret;
@@ -399,6 +415,9 @@ static int shm_mmap(struct file *file, struct vm_area_struct *vma)
  	shm_open(vma);
  
  	return ret;
+err:
+	rcu_read_unlock();
+	return ret;
  }
  
  static int shm_release(struct inode *ino, struct file *file)






--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12 17:49   ` Davidlohr Bueso
@ 2015-10-12 18:10     ` Kirill A. Shutemov
  2015-10-12 18:55       ` Davidlohr Bueso
  0 siblings, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-10-12 18:10 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrew Morton, dave.hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner, aarcange,
	linux-mm, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin

On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
> On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
> 
> >On Mon, Oct 12, 2015 at 11:55:44AM +0200, Dmitry Vyukov wrote:
> >Here's slightly simplified and more human readable reproducer:
> >
> >#define _GNU_SOURCE
> >#include <stdlib.h>
> >#include <sys/ipc.h>
> >#include <sys/mman.h>
> >#include <sys/shm.h>
> >
> >#define PAGE_SIZE 4096
> >
> >int main()
> >{
> >	int id;
> >	void *p;
> >
> >	id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
> >	p = shmat(id, NULL, 0);
> >	shmctl(id, IPC_RMID, NULL);
> >	remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);
> >
> >       return 0;
> >}
> 
> Thanks!
> 
> >>
> >>On commit dd36d7393d6310b0c1adefb22fba79c3cf8a577c
> >>(git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)
> >>
> >>------------[ cut here ]------------
> >>WARNING: CPU: 2 PID: 2636 at ipc/shm.c:162 shm_open+0x74/0x80()
> >>Modules linked in:
> >>CPU: 2 PID: 2636 Comm: a.out Not tainted 4.3.0-rc3+ #37
> >>Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >> ffffffff81bcb43c ffff88081bf0bd70 ffffffff812fe8d6 0000000000000000
> >> ffff88081bf0bda8 ffffffff81051ff1 ffffffffffffffea ffff88081b896ca8
> >> ffff880819b81620 ffff8800bbaa6d00 ffff880819b81600 ffff88081bf0bdb8
> >>Call Trace:
> >> [<     inline     >] __dump_stack lib/dump_stack.c:15
> >> [<ffffffff812fe8d6>] dump_stack+0x44/0x5e lib/dump_stack.c:50
> >> [<ffffffff81051ff1>] warn_slowpath_common+0x81/0xc0 kernel/panic.c:447
> >> [<ffffffff810520e5>] warn_slowpath_null+0x15/0x20 kernel/panic.c:480
> >> [<     inline     >] shm_lock ipc/shm.c:162
> >> [<ffffffff81295c64>] shm_open+0x74/0x80 ipc/shm.c:196
> >> [<ffffffff81295cbe>] shm_mmap+0x4e/0x80 ipc/shm.c:399 (discriminator 2)
> >> [<ffffffff81142d14>] mmap_region+0x3c4/0x5e0 mm/mmap.c:1627
> >> [<ffffffff81143227>] do_mmap+0x2f7/0x3d0 mm/mmap.c:1402
> >> [<     inline     >] do_mmap_pgoff include/linux/mm.h:1930
> >> [<     inline     >] SYSC_remap_file_pages mm/mmap.c:2694
> >> [<ffffffff811434a9>] SyS_remap_file_pages+0x179/0x240 mm/mmap.c:2641
> >> [<ffffffff81859a97>] entry_SYSCALL_64_fastpath+0x12/0x6a
> >>arch/x86/entry/entry_64.S:185
> >>---[ end trace 0873e743fc645a8c ]---
> >
> >Okay. The problem is that SysV IPC SHM doesn't expect the memory region to
> >be mmap'ed after IPC_RMID, but remap_file_pages() manages to create new
> >VMA using existing one.
> 
> Indeed, naughty users should not be mapping/(re)attaching after IPC_RMID.
> This is common to all things ipc, not only to shm. And while Linux nowadays
> does enforce that nothing touch a segment marked for deletion[1], we have
> contradictory scenarios where the resource is only freed once the last attached
> process exits.
> 
> [1] https://lkml.org/lkml/2015/10/12/483
> 
> So this warning used to in fact be a full BUG_ON, but ultimately the ipc
> subsystem acknowledges that this situation is possible but fully blames the
> user responsible, and therefore we only warn about bogus usage.
> 
> >I'm not sure what the right way to fix it. The SysV SHM VMA is pretty
> >normal from mm POV (no special flags, etc.) and it meats remap_file_pages
> >criteria (shared mapping). Every fix I can think of on mm side is ugly.
> >
> >Probably better to teach shm_mmap() to fall off gracefully in case of
> >non-existing shmid? I'm not familiar with IPC code.
> >Could anyone look into it?
> 
> Yeah, this was my approach as well. Very little tested other than it solves
> the above warning. Basically we don't want to be doing mmap if the segment
> was deleted, thus return a corresponding error instead of triggering the
> same error later on after mmaping, via shm_open(). I still need to think
> a bit more about this, but seems legit if we don't hurt userspace while
> at it (at least the idea, not considering any overhead in doing the idr
> lookup). Thoughts?
> 
> Thanks,
> Davidlohr
> 
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 4178727..9615f19 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
>  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>  {
> -	struct shm_file_data *sfd = shm_file_data(file);
> +	struct file *vma_file = vma->vm_file;
> +	struct shm_file_data *sfd = shm_file_data(vma_file);
> +	struct ipc_ids *ids = &shm_ids(sfd->ns);
> +	struct kern_ipc_perm *shp;
>  	int ret;
> +	rcu_read_lock();
> +	shp = ipc_obtain_object_check(ids, sfd->id);
> +	if (IS_ERR(shp)) {
> +		ret = -EINVAL;
> +		goto err;
> +	}
> +
> +	if (!ipc_valid_object(shp)) {
> +		ret = -EIDRM;
> +		goto err;
> +	}
> +	rcu_read_unlock();
> +

Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
Shouldn't we bump shm_nattch here? Or some other refcount?


>  	ret = sfd->file->f_op->mmap(sfd->file, vma);
>  	if (ret != 0)
>  		return ret;
> @@ -399,6 +415,9 @@ static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>  	shm_open(vma);
>  	return ret;
> +err:
> +	rcu_read_unlock();
> +	return ret;
>  }
>  static int shm_release(struct inode *ino, struct file *file)
> 
> 
> 
> 
> 
> 

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12 18:10     ` Kirill A. Shutemov
@ 2015-10-12 18:55       ` Davidlohr Bueso
  2015-10-13  3:18         ` Davidlohr Bueso
  0 siblings, 1 reply; 17+ messages in thread
From: Davidlohr Bueso @ 2015-10-12 18:55 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dmitry Vyukov, Andrew Morton, dave.hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner, aarcange,
	linux-mm, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin

On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:

>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
>> diff --git a/ipc/shm.c b/ipc/shm.c
>> index 4178727..9615f19 100644
>> --- a/ipc/shm.c
>> +++ b/ipc/shm.c
>> @@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
>>  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>>  {
>> -	struct shm_file_data *sfd = shm_file_data(file);
>> +	struct file *vma_file = vma->vm_file;
>> +	struct shm_file_data *sfd = shm_file_data(vma_file);
>> +	struct ipc_ids *ids = &shm_ids(sfd->ns);
>> +	struct kern_ipc_perm *shp;
>>  	int ret;
>> +	rcu_read_lock();
>> +	shp = ipc_obtain_object_check(ids, sfd->id);
>> +	if (IS_ERR(shp)) {
>> +		ret = -EINVAL;
>> +		goto err;
>> +	}
>> +
>> +	if (!ipc_valid_object(shp)) {
>> +		ret = -EIDRM;
>> +		goto err;
>> +	}
>> +	rcu_read_unlock();
>> +
>
>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?

Nothing, but that is later caught by shm_open() doing similar checks. We
basically end up doing a check between ->mmap() calls, which is fair imho.
Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
and we try to respect it -- thus you can argue this race anywhere, which is
why we have EIDRM/EINVL. Ultimately the user should not be doing such hacks
_anyway_. So I'm not really concerned about it.

Another similar alternative would be perhaps to make shm_lock() return an
error, and thus propagate that error to mmap return. That way we would have
a silent way out of the warning scenario (afterward we cannot race as we
hold the ipc object lock). However, the users would now have to take this
into account...

      [validity check lockless]
      ->mmap()
      [validity check lock]

>Shouldn't we bump shm_nattch here? Or some other refcount?

At least not shm_nattach, as that would acknowledge a new attachment after
a valid IPC_RMID. But the problem is also with how we check for marked for
deletion segments -- ipc_valid_object() checking the deleted flag. As such,
we always rely on explicitly checking against the deleted flag.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-12 18:55       ` Davidlohr Bueso
@ 2015-10-13  3:18         ` Davidlohr Bueso
  2015-10-13 12:30           ` Kirill A. Shutemov
  0 siblings, 1 reply; 17+ messages in thread
From: Davidlohr Bueso @ 2015-10-13  3:18 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dmitry Vyukov, Andrew Morton, dave.hansen,
	Hugh Dickins, Joe Perches, sds, Oleg Nesterov,
	Kirill A. Shutemov, Rik van Riel, mhocko, gang.chen.5i5j,
	Peter Feiner, aarcange, linux-mm, LKML, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Andrey Konovalov,
	Sasha Levin, Manfred Spraul

On Mon, 12 Oct 2015, Bueso wrote:

>On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
>
>>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
>>>diff --git a/ipc/shm.c b/ipc/shm.c
>>>index 4178727..9615f19 100644
>>>--- a/ipc/shm.c
>>>+++ b/ipc/shm.c
>>>@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
>>> static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>>> {
>>>-	struct shm_file_data *sfd = shm_file_data(file);
>>>+	struct file *vma_file = vma->vm_file;
>>>+	struct shm_file_data *sfd = shm_file_data(vma_file);
>>>+	struct ipc_ids *ids = &shm_ids(sfd->ns);
>>>+	struct kern_ipc_perm *shp;
>>> 	int ret;
>>>+	rcu_read_lock();
>>>+	shp = ipc_obtain_object_check(ids, sfd->id);
>>>+	if (IS_ERR(shp)) {
>>>+		ret = -EINVAL;
>>>+		goto err;
>>>+	}
>>>+
>>>+	if (!ipc_valid_object(shp)) {
>>>+		ret = -EIDRM;
>>>+		goto err;
>>>+	}
>>>+	rcu_read_unlock();
>>>+
>>
>>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
>
>Nothing, but that is later caught by shm_open() doing similar checks. We
>basically end up doing a check between ->mmap() calls, which is fair imho.
>Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
>and we try to respect it -- thus you can argue this race anywhere, which is
>why we have EIDRM/EINVL. Ultimately the user should not be doing such hacks
>_anyway_. So I'm not really concerned about it.
>
>Another similar alternative would be perhaps to make shm_lock() return an
>error, and thus propagate that error to mmap return. That way we would have
>a silent way out of the warning scenario (afterward we cannot race as we
>hold the ipc object lock). However, the users would now have to take this
>into account...
>
>     [validity check lockless]
>     ->mmap()
>     [validity check lock]

Something like this, maybe. Although I could easily be missing things...
I've tested it enough to see Dimitry's testcase handled ok, and put it
through ltp. Also adding Manfred to the Cc, who always catches my idiotic
mistakes.

8<---------------------------------------------------------------------
From: Davidlohr Bueso <dave@stgolabs.net>
Date: Mon, 12 Oct 2015 19:38:34 -0700
Subject: [PATCH] ipc/shm: fix handling of (re)attaching to a deleted segment

There are currently two issues when dealing with segments that are
marked for deletion:

(i) With d0edd8528362 (ipc: convert invalid scenarios to use WARN_ON)
we relaxed the system-wide impact of using a deleted segment. However,
we can now perfectly well trigger the warning and then deference a nil
pointer -- where shp does not exist.

(ii) As of a399b29dfbaa (ipc,shm: fix shm_file deletion races) we
forbid attaching/mapping a previously deleted segment; a feature once
unique to Linux, but removed[1] as a side effect of lockless ipc object
lookups and security checks. Similarly, Dmitry Vyukov reported[2] a
simple test case that creates a new vma for a previously deleted
segment, triggering the WARN_ON mentioned in (i).

This patch tries to address (i) by moving the shp error check out
of shm_lock() and handled by the caller instead. The benefit of this
is that it allows better handling out of situations where we end up
returning ERMID or EINVAL. Specifically, there are three callers
of shm_lock which we must look into:

  - open/close -- which we ensure to never do any operations on
                  the pairs, thus becoming no-ops if found a prev
		 IPC_RMID.

  - loosing the reference of nattch upon shmat(2) -- not feasible.

In addition, the common WARN_ON call is technically removed, but
we add a new one for the bogus shmat(2) case, which is definitely
unacceptable to race with RMID if nattch is bumped up.

To address (ii), a new shm_check_vma_validity() helper is added
(for lack of a better name), which attempts to detect early on
any races with RMID, before doing the full ->mmap. There is still
a window between the callback and the shm_open call where we can
race with IPC_RMID. If this is the case, it is handled by the next
shm_lock().

shm_mmap:
     [shm validity checks lockless]
     ->mmap()
     [shm validity checks lock] <-- at this point there after there
                                    is no race as we hold the ipc
                                    object lock.

[1] https://lkml.org/lkml/2015/10/12/483
[2] https://lkml.org/lkml/2015/10/12/284

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
  ipc/shm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
  1 file changed, 73 insertions(+), 5 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 4178727..47a7a67 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -156,11 +156,10 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
  	struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
  
  	/*
-	 * We raced in the idr lookup or with shm_destroy().  Either way, the
-	 * ID is busted.
+	 * Callers of shm_lock() must validate the status of the returned
+	 * ipc object pointer (as returned by ipc_lock()), and error out as
+	 * appropriate.
  	 */
-	WARN_ON(IS_ERR(ipcp));
-
  	return container_of(ipcp, struct shmid_kernel, shm_perm);
  }
  
@@ -194,6 +193,15 @@ static void shm_open(struct vm_area_struct *vma)
  	struct shmid_kernel *shp;
  
  	shp = shm_lock(sfd->ns, sfd->id);
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted. In the same scenario,
+	 * but for the close counter-part, the nattch counter
+	 * is never decreased, thus we can safely return.
+	 */
+	if (IS_ERR(shp))
+		return; /* no-op */
+
  	shp->shm_atim = get_seconds();
  	shp->shm_lprid = task_tgid_vnr(current);
  	shp->shm_nattch++;
@@ -218,6 +226,7 @@ static void shm_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp)
  	ns->shm_tot -= (shp->shm_segsz + PAGE_SIZE - 1) >> PAGE_SHIFT;
  	shm_rmid(ns, shp);
  	shm_unlock(shp);
+
  	if (!is_file_hugepages(shm_file))
  		shmem_lock(shm_file, 0, shp->mlock_user);
  	else if (shp->mlock_user)
@@ -258,8 +267,17 @@ static void shm_close(struct vm_area_struct *vma)
  	struct ipc_namespace *ns = sfd->ns;
  
  	down_write(&shm_ids(ns).rwsem);
-	/* remove from the list of attaches of the shm segment */
  	shp = shm_lock(ns, sfd->id);
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted. In the same scenario,
+	 * but for the open counter-part, the nattch counter
+	 * is never increased, thus we can safely return.
+	 */
+	if (IS_ERR(shp))
+		goto done; /* no-op */
+
+	/* Remove from the list of attaches of the shm segment */
  	shp->shm_lprid = task_tgid_vnr(current);
  	shp->shm_dtim = get_seconds();
  	shp->shm_nattch--;
@@ -267,6 +285,7 @@ static void shm_close(struct vm_area_struct *vma)
  		shm_destroy(ns, shp);
  	else
  		shm_unlock(shp);
+done:
  	up_write(&shm_ids(ns).rwsem);
  }
  
@@ -383,14 +402,50 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
  }
  #endif
  
+static inline int shm_check_vma_validity(struct vm_area_struct *vma)
+{
+	struct file *file = vma->vm_file;
+	struct shm_file_data *sfd = shm_file_data(file);
+	struct ipc_ids *ids = &shm_ids(sfd->ns);
+	struct kern_ipc_perm *shp;
+	int ret = 0;
+
+	rcu_read_lock();
+	shp = ipc_obtain_object_idr(ids, sfd->id);
+	if (IS_ERR(shp)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!ipc_valid_object(shp)) {
+		ret = -EIDRM;
+		goto out;
+	}
+out:
+	rcu_read_unlock();
+	return ret;
+}
+
  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
  {
  	struct shm_file_data *sfd = shm_file_data(file);
  	int ret;
  
+	/*
+	 * Ensure that we have not raced with IPC_RMID, such that
+	 * we avoid doing the ->mmap altogether. This is a preventive
+	 * lockless check, and thus exposed to races during the mmap.
+	 * However, this is later caught in shm_open(), and handled
+	 * accordingly.
+	 */
+	ret = shm_check_vma_validity(vma);
+	if (ret)
+		return ret;
+
  	ret = sfd->file->f_op->mmap(sfd->file, vma);
  	if (ret != 0)
  		return ret;
+
  	sfd->vm_ops = vma->vm_ops;
  #ifdef CONFIG_MMU
  	WARN_ON(!sfd->vm_ops->fault);
@@ -1193,6 +1248,19 @@ out_fput:
  out_nattch:
  	down_write(&shm_ids(ns).rwsem);
  	shp = shm_lock(ns, shmid);
+	if (unlikely(IS_ERR(shp))) {
+		up_write(&shm_ids(ns).rwsem);
+		err = PTR_ERR(shp);
+		/*
+		 * Before dropping the lock, nattch was incremented,
+		 * thus we cannot race with IPC_RMID (ipc object is
+		 * marked IPC_PRIVATE). As such, this scenario should
+		 * _never_ occur.
+		 */
+		 WARN_ON(1);
+		 goto out;
+	}
+
  	shp->shm_nattch--;
  	if (shm_may_destroy(ns, shp))
  		shm_destroy(ns, shp);
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-13  3:18         ` Davidlohr Bueso
@ 2015-10-13 12:30           ` Kirill A. Shutemov
  2015-10-29 15:33             ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-10-13 12:30 UTC (permalink / raw)
  To: Dmitry Vyukov, Andrew Morton, dave.hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner, aarcange,
	linux-mm, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin,
	Manfred Spraul

On Mon, Oct 12, 2015 at 08:18:21PM -0700, Davidlohr Bueso wrote:
> On Mon, 12 Oct 2015, Bueso wrote:
> 
> >On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
> >
> >>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
> >>>diff --git a/ipc/shm.c b/ipc/shm.c
> >>>index 4178727..9615f19 100644
> >>>--- a/ipc/shm.c
> >>>+++ b/ipc/shm.c
> >>>@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
> >>>static int shm_mmap(struct file *file, struct vm_area_struct *vma)
> >>>{
> >>>-	struct shm_file_data *sfd = shm_file_data(file);
> >>>+	struct file *vma_file = vma->vm_file;
> >>>+	struct shm_file_data *sfd = shm_file_data(vma_file);
> >>>+	struct ipc_ids *ids = &shm_ids(sfd->ns);
> >>>+	struct kern_ipc_perm *shp;
> >>>	int ret;
> >>>+	rcu_read_lock();
> >>>+	shp = ipc_obtain_object_check(ids, sfd->id);
> >>>+	if (IS_ERR(shp)) {
> >>>+		ret = -EINVAL;
> >>>+		goto err;
> >>>+	}
> >>>+
> >>>+	if (!ipc_valid_object(shp)) {
> >>>+		ret = -EIDRM;
> >>>+		goto err;
> >>>+	}
> >>>+	rcu_read_unlock();
> >>>+
> >>
> >>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
> >
> >Nothing, but that is later caught by shm_open() doing similar checks. We
> >basically end up doing a check between ->mmap() calls, which is fair imho.
> >Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
> >and we try to respect it -- thus you can argue this race anywhere, which is
> >why we have EIDRM/EINVL. Ultimately the user should not be doing such hacks
> >_anyway_. So I'm not really concerned about it.
> >
> >Another similar alternative would be perhaps to make shm_lock() return an
> >error, and thus propagate that error to mmap return. That way we would have
> >a silent way out of the warning scenario (afterward we cannot race as we
> >hold the ipc object lock). However, the users would now have to take this
> >into account...
> >
> >    [validity check lockless]
> >    ->mmap()
> >    [validity check lock]
> 
> Something like this, maybe. Although I could easily be missing things...
> I've tested it enough to see Dimitry's testcase handled ok, and put it
> through ltp. Also adding Manfred to the Cc, who always catches my idiotic
> mistakes.
> 
> 8<---------------------------------------------------------------------
> From: Davidlohr Bueso <dave@stgolabs.net>
> Date: Mon, 12 Oct 2015 19:38:34 -0700
> Subject: [PATCH] ipc/shm: fix handling of (re)attaching to a deleted segment
> 
> There are currently two issues when dealing with segments that are
> marked for deletion:
> 
> (i) With d0edd8528362 (ipc: convert invalid scenarios to use WARN_ON)
> we relaxed the system-wide impact of using a deleted segment. However,
> we can now perfectly well trigger the warning and then deference a nil
> pointer -- where shp does not exist.
> 
> (ii) As of a399b29dfbaa (ipc,shm: fix shm_file deletion races) we
> forbid attaching/mapping a previously deleted segment; a feature once
> unique to Linux, but removed[1] as a side effect of lockless ipc object
> lookups and security checks. Similarly, Dmitry Vyukov reported[2] a
> simple test case that creates a new vma for a previously deleted
> segment, triggering the WARN_ON mentioned in (i).
> 
> This patch tries to address (i) by moving the shp error check out
> of shm_lock() and handled by the caller instead. The benefit of this
> is that it allows better handling out of situations where we end up
> returning ERMID or EINVAL. Specifically, there are three callers
> of shm_lock which we must look into:
> 
>  - open/close -- which we ensure to never do any operations on
>                  the pairs, thus becoming no-ops if found a prev
> 		 IPC_RMID.
> 
>  - loosing the reference of nattch upon shmat(2) -- not feasible.
> 
> In addition, the common WARN_ON call is technically removed, but
> we add a new one for the bogus shmat(2) case, which is definitely
> unacceptable to race with RMID if nattch is bumped up.
> 
> To address (ii), a new shm_check_vma_validity() helper is added
> (for lack of a better name), which attempts to detect early on
> any races with RMID, before doing the full ->mmap. There is still
> a window between the callback and the shm_open call where we can
> race with IPC_RMID. If this is the case, it is handled by the next
> shm_lock().
> 
> shm_mmap:
>     [shm validity checks lockless]
>     ->mmap()
>     [shm validity checks lock] <-- at this point there after there
>                                    is no race as we hold the ipc
>                                    object lock.
> 
> [1] https://lkml.org/lkml/2015/10/12/483
> [2] https://lkml.org/lkml/2015/10/12/284
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>  ipc/shm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 73 insertions(+), 5 deletions(-)
> 
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 4178727..47a7a67 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -156,11 +156,10 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
>  	struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
>  	/*
> -	 * We raced in the idr lookup or with shm_destroy().  Either way, the
> -	 * ID is busted.
> +	 * Callers of shm_lock() must validate the status of the returned
> +	 * ipc object pointer (as returned by ipc_lock()), and error out as
> +	 * appropriate.
>  	 */
> -	WARN_ON(IS_ERR(ipcp));
> -
>  	return container_of(ipcp, struct shmid_kernel, shm_perm);
>  }
> @@ -194,6 +193,15 @@ static void shm_open(struct vm_area_struct *vma)
>  	struct shmid_kernel *shp;
>  	shp = shm_lock(sfd->ns, sfd->id);
> +	/*
> +	 * We raced in the idr lookup or with shm_destroy().
> +	 * Either way, the ID is busted. In the same scenario,
> +	 * but for the close counter-part, the nattch counter
> +	 * is never decreased, thus we can safely return.
> +	 */
> +	if (IS_ERR(shp))
> +		return; /* no-op */
> +
>  	shp->shm_atim = get_seconds();
>  	shp->shm_lprid = task_tgid_vnr(current);
>  	shp->shm_nattch++;

...

>  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>  {
>  	struct shm_file_data *sfd = shm_file_data(file);
>  	int ret;
> +	/*
> +	 * Ensure that we have not raced with IPC_RMID, such that
> +	 * we avoid doing the ->mmap altogether. This is a preventive
> +	 * lockless check, and thus exposed to races during the mmap.
> +	 * However, this is later caught in shm_open(), and handled
> +	 * accordingly.
> +	 */
> +	ret = shm_check_vma_validity(vma);
> +	if (ret)
> +		return ret;
> +
>  	ret = sfd->file->f_op->mmap(sfd->file, vma);
>  	if (ret != 0)
>  		return ret;
> +
>  	sfd->vm_ops = vma->vm_ops;
>  #ifdef CONFIG_MMU
>  	WARN_ON(!sfd->vm_ops->fault);

If I read it correctly, with the patch we would ignore locking failure
inside shm_open() and mmap will succeed in this case. So the idea is to
have shm_close() no-op and therefore symmetrical. That's look fragile to
me. We would silently miss some other broken open/close pattern.

I would rather propagate error to shm_mmap() caller and therefore to
userspace. I guess it's better to opencode shm_open() in shm_mmap() and
return error this way. shm_open() itself can have WARN_ON_ONCE() for
failure or something.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-13 12:30           ` Kirill A. Shutemov
@ 2015-10-29 15:33             ` Dmitry Vyukov
  2015-11-05 14:23               ` Kirill A. Shutemov
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2015-10-29 15:33 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, dave.hansen, Hugh Dickins, Joe Perches, sds,
	Oleg Nesterov, Kirill A. Shutemov, Rik van Riel, mhocko,
	gang.chen.5i5j, Peter Feiner, aarcange, linux-mm, LKML,
	syzkaller, Kostya Serebryany, Alexander Potapenko,
	Andrey Konovalov, Sasha Levin, Manfred Spraul

On Tue, Oct 13, 2015 at 8:30 PM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Mon, Oct 12, 2015 at 08:18:21PM -0700, Davidlohr Bueso wrote:
>> On Mon, 12 Oct 2015, Bueso wrote:
>>
>> >On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
>> >
>> >>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
>> >>>diff --git a/ipc/shm.c b/ipc/shm.c
>> >>>index 4178727..9615f19 100644
>> >>>--- a/ipc/shm.c
>> >>>+++ b/ipc/shm.c
>> >>>@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
>> >>>static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>> >>>{
>> >>>-  struct shm_file_data *sfd = shm_file_data(file);
>> >>>+  struct file *vma_file = vma->vm_file;
>> >>>+  struct shm_file_data *sfd = shm_file_data(vma_file);
>> >>>+  struct ipc_ids *ids = &shm_ids(sfd->ns);
>> >>>+  struct kern_ipc_perm *shp;
>> >>>   int ret;
>> >>>+  rcu_read_lock();
>> >>>+  shp = ipc_obtain_object_check(ids, sfd->id);
>> >>>+  if (IS_ERR(shp)) {
>> >>>+          ret = -EINVAL;
>> >>>+          goto err;
>> >>>+  }
>> >>>+
>> >>>+  if (!ipc_valid_object(shp)) {
>> >>>+          ret = -EIDRM;
>> >>>+          goto err;
>> >>>+  }
>> >>>+  rcu_read_unlock();
>> >>>+
>> >>
>> >>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
>> >
>> >Nothing, but that is later caught by shm_open() doing similar checks. We
>> >basically end up doing a check between ->mmap() calls, which is fair imho.
>> >Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
>> >and we try to respect it -- thus you can argue this race anywhere, which is
>> >why we have EIDRM/EINVL. Ultimately the user should not be doing such hacks
>> >_anyway_. So I'm not really concerned about it.
>> >
>> >Another similar alternative would be perhaps to make shm_lock() return an
>> >error, and thus propagate that error to mmap return. That way we would have
>> >a silent way out of the warning scenario (afterward we cannot race as we
>> >hold the ipc object lock). However, the users would now have to take this
>> >into account...
>> >
>> >    [validity check lockless]
>> >    ->mmap()
>> >    [validity check lock]
>>
>> Something like this, maybe. Although I could easily be missing things...
>> I've tested it enough to see Dimitry's testcase handled ok, and put it
>> through ltp. Also adding Manfred to the Cc, who always catches my idiotic
>> mistakes.
>>
>> 8<---------------------------------------------------------------------
>> From: Davidlohr Bueso <dave@stgolabs.net>
>> Date: Mon, 12 Oct 2015 19:38:34 -0700
>> Subject: [PATCH] ipc/shm: fix handling of (re)attaching to a deleted segment
>>
>> There are currently two issues when dealing with segments that are
>> marked for deletion:
>>
>> (i) With d0edd8528362 (ipc: convert invalid scenarios to use WARN_ON)
>> we relaxed the system-wide impact of using a deleted segment. However,
>> we can now perfectly well trigger the warning and then deference a nil
>> pointer -- where shp does not exist.
>>
>> (ii) As of a399b29dfbaa (ipc,shm: fix shm_file deletion races) we
>> forbid attaching/mapping a previously deleted segment; a feature once
>> unique to Linux, but removed[1] as a side effect of lockless ipc object
>> lookups and security checks. Similarly, Dmitry Vyukov reported[2] a
>> simple test case that creates a new vma for a previously deleted
>> segment, triggering the WARN_ON mentioned in (i).
>>
>> This patch tries to address (i) by moving the shp error check out
>> of shm_lock() and handled by the caller instead. The benefit of this
>> is that it allows better handling out of situations where we end up
>> returning ERMID or EINVAL. Specifically, there are three callers
>> of shm_lock which we must look into:
>>
>>  - open/close -- which we ensure to never do any operations on
>>                  the pairs, thus becoming no-ops if found a prev
>>                IPC_RMID.
>>
>>  - loosing the reference of nattch upon shmat(2) -- not feasible.
>>
>> In addition, the common WARN_ON call is technically removed, but
>> we add a new one for the bogus shmat(2) case, which is definitely
>> unacceptable to race with RMID if nattch is bumped up.
>>
>> To address (ii), a new shm_check_vma_validity() helper is added
>> (for lack of a better name), which attempts to detect early on
>> any races with RMID, before doing the full ->mmap. There is still
>> a window between the callback and the shm_open call where we can
>> race with IPC_RMID. If this is the case, it is handled by the next
>> shm_lock().
>>
>> shm_mmap:
>>     [shm validity checks lockless]
>>     ->mmap()
>>     [shm validity checks lock] <-- at this point there after there
>>                                    is no race as we hold the ipc
>>                                    object lock.
>>
>> [1] https://lkml.org/lkml/2015/10/12/483
>> [2] https://lkml.org/lkml/2015/10/12/284
>>
>> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
>> ---
>>  ipc/shm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 73 insertions(+), 5 deletions(-)
>>
>> diff --git a/ipc/shm.c b/ipc/shm.c
>> index 4178727..47a7a67 100644
>> --- a/ipc/shm.c
>> +++ b/ipc/shm.c
>> @@ -156,11 +156,10 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
>>       struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
>>       /*
>> -      * We raced in the idr lookup or with shm_destroy().  Either way, the
>> -      * ID is busted.
>> +      * Callers of shm_lock() must validate the status of the returned
>> +      * ipc object pointer (as returned by ipc_lock()), and error out as
>> +      * appropriate.
>>        */
>> -     WARN_ON(IS_ERR(ipcp));
>> -
>>       return container_of(ipcp, struct shmid_kernel, shm_perm);
>>  }
>> @@ -194,6 +193,15 @@ static void shm_open(struct vm_area_struct *vma)
>>       struct shmid_kernel *shp;
>>       shp = shm_lock(sfd->ns, sfd->id);
>> +     /*
>> +      * We raced in the idr lookup or with shm_destroy().
>> +      * Either way, the ID is busted. In the same scenario,
>> +      * but for the close counter-part, the nattch counter
>> +      * is never decreased, thus we can safely return.
>> +      */
>> +     if (IS_ERR(shp))
>> +             return; /* no-op */
>> +
>>       shp->shm_atim = get_seconds();
>>       shp->shm_lprid = task_tgid_vnr(current);
>>       shp->shm_nattch++;
>
> ...
>
>>  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>>  {
>>       struct shm_file_data *sfd = shm_file_data(file);
>>       int ret;
>> +     /*
>> +      * Ensure that we have not raced with IPC_RMID, such that
>> +      * we avoid doing the ->mmap altogether. This is a preventive
>> +      * lockless check, and thus exposed to races during the mmap.
>> +      * However, this is later caught in shm_open(), and handled
>> +      * accordingly.
>> +      */
>> +     ret = shm_check_vma_validity(vma);
>> +     if (ret)
>> +             return ret;
>> +
>>       ret = sfd->file->f_op->mmap(sfd->file, vma);
>>       if (ret != 0)
>>               return ret;
>> +
>>       sfd->vm_ops = vma->vm_ops;
>>  #ifdef CONFIG_MMU
>>       WARN_ON(!sfd->vm_ops->fault);
>
> If I read it correctly, with the patch we would ignore locking failure
> inside shm_open() and mmap will succeed in this case. So the idea is to
> have shm_close() no-op and therefore symmetrical. That's look fragile to
> me. We would silently miss some other broken open/close pattern.
>
> I would rather propagate error to shm_mmap() caller and therefore to
> userspace. I guess it's better to opencode shm_open() in shm_mmap() and
> return error this way. shm_open() itself can have WARN_ON_ONCE() for
> failure or something.


Davidlohr, any updates on this? Is it committed? I don't see it in Linus tree.
What do you think about Kirill's comments?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-10-29 15:33             ` Dmitry Vyukov
@ 2015-11-05 14:23               ` Kirill A. Shutemov
  2015-12-21 15:44                 ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-11-05 14:23 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Kirill A. Shutemov, Andrew Morton, dave.hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner, aarcange,
	linux-mm, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin,
	Manfred Spraul

Dmitry Vyukov wrote:
> On Tue, Oct 13, 2015 at 8:30 PM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> > On Mon, Oct 12, 2015 at 08:18:21PM -0700, Davidlohr Bueso wrote:
> >> On Mon, 12 Oct 2015, Bueso wrote:
> >>
> >> >On Mon, 12 Oct 2015, Kirill A. Shutemov wrote:
> >> >
> >> >>On Mon, Oct 12, 2015 at 10:49:45AM -0700, Davidlohr Bueso wrote:
> >> >>>diff --git a/ipc/shm.c b/ipc/shm.c
> >> >>>index 4178727..9615f19 100644
> >> >>>--- a/ipc/shm.c
> >> >>>+++ b/ipc/shm.c
> >> >>>@@ -385,9 +385,25 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
> >> >>>static int shm_mmap(struct file *file, struct vm_area_struct *vma)
> >> >>>{
> >> >>>-  struct shm_file_data *sfd = shm_file_data(file);
> >> >>>+  struct file *vma_file = vma->vm_file;
> >> >>>+  struct shm_file_data *sfd = shm_file_data(vma_file);
> >> >>>+  struct ipc_ids *ids = &shm_ids(sfd->ns);
> >> >>>+  struct kern_ipc_perm *shp;
> >> >>>   int ret;
> >> >>>+  rcu_read_lock();
> >> >>>+  shp = ipc_obtain_object_check(ids, sfd->id);
> >> >>>+  if (IS_ERR(shp)) {
> >> >>>+          ret = -EINVAL;
> >> >>>+          goto err;
> >> >>>+  }
> >> >>>+
> >> >>>+  if (!ipc_valid_object(shp)) {
> >> >>>+          ret = -EIDRM;
> >> >>>+          goto err;
> >> >>>+  }
> >> >>>+  rcu_read_unlock();
> >> >>>+
> >> >>
> >> >>Hm. Isn't it racy? What prevents IPC_RMID from happening after this point?
> >> >
> >> >Nothing, but that is later caught by shm_open() doing similar checks. We
> >> >basically end up doing a check between ->mmap() calls, which is fair imho.
> >> >Note that this can occur anywhere in ipc as IPC_RMID is a user request/cmd,
> >> >and we try to respect it -- thus you can argue this race anywhere, which is
> >> >why we have EIDRM/EINVL. Ultimately the user should not be doing such hacks
> >> >_anyway_. So I'm not really concerned about it.
> >> >
> >> >Another similar alternative would be perhaps to make shm_lock() return an
> >> >error, and thus propagate that error to mmap return. That way we would have
> >> >a silent way out of the warning scenario (afterward we cannot race as we
> >> >hold the ipc object lock). However, the users would now have to take this
> >> >into account...
> >> >
> >> >    [validity check lockless]
> >> >    ->mmap()
> >> >    [validity check lock]
> >>
> >> Something like this, maybe. Although I could easily be missing things...
> >> I've tested it enough to see Dimitry's testcase handled ok, and put it
> >> through ltp. Also adding Manfred to the Cc, who always catches my idiotic
> >> mistakes.
> >>
> >> 8<---------------------------------------------------------------------
> >> From: Davidlohr Bueso <dave@stgolabs.net>
> >> Date: Mon, 12 Oct 2015 19:38:34 -0700
> >> Subject: [PATCH] ipc/shm: fix handling of (re)attaching to a deleted segment
> >>
> >> There are currently two issues when dealing with segments that are
> >> marked for deletion:
> >>
> >> (i) With d0edd8528362 (ipc: convert invalid scenarios to use WARN_ON)
> >> we relaxed the system-wide impact of using a deleted segment. However,
> >> we can now perfectly well trigger the warning and then deference a nil
> >> pointer -- where shp does not exist.
> >>
> >> (ii) As of a399b29dfbaa (ipc,shm: fix shm_file deletion races) we
> >> forbid attaching/mapping a previously deleted segment; a feature once
> >> unique to Linux, but removed[1] as a side effect of lockless ipc object
> >> lookups and security checks. Similarly, Dmitry Vyukov reported[2] a
> >> simple test case that creates a new vma for a previously deleted
> >> segment, triggering the WARN_ON mentioned in (i).
> >>
> >> This patch tries to address (i) by moving the shp error check out
> >> of shm_lock() and handled by the caller instead. The benefit of this
> >> is that it allows better handling out of situations where we end up
> >> returning ERMID or EINVAL. Specifically, there are three callers
> >> of shm_lock which we must look into:
> >>
> >>  - open/close -- which we ensure to never do any operations on
> >>                  the pairs, thus becoming no-ops if found a prev
> >>                IPC_RMID.
> >>
> >>  - loosing the reference of nattch upon shmat(2) -- not feasible.
> >>
> >> In addition, the common WARN_ON call is technically removed, but
> >> we add a new one for the bogus shmat(2) case, which is definitely
> >> unacceptable to race with RMID if nattch is bumped up.
> >>
> >> To address (ii), a new shm_check_vma_validity() helper is added
> >> (for lack of a better name), which attempts to detect early on
> >> any races with RMID, before doing the full ->mmap. There is still
> >> a window between the callback and the shm_open call where we can
> >> race with IPC_RMID. If this is the case, it is handled by the next
> >> shm_lock().
> >>
> >> shm_mmap:
> >>     [shm validity checks lockless]
> >>     ->mmap()
> >>     [shm validity checks lock] <-- at this point there after there
> >>                                    is no race as we hold the ipc
> >>                                    object lock.
> >>
> >> [1] https://lkml.org/lkml/2015/10/12/483
> >> [2] https://lkml.org/lkml/2015/10/12/284
> >>
> >> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> >> ---
> >>  ipc/shm.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
> >>  1 file changed, 73 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/ipc/shm.c b/ipc/shm.c
> >> index 4178727..47a7a67 100644
> >> --- a/ipc/shm.c
> >> +++ b/ipc/shm.c
> >> @@ -156,11 +156,10 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
> >>       struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
> >>       /*
> >> -      * We raced in the idr lookup or with shm_destroy().  Either way, the
> >> -      * ID is busted.
> >> +      * Callers of shm_lock() must validate the status of the returned
> >> +      * ipc object pointer (as returned by ipc_lock()), and error out as
> >> +      * appropriate.
> >>        */
> >> -     WARN_ON(IS_ERR(ipcp));
> >> -
> >>       return container_of(ipcp, struct shmid_kernel, shm_perm);
> >>  }
> >> @@ -194,6 +193,15 @@ static void shm_open(struct vm_area_struct *vma)
> >>       struct shmid_kernel *shp;
> >>       shp = shm_lock(sfd->ns, sfd->id);
> >> +     /*
> >> +      * We raced in the idr lookup or with shm_destroy().
> >> +      * Either way, the ID is busted. In the same scenario,
> >> +      * but for the close counter-part, the nattch counter
> >> +      * is never decreased, thus we can safely return.
> >> +      */
> >> +     if (IS_ERR(shp))
> >> +             return; /* no-op */
> >> +
> >>       shp->shm_atim = get_seconds();
> >>       shp->shm_lprid = task_tgid_vnr(current);
> >>       shp->shm_nattch++;
> >
> > ...
> >
> >>  static int shm_mmap(struct file *file, struct vm_area_struct *vma)
> >>  {
> >>       struct shm_file_data *sfd = shm_file_data(file);
> >>       int ret;
> >> +     /*
> >> +      * Ensure that we have not raced with IPC_RMID, such that
> >> +      * we avoid doing the ->mmap altogether. This is a preventive
> >> +      * lockless check, and thus exposed to races during the mmap.
> >> +      * However, this is later caught in shm_open(), and handled
> >> +      * accordingly.
> >> +      */
> >> +     ret = shm_check_vma_validity(vma);
> >> +     if (ret)
> >> +             return ret;
> >> +
> >>       ret = sfd->file->f_op->mmap(sfd->file, vma);
> >>       if (ret != 0)
> >>               return ret;
> >> +
> >>       sfd->vm_ops = vma->vm_ops;
> >>  #ifdef CONFIG_MMU
> >>       WARN_ON(!sfd->vm_ops->fault);
> >
> > If I read it correctly, with the patch we would ignore locking failure
> > inside shm_open() and mmap will succeed in this case. So the idea is to
> > have shm_close() no-op and therefore symmetrical. That's look fragile to
> > me. We would silently miss some other broken open/close pattern.
> >
> > I would rather propagate error to shm_mmap() caller and therefore to
> > userspace. I guess it's better to opencode shm_open() in shm_mmap() and
> > return error this way. shm_open() itself can have WARN_ON_ONCE() for
> > failure or something.
> 
> 
> Davidlohr, any updates on this? Is it committed? I don't see it in Linus tree.
> What do you think about Kirill's comments?

What about this:

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-11-05 14:23               ` Kirill A. Shutemov
@ 2015-12-21 15:44                 ` Dmitry Vyukov
  2016-01-02 11:33                   ` Manfred Spraul
  2016-02-02  3:25                   ` Andrew Morton
  0 siblings, 2 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2015-12-21 15:44 UTC (permalink / raw)
  To: syzkaller
  Cc: Kirill A. Shutemov, Andrew Morton, Dave Hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner,
	Andrea Arcangeli, linux-mm, LKML, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin,
	Manfred Spraul

On Thu, Nov 5, 2015 at 3:23 PM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> What about this:


Ping. This is still happening for me on tip. Can we pull in this fix
if it looks good to everybody?


> From 06b0fc9d62592f6f3ad9f45cccf1f6a5b3113bdc Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Thu, 5 Nov 2015 15:53:04 +0200
> Subject: [PATCH] ipc/shm: handle removed segments gracefully in shm_mmap()
>
> remap_file_pages(2) emulation can reach file which represents removed
> IPC ID as long as a memory segment is mapped. It breaks expectations
> of IPC subsystem.
>
> Test case (rewritten to be more human readable, originally autogenerated
> by syzkaller[1]):
>
>         #define _GNU_SOURCE
>         #include <stdlib.h>
>         #include <sys/ipc.h>
>         #include <sys/mman.h>
>         #include <sys/shm.h>
>
>         #define PAGE_SIZE 4096
>
>         int main()
>         {
>                 int id;
>                 void *p;
>
>                 id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
>                 p = shmat(id, NULL, 0);
>                 shmctl(id, IPC_RMID, NULL);
>                 remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);
>
>                 return 0;
>         }
>
> The patch changes shm_mmap() and code around shm_lock() to propagate
> locking error back to caller of shm_mmap().
>
> [1] http://github.com/google/syzkaller
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  ipc/shm.c | 53 +++++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 43 insertions(+), 10 deletions(-)
>
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 41787276e141..3174634ca4e5 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -156,11 +156,12 @@ static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
>         struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
>
>         /*
> -        * We raced in the idr lookup or with shm_destroy().  Either way, the
> -        * ID is busted.
> +        * Callers of shm_lock() must validate the status of the returned ipc
> +        * object pointer (as returned by ipc_lock()), and error out as
> +        * appropriate.
>          */
> -       WARN_ON(IS_ERR(ipcp));
> -
> +       if (IS_ERR(ipcp))
> +               return (void *)ipcp;
>         return container_of(ipcp, struct shmid_kernel, shm_perm);
>  }
>
> @@ -186,18 +187,33 @@ static inline void shm_rmid(struct ipc_namespace *ns, struct shmid_kernel *s)
>  }
>
>
> -/* This is called by fork, once for every shm attach. */
> -static void shm_open(struct vm_area_struct *vma)
> +static int __shm_open(struct vm_area_struct *vma)
>  {
>         struct file *file = vma->vm_file;
>         struct shm_file_data *sfd = shm_file_data(file);
>         struct shmid_kernel *shp;
>
>         shp = shm_lock(sfd->ns, sfd->id);
> +
> +       if (IS_ERR(shp))
> +               return PTR_ERR(shp);
> +
>         shp->shm_atim = get_seconds();
>         shp->shm_lprid = task_tgid_vnr(current);
>         shp->shm_nattch++;
>         shm_unlock(shp);
> +       return 0;
> +}
> +
> +/* This is called by fork, once for every shm attach. */
> +static void shm_open(struct vm_area_struct *vma)
> +{
> +       int err = __shm_open(vma);
> +       /*
> +        * We raced in the idr lookup or with shm_destroy().
> +        * Either way, the ID is busted.
> +        */
> +       WARN_ON_ONCE(err);
>  }
>
>  /*
> @@ -260,6 +276,14 @@ static void shm_close(struct vm_area_struct *vma)
>         down_write(&shm_ids(ns).rwsem);
>         /* remove from the list of attaches of the shm segment */
>         shp = shm_lock(ns, sfd->id);
> +
> +       /*
> +        * We raced in the idr lookup or with shm_destroy().
> +        * Either way, the ID is busted.
> +        */
> +       if (WARN_ON_ONCE(IS_ERR(shp)))
> +               goto done; /* no-op */
> +
>         shp->shm_lprid = task_tgid_vnr(current);
>         shp->shm_dtim = get_seconds();
>         shp->shm_nattch--;
> @@ -267,6 +291,7 @@ static void shm_close(struct vm_area_struct *vma)
>                 shm_destroy(ns, shp);
>         else
>                 shm_unlock(shp);
> +done:
>         up_write(&shm_ids(ns).rwsem);
>  }
>
> @@ -388,17 +413,25 @@ static int shm_mmap(struct file *file, struct vm_area_struct *vma)
>         struct shm_file_data *sfd = shm_file_data(file);
>         int ret;
>
> +       /*
> +        * In case of remap_file_pages() emulation, the file can represent
> +        * removed IPC ID: propogate shm_lock() error to caller.
> +        */
> +       ret =__shm_open(vma);
> +       if (ret)
> +               return ret;
> +
>         ret = sfd->file->f_op->mmap(sfd->file, vma);
> -       if (ret != 0)
> +       if (ret) {
> +               shm_close(vma);
>                 return ret;
> +       }
>         sfd->vm_ops = vma->vm_ops;
>  #ifdef CONFIG_MMU
>         WARN_ON(!sfd->vm_ops->fault);
>  #endif
>         vma->vm_ops = &shm_vm_ops;
> -       shm_open(vma);
> -
> -       return ret;
> +       return 0;
>  }
>
>  static int shm_release(struct inode *ino, struct file *file)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-12-21 15:44                 ` Dmitry Vyukov
@ 2016-01-02 11:33                   ` Manfred Spraul
  2016-01-02 12:19                     ` Dmitry Vyukov
  2016-02-02  3:25                   ` Andrew Morton
  1 sibling, 1 reply; 17+ messages in thread
From: Manfred Spraul @ 2016-01-02 11:33 UTC (permalink / raw)
  To: Dmitry Vyukov, syzkaller
  Cc: Kirill A. Shutemov, Andrew Morton, Dave Hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner,
	Andrea Arcangeli, linux-mm, LKML, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin

Hi Dmitry,

shm locking differs too much from msg/sem locking, I never looked at it 
in depth, so I'm not able to perform a proper review.

Except for the obvious: Races that can be triggered from user space are 
inacceptable.
Regardless if there is a BUG_ON, a WARN_ON or nothing at all.

On 12/21/2015 04:44 PM, Dmitry Vyukov wrote:
>> +
>> +/* This is called by fork, once for every shm attach. */
>> +static void shm_open(struct vm_area_struct *vma)
>> +{
>> +       int err = __shm_open(vma);
>> +       /*
>> +        * We raced in the idr lookup or with shm_destroy().
>> +        * Either way, the ID is busted.
>> +        */
>> +       WARN_ON_ONCE(err);
>>   }
Is it possible to trigger this race? Parallel IPC_RMID & fork()?

--
     Manfred

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2016-01-02 11:33                   ` Manfred Spraul
@ 2016-01-02 12:19                     ` Dmitry Vyukov
  2016-01-02 15:58                       ` Manfred Spraul
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2016-01-02 12:19 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: syzkaller, Kirill A. Shutemov, Andrew Morton, Dave Hansen,
	Hugh Dickins, Joe Perches, sds, Oleg Nesterov,
	Kirill A. Shutemov, Rik van Riel, mhocko, gang.chen.5i5j,
	Peter Feiner, Andrea Arcangeli, linux-mm, LKML,
	Kostya Serebryany, Alexander Potapenko, Andrey Konovalov,
	Sasha Levin

On Sat, Jan 2, 2016 at 12:33 PM, Manfred Spraul
<manfred@colorfullife.com> wrote:
> Hi Dmitry,
>
> shm locking differs too much from msg/sem locking, I never looked at it in
> depth, so I'm not able to perform a proper review.
>
> Except for the obvious: Races that can be triggered from user space are
> inacceptable.
> Regardless if there is a BUG_ON, a WARN_ON or nothing at all.
>
> On 12/21/2015 04:44 PM, Dmitry Vyukov wrote:
>>>
>>> +
>>> +/* This is called by fork, once for every shm attach. */
>>> +static void shm_open(struct vm_area_struct *vma)
>>> +{
>>> +       int err = __shm_open(vma);
>>> +       /*
>>> +        * We raced in the idr lookup or with shm_destroy().
>>> +        * Either way, the ID is busted.
>>> +        */
>>> +       WARN_ON_ONCE(err);
>>>   }
>
> Is it possible to trigger this race? Parallel IPC_RMID & fork()?

Hi Manfred,

As far as I see my reproducer triggers exactly this warning (and later a crash).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2016-01-02 12:19                     ` Dmitry Vyukov
@ 2016-01-02 15:58                       ` Manfred Spraul
  0 siblings, 0 replies; 17+ messages in thread
From: Manfred Spraul @ 2016-01-02 15:58 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller, Kirill A. Shutemov, Andrew Morton, Dave Hansen,
	Hugh Dickins, Joe Perches, sds, Oleg Nesterov,
	Kirill A. Shutemov, Rik van Riel, mhocko, gang.chen.5i5j,
	Peter Feiner, Andrea Arcangeli, linux-mm, LKML,
	Kostya Serebryany, Alexander Potapenko, Andrey Konovalov,
	Sasha Levin

Hi Dmitry,

On 01/02/2016 01:19 PM, Dmitry Vyukov wrote:
> On Sat, Jan 2, 2016 at 12:33 PM, Manfred Spraul
> <manfred@colorfullife.com> wrote:
>> Hi Dmitry,
>>
>> shm locking differs too much from msg/sem locking, I never looked at it in
>> depth, so I'm not able to perform a proper review.
>>
>> Except for the obvious: Races that can be triggered from user space are
>> inacceptable.
>> Regardless if there is a BUG_ON, a WARN_ON or nothing at all.
>>
>> On 12/21/2015 04:44 PM, Dmitry Vyukov wrote:
>>>> +
>>>> +/* This is called by fork, once for every shm attach. */
>>>> +static void shm_open(struct vm_area_struct *vma)
>>>> +{
>>>> +       int err = __shm_open(vma);
>>>> +       /*
>>>> +        * We raced in the idr lookup or with shm_destroy().
>>>> +        * Either way, the ID is busted.
>>>> +        */
>>>> +       WARN_ON_ONCE(err);
>>>>    }
>> Is it possible to trigger this race? Parallel IPC_RMID & fork()?
> Hi Manfred,
>
> As far as I see my reproducer triggers exactly this warning (and later a crash).
Do I understand it right, shm_open() is also called by remap()?
Then please update the comment above shm_open().

And: If this is something that userspace can trigger, why a WARN_ON_ONCE()?
If the WARN_ON doesn't indicate a bug, then I would remove it entirely.

--
     Manfred

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2015-12-21 15:44                 ` Dmitry Vyukov
  2016-01-02 11:33                   ` Manfred Spraul
@ 2016-02-02  3:25                   ` Andrew Morton
  2016-02-02 21:32                     ` Dmitry Vyukov
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2016-02-02  3:25 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller, Kirill A. Shutemov, Dave Hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, mhocko, gang.chen.5i5j, Peter Feiner,
	Andrea Arcangeli, linux-mm, LKML, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin,
	Manfred Spraul

On Mon, 21 Dec 2015 16:44:34 +0100 Dmitry Vyukov <dvyukov@google.com> wrote:

> On Thu, Nov 5, 2015 at 3:23 PM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> > What about this:
> 
> 
> Ping. This is still happening for me on tip. Can we pull in this fix
> if it looks good to everybody?
> 

Well we have at least three patches to choose from in this thread, most
of them missing most signs of having been reviewed or tested.

So I grabbed the last one, below.  Can we please rev this up again,
test it, see if we can agree that this is the way to go forward?

Thanks.



From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: ipc/shm: handle removed segments gracefully in shm_mmap()

remap_file_pages(2) emulation can reach file which represents removed IPC
ID as long as a memory segment is mapped.  It breaks expectations of IPC
subsystem.

Test case (rewritten to be more human readable, originally autogenerated
by syzkaller[1]):

	#define _GNU_SOURCE
	#include <stdlib.h>
	#include <sys/ipc.h>
	#include <sys/mman.h>
	#include <sys/shm.h>

	#define PAGE_SIZE 4096

	int main()
	{
		int id;
		void *p;

		id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
		p = shmat(id, NULL, 0);
		shmctl(id, IPC_RMID, NULL);
		remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

	        return 0;
	}

The patch changes shm_mmap() and code around shm_lock() to propagate
locking error back to caller of shm_mmap().

[1] http://github.com/google/syzkaller

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/shm.c |   53 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 10 deletions(-)

diff -puN ipc/shm.c~gpf-in-shm_lock-ipc ipc/shm.c
--- a/ipc/shm.c~gpf-in-shm_lock-ipc
+++ a/ipc/shm.c
@@ -156,11 +156,12 @@ static inline struct shmid_kernel *shm_l
 	struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
 
 	/*
-	 * We raced in the idr lookup or with shm_destroy().  Either way, the
-	 * ID is busted.
+	 * Callers of shm_lock() must validate the status of the returned ipc
+	 * object pointer (as returned by ipc_lock()), and error out as
+	 * appropriate.
 	 */
-	WARN_ON(IS_ERR(ipcp));
-
+	if (IS_ERR(ipcp))
+		return (void *)ipcp;
 	return container_of(ipcp, struct shmid_kernel, shm_perm);
 }
 
@@ -186,18 +187,33 @@ static inline void shm_rmid(struct ipc_n
 }
 
 
-/* This is called by fork, once for every shm attach. */
-static void shm_open(struct vm_area_struct *vma)
+static int __shm_open(struct vm_area_struct *vma)
 {
 	struct file *file = vma->vm_file;
 	struct shm_file_data *sfd = shm_file_data(file);
 	struct shmid_kernel *shp;
 
 	shp = shm_lock(sfd->ns, sfd->id);
+
+	if (IS_ERR(shp))
+		return PTR_ERR(shp);
+
 	shp->shm_atim = get_seconds();
 	shp->shm_lprid = task_tgid_vnr(current);
 	shp->shm_nattch++;
 	shm_unlock(shp);
+	return 0;
+}
+
+/* This is called by fork, once for every shm attach. */
+static void shm_open(struct vm_area_struct *vma)
+{
+	int err = __shm_open(vma);
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted.
+	 */
+	WARN_ON_ONCE(err);
 }
 
 /*
@@ -260,6 +276,14 @@ static void shm_close(struct vm_area_str
 	down_write(&shm_ids(ns).rwsem);
 	/* remove from the list of attaches of the shm segment */
 	shp = shm_lock(ns, sfd->id);
+
+	/*
+	 * We raced in the idr lookup or with shm_destroy().
+	 * Either way, the ID is busted.
+	 */
+	if (WARN_ON_ONCE(IS_ERR(shp)))
+		goto done; /* no-op */
+
 	shp->shm_lprid = task_tgid_vnr(current);
 	shp->shm_dtim = get_seconds();
 	shp->shm_nattch--;
@@ -267,6 +291,7 @@ static void shm_close(struct vm_area_str
 		shm_destroy(ns, shp);
 	else
 		shm_unlock(shp);
+done:
 	up_write(&shm_ids(ns).rwsem);
 }
 
@@ -388,17 +413,25 @@ static int shm_mmap(struct file *file, s
 	struct shm_file_data *sfd = shm_file_data(file);
 	int ret;
 
+	/*
+	 * In case of remap_file_pages() emulation, the file can represent
+	 * removed IPC ID: propogate shm_lock() error to caller.
+	 */
+	ret =__shm_open(vma);
+	if (ret)
+		return ret;
+
 	ret = sfd->file->f_op->mmap(sfd->file, vma);
-	if (ret != 0)
+	if (ret) {
+		shm_close(vma);
 		return ret;
+	}
 	sfd->vm_ops = vma->vm_ops;
 #ifdef CONFIG_MMU
 	WARN_ON(!sfd->vm_ops->fault);
 #endif
 	vma->vm_ops = &shm_vm_ops;
-	shm_open(vma);
-
-	return ret;
+	return 0;
 }
 
 static int shm_release(struct inode *ino, struct file *file)
_

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in shm_lock ipc
  2016-02-02  3:25                   ` Andrew Morton
@ 2016-02-02 21:32                     ` Dmitry Vyukov
  0 siblings, 0 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2016-02-02 21:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: syzkaller, Kirill A. Shutemov, Dave Hansen, Hugh Dickins,
	Joe Perches, sds, Oleg Nesterov, Kirill A. Shutemov,
	Rik van Riel, Michal Hocko, Chen Gang, Peter Feiner,
	Andrea Arcangeli, linux-mm, LKML, Kostya Serebryany,
	Alexander Potapenko, Andrey Konovalov, Sasha Levin,
	Manfred Spraul

On Tue, Feb 2, 2016 at 4:25 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 21 Dec 2015 16:44:34 +0100 Dmitry Vyukov <dvyukov@google.com> wrote:
>
>> On Thu, Nov 5, 2015 at 3:23 PM, Kirill A. Shutemov
>> <kirill.shutemov@linux.intel.com> wrote:
>> > What about this:
>>
>>
>> Ping. This is still happening for me on tip. Can we pull in this fix
>> if it looks good to everybody?
>>
>
> Well we have at least three patches to choose from in this thread, most
> of them missing most signs of having been reviewed or tested.
>
> So I grabbed the last one, below.  Can we please rev this up again,
> test it, see if we can agree that this is the way to go forward?


Well, I can say that this patch fixes the original reproducer for me.
I will restart the fuzzer with it.


> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Subject: ipc/shm: handle removed segments gracefully in shm_mmap()
>
> remap_file_pages(2) emulation can reach file which represents removed IPC
> ID as long as a memory segment is mapped.  It breaks expectations of IPC
> subsystem.
>
> Test case (rewritten to be more human readable, originally autogenerated
> by syzkaller[1]):
>
>         #define _GNU_SOURCE
>         #include <stdlib.h>
>         #include <sys/ipc.h>
>         #include <sys/mman.h>
>         #include <sys/shm.h>
>
>         #define PAGE_SIZE 4096
>
>         int main()
>         {
>                 int id;
>                 void *p;
>
>                 id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
>                 p = shmat(id, NULL, 0);
>                 shmctl(id, IPC_RMID, NULL);
>                 remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);
>
>                 return 0;
>         }
>
> The patch changes shm_mmap() and code around shm_lock() to propagate
> locking error back to caller of shm_mmap().
>
> [1] http://github.com/google/syzkaller
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Manfred Spraul <manfred@colorfullife.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  ipc/shm.c |   53 ++++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 43 insertions(+), 10 deletions(-)
>
> diff -puN ipc/shm.c~gpf-in-shm_lock-ipc ipc/shm.c
> --- a/ipc/shm.c~gpf-in-shm_lock-ipc
> +++ a/ipc/shm.c
> @@ -156,11 +156,12 @@ static inline struct shmid_kernel *shm_l
>         struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
>
>         /*
> -        * We raced in the idr lookup or with shm_destroy().  Either way, the
> -        * ID is busted.
> +        * Callers of shm_lock() must validate the status of the returned ipc
> +        * object pointer (as returned by ipc_lock()), and error out as
> +        * appropriate.
>          */
> -       WARN_ON(IS_ERR(ipcp));
> -
> +       if (IS_ERR(ipcp))
> +               return (void *)ipcp;
>         return container_of(ipcp, struct shmid_kernel, shm_perm);
>  }
>
> @@ -186,18 +187,33 @@ static inline void shm_rmid(struct ipc_n
>  }
>
>
> -/* This is called by fork, once for every shm attach. */
> -static void shm_open(struct vm_area_struct *vma)
> +static int __shm_open(struct vm_area_struct *vma)
>  {
>         struct file *file = vma->vm_file;
>         struct shm_file_data *sfd = shm_file_data(file);
>         struct shmid_kernel *shp;
>
>         shp = shm_lock(sfd->ns, sfd->id);
> +
> +       if (IS_ERR(shp))
> +               return PTR_ERR(shp);
> +
>         shp->shm_atim = get_seconds();
>         shp->shm_lprid = task_tgid_vnr(current);
>         shp->shm_nattch++;
>         shm_unlock(shp);
> +       return 0;
> +}
> +
> +/* This is called by fork, once for every shm attach. */
> +static void shm_open(struct vm_area_struct *vma)
> +{
> +       int err = __shm_open(vma);
> +       /*
> +        * We raced in the idr lookup or with shm_destroy().
> +        * Either way, the ID is busted.
> +        */
> +       WARN_ON_ONCE(err);
>  }
>
>  /*
> @@ -260,6 +276,14 @@ static void shm_close(struct vm_area_str
>         down_write(&shm_ids(ns).rwsem);
>         /* remove from the list of attaches of the shm segment */
>         shp = shm_lock(ns, sfd->id);
> +
> +       /*
> +        * We raced in the idr lookup or with shm_destroy().
> +        * Either way, the ID is busted.
> +        */
> +       if (WARN_ON_ONCE(IS_ERR(shp)))
> +               goto done; /* no-op */
> +
>         shp->shm_lprid = task_tgid_vnr(current);
>         shp->shm_dtim = get_seconds();
>         shp->shm_nattch--;
> @@ -267,6 +291,7 @@ static void shm_close(struct vm_area_str
>                 shm_destroy(ns, shp);
>         else
>                 shm_unlock(shp);
> +done:
>         up_write(&shm_ids(ns).rwsem);
>  }
>
> @@ -388,17 +413,25 @@ static int shm_mmap(struct file *file, s
>         struct shm_file_data *sfd = shm_file_data(file);
>         int ret;
>
> +       /*
> +        * In case of remap_file_pages() emulation, the file can represent
> +        * removed IPC ID: propogate shm_lock() error to caller.
> +        */
> +       ret =__shm_open(vma);
> +       if (ret)
> +               return ret;
> +
>         ret = sfd->file->f_op->mmap(sfd->file, vma);
> -       if (ret != 0)
> +       if (ret) {
> +               shm_close(vma);
>                 return ret;
> +       }
>         sfd->vm_ops = vma->vm_ops;
>  #ifdef CONFIG_MMU
>         WARN_ON(!sfd->vm_ops->fault);
>  #endif
>         vma->vm_ops = &shm_vm_ops;
> -       shm_open(vma);
> -
> -       return ret;
> +       return 0;
>  }
>
>  static int shm_release(struct inode *ino, struct file *file)
> _
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-02-02 21:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-12  9:55 GPF in shm_lock ipc Dmitry Vyukov
2015-10-12 11:41 ` Vlastimil Babka
2015-10-12 11:44   ` Dmitry Vyukov
2015-10-12 12:27 ` Kirill A. Shutemov
2015-10-12 17:49   ` Davidlohr Bueso
2015-10-12 18:10     ` Kirill A. Shutemov
2015-10-12 18:55       ` Davidlohr Bueso
2015-10-13  3:18         ` Davidlohr Bueso
2015-10-13 12:30           ` Kirill A. Shutemov
2015-10-29 15:33             ` Dmitry Vyukov
2015-11-05 14:23               ` Kirill A. Shutemov
2015-12-21 15:44                 ` Dmitry Vyukov
2016-01-02 11:33                   ` Manfred Spraul
2016-01-02 12:19                     ` Dmitry Vyukov
2016-01-02 15:58                       ` Manfred Spraul
2016-02-02  3:25                   ` Andrew Morton
2016-02-02 21:32                     ` Dmitry Vyukov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).