On 2019/8/14 21:53, Michal Hocko wrote:
> On Tue 13-08-19 17:08:05, Kefeng Wang wrote:
>> Hi Andrea Arcangeli and all,
>>
>> There is a BUG after apply patch "04f5866e41fb coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping".
> 
> Just to make sure, does reverting that commit fixes the bug?

Yes, it's easy to reproduce the panic with this patch, and after revert it, everything works well when using Syzkaller reproducer,
and only revert the following part could also fix the crash.

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index ccbdbd62f0d8..566086881b3a 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -895,8 +895,6 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
         * taking the mmap_sem for writing.
         */
        down_write(&mm->mmap_sem);
-       if (!mmget_still_valid(mm))
-               goto skip_mm;
        prev = NULL;
        for (vma = mm->mmap; vma; vma = vma->vm_next) {
                cond_resched();
@@ -919,7 +917,6 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
                vma->vm_flags = new_flags;
                vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
        }
-skip_mm:
        up_write(&mm->mmap_sem);
        mmput(mm);
 wakeup:


> 
>> The following is reproducer and panic log, could anyone check it?
>>
>> Syzkaller reproducer:
>> # {Threaded:true Collide:true Repeat:false RepeatTimes:0 Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true EnableNetDev:true EnableNetReset:false EnableCgroups:false EnableBinfmtMisc:true EnableCloseFds:true UseTmpDir:true HandleSegv:true Repro:false Trace:false}
>> r0 = userfaultfd(0x80800)
>> ioctl$UFFDIO_API(r0, 0xc018aa3f, &(0x7f0000000200))
>> ioctl$UFFDIO_REGISTER(r0, 0xc020aa00, &(0x7f0000000080)={{&(0x7f0000ff2000/0xe000)=nil, 0xe000}, 0x1})
>> ioctl$UFFDIO_COPY(r0, 0xc028aa03, 0x0)
>> ioctl$UFFDIO_COPY(r0, 0xc028aa03, &(0x7f0000000000)={&(0x7f0000ffc000/0x3000)=nil, &(0x7f0000ffd000/0x2000)=nil, 0x3000})
>> syz_execute_func(&(0x7f00000000c0)="4134de984013e80f059532058300000071f3c4e18dd1ce5a65460f18320ce0b9977d8f64360f6e54e3a50fe53ff30fb837c42195dc42eddb8f087ca2a4d2c4017b708fa878c3e600f3266440d9a200000000c4016c5bdd7d0867dfe07f00f20f2b5f0009404cc442c102282cf2f20f51e22ef2e1291010f2262ef045814cb39700000000f32e3ef0fe05922f79a4000030470f3b58c1312fe7460f50ce0502338d00858526660f346253f6010f0f801d000000470f0f2c0a90c7c7df84feefff3636260fe02c98c8b8fcfc81fc51720a40400e700064660f71e70d2e0f57dfe819d0253f3ecaf06ad647608c41ffc42249bccb430f9bc8b7a042420f8d0042171e0f95ca9f7f921000d9fac4a27d5a1fc4a37961309de9000000003171460fc4d303c466410fd6389dc4426c456300c4233d4c922d92abf90ac6c34df30f5ee50909430f3a15e7776f6e866b0fdfdfc482797841cf6ffc842d9b9a516dc2e52ef2ac2636f20f114832d46231bffd4834eaeac4237d09d0003766420f160182c4a37d047882007f108f2808a6e68fc401505d6a82635d1467440fc7ba0c000000d4c482359652745300")
>> poll(&(0x7f00000000c0)=[{}], 0x1, 0x0)
> 
> Is there any way to decypher the above?

no, I also want to know the way :(


> 
>> ./syz-execprog -executor=./syz-executor -repeat=0 -procs=16 -cover=0 repofile
>>
>>
>> [   74.783362] invalid opcode: 0000 [#1] SMP PTI

This place is strange too.

>> [   74.783740] ------------[ cut here ]------------
>> [   74.784430] CPU: 5 PID: 12803 Comm: syz-executor.15 Not tainted 5.3.0-rc4 #15
>> [   74.785831] kernel BUG at ../fs/userfaultfd.c:385!
> 
> This looks like
> 	BUG_ON(ctx->mm != mm)
> where mm is vmf->vma->vm_mm while ctx->mm
> git grep  "ctx->mm[[:space:]]=" v5.3-rc4
> [...]
> v5.3-rc4:fs/userfaultfd.c:              ctx->mm = vma->vm_mm;
> v5.3-rc4:fs/userfaultfd.c:      ctx->mm = current->mm;
> 
> seem to always come from the local mm so it shouldn't really be out of
> sync. VMAs and the process doesn't change the mm pointer during the life
> time except for execing
> 
[...]
> 
> but that doesn't really match BUG_ON at all. Could you provide
> disassembly for that function and your build. I would like to see what
> do we have in registers and what ctx->mm vs. mm are.

(gdb) l * handle_userfault+0x615
0xffffffff8131dc05 is in handle_userfault (../fs/userfaultfd.c:379).
374	
375		/*
376		 * Coredumping runs without mmap_sem so we can only check that
377		 * the mmap_sem is held, if PF_DUMPCORE was not set.
378		 */
379		WARN_ON_ONCE(!rwsem_is_locked(&mm->mmap_sem));
380	
381		ctx = vmf->vma->vm_userfaultfd_ctx.ctx;
382		if (!ctx)
383			goto out;


376		 * Coredumping runs without mmap_sem so we can only check that
377		 * the mmap_sem is held, if PF_DUMPCORE was not set.
378		 */
379		WARN_ON_ONCE(!rwsem_is_locked(&mm->mmap_sem));
   0xffffffff8131d66d <+125>:	test   %rdx,%rdx
   0xffffffff8131d670 <+128>:	je     0xffffffff8131dbfd <handle_userfault+1549>
   0xffffffff8131dc00 <+1552>:	jmpq   0xffffffff8131d676 <handle_userfault+134>
   0xffffffff8131dc05 <+1557>:	ud2   // ---> handle_userfault+0x615

I reproduce it again, the whole config, panic log and assebler dump are attached, hope they are helpful.