From: Cong Wang <xiyou.wangcong@gmail.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-rdma@vger.kernel.org, Doug Ledford <dledford@redhat.com>
Subject: Re: [PATCH] infiniband: fix a subtle race condition
Date: Thu, 14 Jun 2018 16:14:13 -0700 [thread overview]
Message-ID: <CAM_iQpXFmqcCQWkyGPmev5LrPpDoEc=wpXQVDWJx2UQWPhwb2g@mail.gmail.com> (raw)
In-Reply-To: <20180614172441.GE24762@mellanox.com>
On Thu, Jun 14, 2018 at 10:24 AM, Jason Gunthorpe <jgg@mellanox.com> wrote:
> On Thu, Jun 14, 2018 at 10:03:09AM -0700, Cong Wang wrote:
>> On Thu, Jun 14, 2018 at 7:24 AM, Jason Gunthorpe <jgg@mellanox.com> wrote:
>> >
>> > This was my brief reaction too, this code path almost certainly has a
>> > use-after-free, and we should fix the concurrency between the two
>> > places in some correct way..
>>
>> First of all, why use-after-free could trigger an imbalance unlock?
>> IOW, why do we have to solve use-after-free to fix this imbalance
>> unlock?
>
> The issue syzkaller hit is that accessing ctx->file does not seem
> locked in any way and can race with other manipulations of ctx->file.
>
> So.. for this patch to be correct we need to understand how this
> statement:
>
> f = ctx->file
>
> Avoids f becoming a dangling pointer - and without locking, my
It doesn't, because this is not the point, this is not the cause
of the unlock imbalance either. syzbot didn't report use-after-free
or a kernel segfault here.
> suspicion is that it doesn't - because missing locking around
> ctx->file is probably the actual bug syzkaller found.
Does my patch make it lockless or dangling? Apparently no.
Before my patch:
mutex_lock(&ctx->file->mut);
After my patch:
cur_file = ctx->file;
mutex_lock(&cur_file->mut);
The deference is same as before, it was lockless and it is lockless
after my patch.
Look at the assembly code *without* my patch:
ffffffff819354f0: 49 8b 7c 24 78 mov 0x78(%r12),%rdi
ffffffff819354f5: 48 89 c3 mov %rax,%rbx
ffffffff819354f8: 31 f6 xor %esi,%esi
ffffffff819354fa: e8 d8 dd 40 00 callq
ffffffff81d432d7 <mutex_lock_nested>
Apparently the pointer is dereferenced before lock.
What difference does my patch make?
ffffffff819354f2: 4d 8b 74 24 78 mov 0x78(%r12),%r14
ffffffff819354f7: 48 89 c3 mov %rax,%rbx
ffffffff819354fa: 31 f6 xor %esi,%esi
ffffffff819354fc: 4c 89 f7 mov %r14,%rdi
ffffffff819354ff: e8 9b df 40 00 callq
ffffffff81d4349f <mutex_lock_nested>
...
ffffffff8193567d: 4c 89 f7 mov %r14,%rdi
ffffffff81935680: e8 98 dd 40 00 callq
ffffffff81d4341d <mutex_unlock>
The %r14 here is the whole point of my patch.
>
> If this is not the case, then add a comment explaining how f's
> lifetime is OK.
>
> Otherwise, we need some kind of locking and guessing we need to hold a
> kref for f?
I agree with you, but again, this is not necessary for unlock
imbalance.
>
>> Third of all, the use-after-free I can see (race with ->close) exists
>> before my patch, this patch doesn't make it better or worse, nor
>> I have any intend to fix it.
>
> I'm not sure that race exists, there should be something that flushes
> the WQ on the path to close... (though I have another email that
> perhaps that is broken, sigh)
>
This is not related to my patch, but to convince you, let me explain:
struct ucma_file is not refcnt'ed, I know you cancel the work in
rdma_destroy_id(), but after ucma_migrate_id() the ctx has already
been moved to the new file, for the old file, it won't cancel the
ctx flying with workqueue. So, I think the following use-after-free
could happen:
ucma_event_handler():
cur_file = ctx->file; // old file
ucma_migrate_id():
lock();
list_move_tail(&ctx->list, &new_file->ctx_list);
ctx->file = new_file;
unlock();
ucma_close():
// retrieve old file via filp->private_data
// the loop won't cover the ctx moved to the new_file
kfree(file);
ucma_event_handler():
// continued from above
lock(&cur_file->mux); // already freed!
This is _not_ the cause of the unlock imbalance, and is _not_ expected
to solve by patch either.
next prev parent reply other threads:[~2018-06-14 23:14 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-13 23:49 [PATCH] infiniband: fix a subtle race condition Cong Wang
2018-06-14 5:34 ` Leon Romanovsky
2018-06-14 6:21 ` Cong Wang
2018-06-14 7:01 ` Leon Romanovsky
2018-06-14 14:24 ` Jason Gunthorpe
2018-06-14 17:03 ` Cong Wang
2018-06-14 17:24 ` Jason Gunthorpe
2018-06-14 23:14 ` Cong Wang [this message]
2018-06-15 2:57 ` Jason Gunthorpe
2018-06-15 17:57 ` Cong Wang
2018-06-15 19:08 ` Jason Gunthorpe
2018-06-18 18:40 ` Cong Wang
2018-06-14 16:57 ` Cong Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAM_iQpXFmqcCQWkyGPmev5LrPpDoEc=wpXQVDWJx2UQWPhwb2g@mail.gmail.com' \
--to=xiyou.wangcong@gmail.com \
--cc=dledford@redhat.com \
--cc=jgg@mellanox.com \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).