From: David Hildenbrand <david@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: "Tong Zhang" <ztong0001@gmail.com>,
	"Tong Zhang" <t.zhang2@samsung.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"Francisco Londono" <f.londono@samsung.com>
Subject: Re: [RESEND PATCH] hw/dma: fix crash caused by race condition
Date: Wed, 1 Jun 2022 15:29:05 +0200
Message-ID: <b7eff284-fb61-6a66-dd9a-893b64dd5311@redhat.com>
In-Reply-To: <YpdoqgpGloiPIxBk@stefanha-x1.localdomain>

On 01.06.22 15:24, Stefan Hajnoczi wrote:
> On Wed, Jun 01, 2022 at 10:00:50AM +0200, David Hildenbrand wrote:
>> On 01.06.22 02:20, Tong Zhang wrote:
>>> Hi David,
>>>
>>> On Mon, May 30, 2022 at 9:19 AM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 27.04.22 22:51, Tong Zhang wrote:
>>>>> assert(dbs->acb) is meant to check the return value of io_func, as
>>>>> documented in commit 6bee44ea34 ("dma: the passed io_func does not
>>>>> return NULL"). However, there is a chance that after calling
>>>>> aio_context_release(dbs->ctx); the dma_blk_cb function is called before
>>>>> the assertion and dbs->acb is set to NULL again at line 121. Thus when
>>>>> we run assert at line 181 it will fail.
>>>>>
>>>>>   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
>>>>>
>>>>> Reported-by: Francisco Londono <f.londono@samsung.com>
>>>>> Signed-off-by: Tong Zhang <t.zhang2@samsung.com>
>>>>> ---
>>>>>  softmmu/dma-helpers.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
>>>>> index 7820fec54c..cb81017928 100644
>>>>> --- a/softmmu/dma-helpers.c
>>>>> +++ b/softmmu/dma-helpers.c
>>>>> @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
>>>>>      aio_context_acquire(dbs->ctx);
>>>>>      dbs->acb = dbs->io_func(dbs->offset, &dbs->iov,
>>>>>                              dma_blk_cb, dbs, dbs->io_func_opaque);
>>>>> -    aio_context_release(dbs->ctx);
>>>>>      assert(dbs->acb);
>>>>> +    aio_context_release(dbs->ctx);
>>>>>  }
>>>>>
>>>>>  static void dma_aio_cancel(BlockAIOCB *acb)
>>>>
>>>> I'm fairly new to that code, but I wonder what prevents dma_blk_cb() from
>>>> running after you reshuffled the code?
>>>>
>>>
>>> IMO if the assert is to test whether io_func returns a non-NULL value,
>>> shouldn't it be immediately after calling io_func?
>>> Also... as suggested by commit 6bee44ea346aed24e12d525daf10542d695508db
>>>   >     dma: the passed io_func does not return NULL
>>
>> Yes, but I just don't see how it would fix the assertion you document in
>> the patch description. The locking change to fix the assertion doesn't
>> make any sense to me, and most probably I am missing something important :)
> 
> The other thread will invoke dma_blk_cb(), which modifies dbs->acb, when
> it can take the lock. Therefore dbs->acb may contain a value different
> from our io_func()'s return value by the time we perform the assertion
> check (that's the race).
> 
> This patch makes sense to me. Can you rephrase your concern?

The locking is around dbs->io_func().

    aio_context_acquire(dbs->ctx);
    dbs->acb = dbs->io_func();
    aio_context_release(dbs->ctx);


So where exactly would the lock, which is now still held during the
assertion, stop someone from setting dbs->acb = NULL at the beginning of
the function? That assignment does not seem to be protected by that lock.

Maybe I'm missing some locking magic due to the lock being a recursive lock.
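
To make the recursive-lock question concrete, here is a minimal sketch. It
assumes the context lock behaves like a POSIX recursive mutex; that is an
assumption, and all demo_* names are made up for illustration (this is a
stand-alone model, not QEMU code). It only shows that such a lock excludes
other threads while still letting the owning thread acquire it again. It says
nothing about stores performed without taking the lock at all.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes PTHREAD_MUTEX_RECURSIVE */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a recursive context lock; not QEMU's AioContext. */
static pthread_mutex_t demo_lock;

/* Initialize demo_lock as a recursive mutex. Returns 0 on success. */
static int demo_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0) {
        return -1;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) {
        rc = pthread_mutex_init(&demo_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}

/* Release one level of ownership; must only be called by the owner. */
static void demo_unlock(void)
{
    int rc = pthread_mutex_unlock(&demo_lock);
    assert(rc == 0);
    (void)rc;
}

/* Returns 1 if the owning thread can take the lock a second time. */
static int demo_same_thread_can_reenter(void)
{
    int nested_ok;

    if (pthread_mutex_lock(&demo_lock) != 0) {
        return 0;
    }
    nested_ok = (pthread_mutex_trylock(&demo_lock) == 0);
    if (nested_ok) {
        demo_unlock();            /* drop the nested acquisition */
    }
    demo_unlock();                /* drop the outer acquisition */
    return nested_ok;
}

/* Thread body: try the lock once and report the result code. */
static void *demo_try_from_other_thread(void *result)
{
    int rc = pthread_mutex_trylock(&demo_lock);

    if (rc == 0) {
        demo_unlock();
    }
    *(int *)result = rc;
    return NULL;
}

/* Returns 1 if a second thread gets EBUSY while this thread holds the lock. */
static int demo_other_thread_is_excluded(void)
{
    pthread_t t;
    int rc = 0;

    if (pthread_mutex_lock(&demo_lock) != 0) {
        return 0;
    }
    if (pthread_create(&t, NULL, demo_try_from_other_thread, &rc) != 0) {
        demo_unlock();
        return 0;
    }
    pthread_join(t, NULL);        /* join before reading rc */
    demo_unlock();
    return rc == EBUSY;
}
```

Calling demo_lock_init() once and then the two demo_* checks exercises both
cases. In this model, a callback nested in the owning thread would not be held
back by the lock, whereas one running in a different thread would be, provided
that thread actually takes the lock before touching the shared field.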

-- 
Thanks,

David / dhildenb


