From: Hanna Czenczek <hreitz@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>,
	QEMU Developers <qemu-devel@nongnu.org>
Cc: "open list:Network Block Dev..." <qemu-block@nongnu.org>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>,
	John Snow <jsnow@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: Deadlock with ide_issue_trim and draining
Date: Tue, 7 Mar 2023 14:44:29 +0100
Message-ID: <97638730-0dfa-918b-3c66-7874171b3e5c@redhat.com>
In-Reply-To: <1e3813b6-f2d0-9bd5-a270-e5835c13b495@proxmox.com>

On 07.03.23 13:22, Fiona Ebner wrote:
> Hi,
> I am suspecting that commit 7e5cdb345f ("ide: Increment BB in-flight
> counter for TRIM BH") introduced an issue in combination with draining.
>
> From a debug session on a customer's machine I gathered the following
> information:
> * The QEMU process hangs in aio_poll called during draining and doesn't
> progress.
> * The in_flight counter for the BlockDriverState is 0 and for the
> BlockBackend it is 1.
> * There is a blk_aio_pdiscard_entry request in the BlockBackend's
> queued_requests.
> * The drive is attached via ahci.
>
> I suspect that something like the following happened:
>
> 1. ide_issue_trim is called, and increments the in_flight counter.
> 2. ide_issue_trim_cb calls blk_aio_pdiscard.
> 3. somebody else starts draining.
> 4. ide_issue_trim_cb is called as the completion callback for
> blk_aio_pdiscard.
> 5. ide_issue_trim_cb issues yet another blk_aio_pdiscard request.
> 6. The request is added to the wait queue via blk_wait_while_drained,
> because draining has been started.
> 7. Nobody ever decrements the in_flight counter and draining can't finish.

Sounds about right.
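
To make the sequence concrete, here is a toy model of steps 1-7 in plain C. It is only a sketch, not QEMU code: ToyBackend, toy_submit_discard(), toy_drain_can_finish() and toy_deadlock_scenario() are hypothetical names, and the model keeps nothing but a counter, a drain flag and a queue length.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a BlockBackend; not the real QEMU struct. */
typedef struct ToyBackend {
    int in_flight;        /* models the BlockBackend in-flight counter */
    bool draining;        /* true once somebody has started draining */
    int queued_requests;  /* requests parked until the drain ends */
} ToyBackend;

/* Models submitting a discard: it is parked if a drain is in progress. */
static bool toy_submit_discard(ToyBackend *blk)
{
    if (blk->draining) {
        blk->queued_requests++;
        return false;     /* not started, waits for the drain to end */
    }
    return true;          /* started normally */
}

/* In this model a drain can finish only once nothing is in flight. */
static bool toy_drain_can_finish(const ToyBackend *blk)
{
    return blk->in_flight == 0;
}

/* Replays steps 1-7 from the report; returns whether the drain can end. */
bool toy_deadlock_scenario(ToyBackend *blk)
{
    blk->in_flight++;                                /* step 1 */
    bool first_started = toy_submit_discard(blk);    /* step 2 */
    assert(first_started);
    (void)first_started;

    blk->draining = true;                            /* step 3 */

    /* step 4: the first discard completes and the callback runs again */
    bool second_started = toy_submit_discard(blk);   /* steps 5 and 6 */
    assert(!second_started);
    (void)second_started;

    /* step 7: the count taken in step 1 is still held */
    return toy_drain_can_finish(blk);
}
```

Called on a zero-initialized ToyBackend, the scenario ends with in_flight at 1 and one parked request, which matches the state seen in the debug session: the drain waits for the counter, and the counter waits for a request that cannot run until the drain ends.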

> The issue occurs very rarely and is difficult to reproduce, but with the
> help of GDB, I'm able to do it rather reliably:
> 1. Use GDB to break on blk_aio_pdiscard.
> 2. Run mkfs.ext4 on a huge disk in the guest.
> 3. Issue a drive-backup QMP command after landing on the breakpoint.
> 4. Continue a few times in GDB.
> 5. After that I can observe the same situation as described above.
>
> I'd be happy about suggestions for how to fix it. Unfortunately, I don't
> see a clear-cut way at the moment. The only idea I have right now is to
> change the code to issue all discard requests at the same time, but I
> fear there might be pitfalls with that?

The point of 7e5cdb345f was that we need any in-flight count to 
accompany a set s->bus->dma->aiocb.  While blk_aio_pdiscard() is 
happening, we don’t necessarily need another count.  But we do need it 
while there is no blk_aio_pdiscard().

ide_issue_trim_cb() returns in two cases (and, recursively through its 
callers, leaves s->bus->dma->aiocb set):
1. After calling blk_aio_pdiscard(), which will keep an in-flight count,
2. After calling replay_bh_schedule_event() (i.e. qemu_bh_schedule()), 
which does not keep an in-flight count.

Perhaps we just need to move the blk_inc_in_flight() above the 
replay_bh_schedule_event() call?
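
As a rough illustration of that idea (a toy model, not a patch): the extra count is taken only when the final BH is scheduled and dropped when the BH runs. All names here (ToyTrim, toy_trim_cb() and so on) are hypothetical, and the model assumes that a running discard holds its own count while a request parked during a drain holds none.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this is real QEMU code. */
typedef struct ToyTrim {
    int in_flight;    /* models the BlockBackend in-flight counter */
    bool draining;    /* a drained section is active */
    int parked;       /* discards queued while draining (assumed uncounted) */
    int remaining;    /* discard ranges still to be issued */
    bool bh_pending;  /* the final BH has been scheduled */
} ToyTrim;

/* Models the TRIM callback: issue the next discard or schedule the BH. */
static void toy_trim_cb(ToyTrim *t)
{
    if (t->remaining > 0) {
        t->remaining--;
        if (t->draining) {
            t->parked++;       /* parked, holds no count in this model */
        } else {
            t->in_flight++;    /* a running discard keeps its own count */
        }
        return;
    }
    t->in_flight++;            /* proposal: count only the BH window */
    t->bh_pending = true;
}

/* A running discard completes: drop its count, then re-enter the callback. */
static void toy_discard_done(ToyTrim *t)
{
    assert(t->in_flight > 0);
    t->in_flight--;
    toy_trim_cb(t);
}

/* The drain ends: parked discards start running and are counted again. */
static void toy_drain_end(ToyTrim *t)
{
    t->draining = false;
    t->in_flight += t->parked;
    t->parked = 0;
}

/* The final BH runs and releases the count taken when it was scheduled. */
static void toy_bh_run(ToyTrim *t)
{
    assert(t->bh_pending && t->in_flight > 0);
    t->bh_pending = false;
    t->in_flight--;
}
```

Replaying the reported sequence with two ranges (callback, drain starts, first discard completes) leaves the counter at 0 with one parked request, so in this model the drain is free to finish and the parked discard resumes afterwards.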

Hanna


