Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Vivek Goyal <vgoyal@redhat.com>
To: linux-fsdevel@vger.kernel.org, dan.j.williams@intel.com,
	jack@suse.cz, willy@infradead.org
Cc: virtio-fs@redhat.com, slp@redhat.com, miklos@szeredi.hu,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
	vgoyal@redhat.com
Subject: [PATCH v3 3/3] dax: Wake up all waiters after invalidating dax entry
Date: Mon, 19 Apr 2021 17:36:36 -0400
Message-ID: <20210419213636.1514816-4-vgoyal@redhat.com> (raw)
In-Reply-To: <20210419213636.1514816-1-vgoyal@redhat.com>

I am seeing missed wakeups which ultimately lead to a deadlock when I am
using virtiofs with DAX enabled and running "make -j". I had to mount
virtiofs as rootfs and also reduce to dax window size to 256M to reproduce
the problem consistently.

So here is the problem. put_unlocked_entry() wakes up waiters only
if entry is not null as well as !dax_is_conflict(entry). But if I
call multiple instances of invalidate_inode_pages2() in parallel,
then I can run into a situation where there are waiters on
this index but nobody will wait these.

invalidate_inode_pages2()
  invalidate_inode_pages2_range()
    invalidate_exceptional_entry2()
      dax_invalidate_mapping_entry_sync()
        __dax_invalidate_entry() {
                xas_lock_irq(&xas);
                entry = get_unlocked_entry(&xas, 0);
                ...
                ...
                dax_disassociate_entry(entry, mapping, trunc);
                xas_store(&xas, NULL);
                ...
                ...
                put_unlocked_entry(&xas, entry);
                xas_unlock_irq(&xas);
        }

Say a fault in in progress and it has locked entry at offset say "0x1c".
Now say three instances of invalidate_inode_pages2() are in progress
(A, B, C) and they all try to invalidate entry at offset "0x1c". Given
dax entry is locked, all tree instances A, B, C will wait in wait queue.

When dax fault finishes, say A is woken up. It will store NULL entry
at index "0x1c" and wake up B. When B comes along it will find "entry=0"
at page offset 0x1c and it will call put_unlocked_entry(&xas, 0). And
this means put_unlocked_entry() will not wake up next waiter, given
the current code. And that means C continues to wait and is not woken
up.

This patch fixes the issue by waking up all waiters when a dax entry
has been invalidated. This seems to fix the deadlock I am facing
and I can make forward progress.

Reported-by: Sergio Lopez <slp@redhat.com>
Fixes: ac401cc78242 ("dax: New fault locking")
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/dax.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index f19d76a6a493..cc497519be83 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -676,7 +676,7 @@ static int __dax_invalidate_entry(struct address_space *mapping,
 	mapping->nrexceptional--;
 	ret = 1;
 out:
-	put_unlocked_entry(&xas, entry, WAKE_NEXT);
+	put_unlocked_entry(&xas, entry, WAKE_ALL);
 	xas_unlock_irq(&xas);
 	return ret;
 }
-- 
2.25.4


  parent reply index

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-19 21:36 [PATCH v3 0/3] dax: Fix missed wakeup in put_unlocked_entry() Vivek Goyal
2021-04-19 21:36 ` [PATCH v3 1/3] dax: Add an enum for specifying dax wakup mode Vivek Goyal
2021-04-20  7:19   ` [Virtio-fs] " Greg Kurz
2021-04-21  9:24   ` Jan Kara
2021-04-21 15:56     ` Vivek Goyal
2021-04-21 16:16       ` Matthew Wilcox
2021-04-21 17:21         ` Vivek Goyal
2021-04-19 21:36 ` [PATCH v3 2/3] dax: Add a wakeup mode parameter to put_unlocked_entry() Vivek Goyal
2021-04-20  7:34   ` [Virtio-fs] " Greg Kurz
2021-04-20 14:00     ` Vivek Goyal
2021-04-21 19:09       ` Dan Williams
2021-04-21 19:13         ` Vivek Goyal
2021-04-22  6:24         ` Christoph Hellwig
2021-04-22 16:12           ` Darrick J. Wong
2021-04-22 20:01           ` Dan Williams
2021-04-22 20:07             ` Vivek Goyal
2021-05-17 17:10           ` Dan Williams
2021-04-21 17:39     ` Vivek Goyal
2021-04-21  9:25   ` Jan Kara
2021-04-19 21:36 ` Vivek Goyal [this message]
2021-04-21  9:26   ` [PATCH v3 3/3] dax: Wake up all waiters after invalidating dax entry Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210419213636.1514816-4-vgoyal@redhat.com \
    --to=vgoyal@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=miklos@szeredi.hu \
    --cc=slp@redhat.com \
    --cc=virtio-fs@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git