From: Vitaly Wool
Date: Sat, 5 Dec 2020 12:38:51 +0100
Subject: Re: [PATCH] z3fold: stricter locking and more careful reclaim
To: Mike Galbraith
Cc: Linux-MM, Sebastian Andrzej Siewior, Andrew Morton
In-Reply-To: <3b8e33c3ba44e7ec4c090d68972015ef52ba1e7e.camel@gmx.de>
References: <20201204200408.35009-1-vitaly.wool@konsulko.com> <3b8e33c3ba44e7ec4c090d68972015ef52ba1e7e.camel@gmx.de>

On Fri, Dec 4, 2020 at 10:47 PM Mike Galbraith wrote:
>
> On Fri, 2020-12-04 at 22:04 +0200, Vitaly Wool wrote:
> > Use temporary slots in the reclaim function to avoid a possible race
> > when freeing them.
> >
> > While at it, make sure we check the CLAIMED flag under the page lock
> > in the reclaim function, so that we are not racing with z3fold_alloc().
> >
> > Signed-off-by: Vitaly Wool
>
> FYI, with this applied, ltp::mm still crashes the RT kernel, with the
> same rwlock corruption being what takes it down.

So sad it does, but thanks a lot for the quick response.

~Vitaly

> 	-Mike
>
> > ---
> >  mm/z3fold.c | 133 +++++++++++++++++++++++++++++-----------------------
> >  1 file changed, 74 insertions(+), 59 deletions(-)
> >
> > diff --git a/mm/z3fold.c b/mm/z3fold.c
> > index 18feaa0bc537..47f05228abdf 100644
> > --- a/mm/z3fold.c
> > +++ b/mm/z3fold.c
> > @@ -186,6 +186,7 @@ enum z3fold_page_flags {
> >   */
> >  enum z3fold_handle_flags {
> >  	HANDLES_ORPHANED = 0,
> > +	HANDLES_NOFREE = 1,
> >  };
> >
> >  /*
> > @@ -320,7 +321,7 @@ static inline void free_handle(unsigned long handle)
> >  	slots = handle_to_slots(handle);
> >  	write_lock(&slots->lock);
> >  	*(unsigned long *)handle = 0;
> > -	if (zhdr->slots == slots) {
> > +	if (zhdr->slots == slots || test_bit(HANDLES_NOFREE, &slots->pool)) {
> >  		write_unlock(&slots->lock);
> >  		return; /* simple case, nothing else to do */
> >  	}
> > @@ -653,6 +654,28 @@ static inline void add_to_unbuddied(struct z3fold_pool *pool,
> >  	}
> >  }
> >
> > +static inline enum buddy get_free_buddy(struct z3fold_header *zhdr, int chunks)
> > +{
> > +	enum buddy bud = HEADLESS;
> > +
> > +	if (zhdr->middle_chunks) {
> > +		if (!zhdr->first_chunks &&
> > +		    chunks <= zhdr->start_middle - ZHDR_CHUNKS)
> > +			bud = FIRST;
> > +		else if (!zhdr->last_chunks)
> > +			bud = LAST;
> > +	} else {
> > +		if (!zhdr->first_chunks)
> > +			bud = FIRST;
> > +		else if (!zhdr->last_chunks)
> > +			bud = LAST;
> > +		else
> > +			bud = MIDDLE;
> > +	}
> > +
> > +	return bud;
> > +}
> > +
> >  static inline void *mchunk_memmove(struct z3fold_header *zhdr,
> >  				   unsigned short dst_chunk)
> >  {
> > @@ -714,18 +737,7 @@ static struct z3fold_header *compact_single_buddy(struct z3fold_header *zhdr)
> >  	if (WARN_ON(new_zhdr == zhdr))
> >  		goto out_fail;
> >
> > -	if (new_zhdr->first_chunks == 0) {
> > -		if (new_zhdr->middle_chunks != 0 &&
> > -		    chunks >= new_zhdr->start_middle) {
> > -			new_bud = LAST;
> > -		} else {
> > -			new_bud = FIRST;
> > -		}
> > -	} else if (new_zhdr->last_chunks == 0) {
> > -		new_bud = LAST;
> > -	} else if (new_zhdr->middle_chunks == 0) {
> > -		new_bud = MIDDLE;
> > -	}
> > +	new_bud = get_free_buddy(new_zhdr, chunks);
> >  	q = new_zhdr;
> >  	switch (new_bud) {
> >  	case FIRST:
> > @@ -847,9 +859,8 @@ static void do_compact_page(struct z3fold_header *zhdr, bool locked)
> >  		return;
> >  	}
> >
> > -	if (unlikely(PageIsolated(page) ||
> > -		     test_bit(PAGE_CLAIMED, &page->private) ||
> > -		     test_bit(PAGE_STALE, &page->private))) {
> > +	if (test_bit(PAGE_STALE, &page->private) ||
> > +	    test_and_set_bit(PAGE_CLAIMED, &page->private)) {
> >  		z3fold_page_unlock(zhdr);
> >  		return;
> >  	}
> > @@ -858,13 +869,16 @@ static void do_compact_page(struct z3fold_header *zhdr, bool locked)
> >  	    zhdr->mapped_count == 0 && compact_single_buddy(zhdr)) {
> >  		if (kref_put(&zhdr->refcount, release_z3fold_page_locked))
> >  			atomic64_dec(&pool->pages_nr);
> > -		else
> > +		else {
> > +			clear_bit(PAGE_CLAIMED, &page->private);
> >  			z3fold_page_unlock(zhdr);
> > +		}
> >  		return;
> >  	}
> >
> >  	z3fold_compact_page(zhdr);
> >  	add_to_unbuddied(pool, zhdr);
> > +	clear_bit(PAGE_CLAIMED, &page->private);
> >  	z3fold_page_unlock(zhdr);
> >  }
> >
> > @@ -1109,17 +1123,8 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
> >  retry:
> >  	zhdr = __z3fold_alloc(pool, size, can_sleep);
> >  	if (zhdr) {
> > -		if (zhdr->first_chunks == 0) {
> > -			if (zhdr->middle_chunks != 0 &&
> > -			    chunks >= zhdr->start_middle)
> > -				bud = LAST;
> > -			else
> > -				bud = FIRST;
> > -		} else if (zhdr->last_chunks == 0)
> > -			bud = LAST;
> > -		else if (zhdr->middle_chunks == 0)
> > -			bud = MIDDLE;
> > -		else {
> > +		bud = get_free_buddy(zhdr, chunks);
> > +		if (bud == HEADLESS) {
> >  			if (kref_put(&zhdr->refcount,
> >  				     release_z3fold_page_locked))
> >  				atomic64_dec(&pool->pages_nr);
> > @@ -1265,7 +1270,6 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
> >  		pr_err("%s: unknown bud %d\n", __func__, bud);
> >  		WARN_ON(1);
> >  		put_z3fold_header(zhdr);
> > -		clear_bit(PAGE_CLAIMED, &page->private);
> >  		return;
> >  	}
> >
> > @@ -1280,8 +1284,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
> >  		z3fold_page_unlock(zhdr);
> >  		return;
> >  	}
> > -	if (unlikely(PageIsolated(page)) ||
> > -	    test_and_set_bit(NEEDS_COMPACTING, &page->private)) {
> > +	if (test_and_set_bit(NEEDS_COMPACTING, &page->private)) {
> >  		put_z3fold_header(zhdr);
> >  		clear_bit(PAGE_CLAIMED, &page->private);
> >  		return;
> > @@ -1345,6 +1348,10 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
> >  	struct page *page = NULL;
> >  	struct list_head *pos;
> >  	unsigned long first_handle = 0, middle_handle = 0, last_handle = 0;
> > +	struct z3fold_buddy_slots slots __attribute__((aligned(SLOTS_ALIGN)));
> > +
> > +	rwlock_init(&slots.lock);
> > +	slots.pool = (unsigned long)pool | (1 << HANDLES_NOFREE);
> >
> >  	spin_lock(&pool->lock);
> >  	if (!pool->ops || !pool->ops->evict || retries == 0) {
> > @@ -1359,35 +1366,36 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
> >  		list_for_each_prev(pos, &pool->lru) {
> >  			page = list_entry(pos, struct page, lru);
> >
> > -			/* this bit could have been set by free, in which case
> > -			 * we pass over to the next page in the pool.
> > -			 */
> > -			if (test_and_set_bit(PAGE_CLAIMED, &page->private)) {
> > -				page = NULL;
> > -				continue;
> > -			}
> > -
> > -			if (unlikely(PageIsolated(page))) {
> > -				clear_bit(PAGE_CLAIMED, &page->private);
> > -				page = NULL;
> > -				continue;
> > -			}
> >  			zhdr = page_address(page);
> >  			if (test_bit(PAGE_HEADLESS, &page->private))
> >  				break;
> >
> > +			if (kref_get_unless_zero(&zhdr->refcount) == 0) {
> > +				zhdr = NULL;
> > +				break;
> > +			}
> >  			if (!z3fold_page_trylock(zhdr)) {
> > -				clear_bit(PAGE_CLAIMED, &page->private);
> > +				if (kref_put(&zhdr->refcount,
> > +					     release_z3fold_page))
> > +					atomic64_dec(&pool->pages_nr);
> >  				zhdr = NULL;
> >  				continue; /* can't evict at this point */
> >  			}
> > -			if (zhdr->foreign_handles) {
> > -				clear_bit(PAGE_CLAIMED, &page->private);
> > -				z3fold_page_unlock(zhdr);
> > +
> > +			/* test_and_set_bit is of course atomic, but we still
> > +			 * need to do it under page lock, otherwise checking
> > +			 * that bit in __z3fold_alloc wouldn't make sense
> > +			 */
> > +			if (zhdr->foreign_handles ||
> > +			    test_and_set_bit(PAGE_CLAIMED, &page->private)) {
> > +				if (kref_put(&zhdr->refcount,
> > +					     release_z3fold_page))
> > +					atomic64_dec(&pool->pages_nr);
> > +				else
> > +					z3fold_page_unlock(zhdr);
> >  				zhdr = NULL;
> >  				continue; /* can't evict such page */
> >  			}
> > -			kref_get(&zhdr->refcount);
> >  			list_del_init(&zhdr->buddy);
> >  			zhdr->cpu = -1;
> >  			break;
> > @@ -1409,12 +1417,16 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
> >  			first_handle = 0;
> >  			last_handle = 0;
> >  			middle_handle = 0;
> > +			memset(slots.slot, 0, sizeof(slots.slot));
> >  			if (zhdr->first_chunks)
> > -				first_handle = encode_handle(zhdr, FIRST);
> > +				first_handle = __encode_handle(zhdr, &slots,
> > +							       FIRST);
> >  			if (zhdr->middle_chunks)
> > -				middle_handle = encode_handle(zhdr, MIDDLE);
> > +				middle_handle = __encode_handle(zhdr, &slots,
> > +								MIDDLE);
> >  			if (zhdr->last_chunks)
> > -				last_handle = encode_handle(zhdr, LAST);
> > +				last_handle = __encode_handle(zhdr, &slots,
> > +							      LAST);
> >  			/*
> >  			 * it's safe to unlock here because we hold a
> >  			 * reference to this page
> > @@ -1429,19 +1441,16 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
> >  			ret = pool->ops->evict(pool, middle_handle);
> >  			if (ret)
> >  				goto next;
> > -			free_handle(middle_handle);
> >  		}
> >  		if (first_handle) {
> >  			ret = pool->ops->evict(pool, first_handle);
> >  			if (ret)
> >  				goto next;
> > -			free_handle(first_handle);
> >  		}
> >  		if (last_handle) {
> >  			ret = pool->ops->evict(pool, last_handle);
> >  			if (ret)
> >  				goto next;
> > -			free_handle(last_handle);
> >  		}
> >  next:
> >  		if (test_bit(PAGE_HEADLESS, &page->private)) {
> > @@ -1455,9 +1464,11 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
> >  			spin_unlock(&pool->lock);
> >  			clear_bit(PAGE_CLAIMED, &page->private);
> >  		} else {
> > +			struct z3fold_buddy_slots *slots = zhdr->slots;
> >  			z3fold_page_lock(zhdr);
> >  			if (kref_put(&zhdr->refcount,
> >  				     release_z3fold_page_locked)) {
> > +				kmem_cache_free(pool->c_handle, slots);
> >  				atomic64_dec(&pool->pages_nr);
> >  				return 0;
> >  			}
> > @@ -1573,8 +1584,7 @@ static bool z3fold_page_isolate(struct page *page, isolate_mode_t mode)
> >  	VM_BUG_ON_PAGE(!PageMovable(page), page);
> >  	VM_BUG_ON_PAGE(PageIsolated(page), page);
> >
> > -	if (test_bit(PAGE_HEADLESS, &page->private) ||
> > -	    test_bit(PAGE_CLAIMED, &page->private))
> > +	if (test_bit(PAGE_HEADLESS, &page->private))
> >  		return false;
> >
> >  	zhdr = page_address(page);
> > @@ -1586,6 +1596,8 @@ static bool z3fold_page_isolate(struct page *page, isolate_mode_t mode)
> >  	if (zhdr->mapped_count != 0 || zhdr->foreign_handles != 0)
> >  		goto out;
> >
> > +	if (test_and_set_bit(PAGE_CLAIMED, &page->private))
> > +		goto out;
> >  	pool = zhdr_to_pool(zhdr);
> >  	spin_lock(&pool->lock);
> >  	if (!list_empty(&zhdr->buddy))
> > @@ -1612,16 +1624,17 @@ static int z3fold_page_migrate(struct address_space *mapping, struct page *newpa
> >
> >  	VM_BUG_ON_PAGE(!PageMovable(page), page);
> >  	VM_BUG_ON_PAGE(!PageIsolated(page), page);
> > +	VM_BUG_ON_PAGE(!test_bit(PAGE_CLAIMED, &page->private), page);
> >  	VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
> >
> >  	zhdr = page_address(page);
> >  	pool = zhdr_to_pool(zhdr);
> >
> > -	if (!z3fold_page_trylock(zhdr)) {
> > +	if (!z3fold_page_trylock(zhdr))
> >  		return -EAGAIN;
> > -	}
> >  	if (zhdr->mapped_count != 0 || zhdr->foreign_handles != 0) {
> >  		z3fold_page_unlock(zhdr);
> > +		clear_bit(PAGE_CLAIMED, &page->private);
> >  		return -EBUSY;
> >  	}
> >  	if (work_pending(&zhdr->work)) {
> > @@ -1663,6 +1676,7 @@ static int z3fold_page_migrate(struct address_space *mapping, struct page *newpa
> >  	queue_work_on(new_zhdr->cpu, pool->compact_wq, &new_zhdr->work);
> >
> >  	page_mapcount_reset(page);
> > +	clear_bit(PAGE_CLAIMED, &page->private);
> >  	put_page(page);
> >  	return 0;
> >  }
> > @@ -1686,6 +1700,7 @@ static void z3fold_page_putback(struct page *page)
> >  	spin_lock(&pool->lock);
> >  	list_add(&page->lru, &pool->lru);
> >  	spin_unlock(&pool->lock);
> > +	clear_bit(PAGE_CLAIMED, &page->private);
> >  	z3fold_page_unlock(zhdr);
> >  }
> >
>
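
[Editor's note] The quoted patch comments that "test_and_set_bit is of course atomic, but we still
need to do it under page lock" so that __z3fold_alloc can meaningfully check the bit while holding
the same lock. Below is a minimal, self-contained userspace sketch of that "claim bit taken under
the per-page lock" pattern. It is illustrative only, not z3fold or kernel code; every name in it
(struct page_s, claim_page(), PAGE_CLAIMED_BIT, ...) is made up for the example.

/*
 * Sketch: a "claimed" bit that is tested and set only while the per-page
 * lock is held. The bit flip itself is atomic, but taking it under the lock
 * lets other lock holders treat the bit as stable for the duration of their
 * own critical section. Hypothetical names throughout.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define PAGE_CLAIMED_BIT 0u

struct page_s {
	pthread_mutex_t lock;   /* stands in for the per-page lock */
	atomic_uint flags;      /* bit 0: "claimed" */
};

/* Returns true if we won the claim, false if the page was already claimed. */
static bool claim_page(struct page_s *p)
{
	pthread_mutex_lock(&p->lock);
	unsigned old = atomic_fetch_or(&p->flags, 1u << PAGE_CLAIMED_BIT);
	pthread_mutex_unlock(&p->lock);
	return !(old & (1u << PAGE_CLAIMED_BIT));
}

static void release_claim(struct page_s *p)
{
	atomic_fetch_and(&p->flags, ~(1u << PAGE_CLAIMED_BIT));
}

int main(void)
{
	struct page_s p = { .lock = PTHREAD_MUTEX_INITIALIZER, .flags = 0 };

	if (claim_page(&p))
		printf("first claim succeeded\n");
	if (!claim_page(&p))
		printf("second claim lost, as a concurrent claimer would\n");
	release_claim(&p);
	return 0;
}

Build with e.g. "cc -std=c11 -pthread claim.c". The point of the sketch is only the ordering:
the claim is decided inside the locked region, so anyone else holding the lock sees a settled value.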