Subject: Re: [PATCH] z3fold: stricter locking and more careful reclaim
From: Mike Galbraith
To: Vitaly Wool, linux-mm@kvack.org
Cc: Sebastian Andrzej Siewior, akpm@linux-foundation.org
Date: Fri, 04 Dec 2020 22:47:57 +0100
Message-ID: <3b8e33c3ba44e7ec4c090d68972015ef52ba1e7e.camel@gmx.de>
In-Reply-To: <20201204200408.35009-1-vitaly.wool@konsulko.com>
References: <20201204200408.35009-1-vitaly.wool@konsulko.com>

On Fri, 2020-12-04 at 22:04 +0200, Vitaly Wool wrote:
> Use temporary slots in reclaim function to avoid possible race when
> freeing those.
>
> While at it, make sure we check CLAIMED flag under page lock in the
> reclaim function to make sure we are not racing with z3fold_alloc().
>
> Signed-off-by: Vitaly Wool

FYI, with this applied, ltp::mm still crashes the RT kernel, with the
same rwlock corruption being what takes it down.

        -Mike

> ---
>  mm/z3fold.c | 133 +++++++++++++++++++++++++++++-----------------------
>  1 file changed, 74 insertions(+), 59 deletions(-)
>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 18feaa0bc537..47f05228abdf 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -186,6 +186,7 @@ enum z3fold_page_flags {
>   */
>  enum z3fold_handle_flags {
>          HANDLES_ORPHANED = 0,
> +        HANDLES_NOFREE = 1,
>  };
>
>  /*
> @@ -320,7 +321,7 @@ static inline void free_handle(unsigned long handle)
>          slots = handle_to_slots(handle);
>          write_lock(&slots->lock);
>          *(unsigned long *)handle = 0;
> -        if (zhdr->slots == slots) {
> +        if (zhdr->slots == slots || test_bit(HANDLES_NOFREE, &slots->pool)) {
>                  write_unlock(&slots->lock);
>                  return; /* simple case, nothing else to do */
>          }
> @@ -653,6 +654,28 @@ static inline void add_to_unbuddied(struct z3fold_pool *pool,
>          }
>  }
>
> +static inline enum buddy get_free_buddy(struct z3fold_header *zhdr, int chunks)
> +{
> +        enum buddy bud = HEADLESS;
> +
> +        if (zhdr->middle_chunks) {
> +                if (!zhdr->first_chunks &&
> +                    chunks <= zhdr->start_middle - ZHDR_CHUNKS)
> +                        bud = FIRST;
> +                else if (!zhdr->last_chunks)
> +                        bud = LAST;
> +        } else {
> +                if (!zhdr->first_chunks)
> +                        bud = FIRST;
> +                else if (!zhdr->last_chunks)
> +                        bud = LAST;
> +                else
> +                        bud = MIDDLE;
> +        }
> +
> +        return bud;
> +}
> +
>  static inline void *mchunk_memmove(struct z3fold_header *zhdr,
>                                     unsigned short dst_chunk)
>  {
> @@ -714,18 +737,7 @@ static struct z3fold_header *compact_single_buddy(struct z3fold_header *zhdr)
>                  if (WARN_ON(new_zhdr == zhdr))
>                          goto out_fail;
>
> -                if (new_zhdr->first_chunks == 0) {
> -                        if (new_zhdr->middle_chunks != 0 &&
> -                                        chunks >= new_zhdr->start_middle) {
> -                                new_bud = LAST;
> -                        } else {
> -                                new_bud = FIRST;
> -                        }
> -                } else if (new_zhdr->last_chunks == 0) {
> -                        new_bud = LAST;
> -                } else if (new_zhdr->middle_chunks == 0) {
> -                        new_bud = MIDDLE;
> -                }
> +                new_bud = get_free_buddy(new_zhdr, chunks);
>                  q = new_zhdr;
>                  switch (new_bud) {
>                  case FIRST:
> @@ -847,9 +859,8 @@ static void do_compact_page(struct z3fold_header *zhdr, bool locked)
>                  return;
>          }
>
> -        if (unlikely(PageIsolated(page) ||
> -                     test_bit(PAGE_CLAIMED, &page->private) ||
> -                     test_bit(PAGE_STALE, &page->private))) {
> +        if (test_bit(PAGE_STALE, &page->private) ||
> +            test_and_set_bit(PAGE_CLAIMED, &page->private)) {
>                  z3fold_page_unlock(zhdr);
>                  return;
>          }
> @@ -858,13 +869,16 @@ static void do_compact_page(struct z3fold_header *zhdr, bool locked)
>              zhdr->mapped_count == 0 && compact_single_buddy(zhdr)) {
>                  if (kref_put(&zhdr->refcount, release_z3fold_page_locked))
>                          atomic64_dec(&pool->pages_nr);
> -                else
> +                else {
> +                        clear_bit(PAGE_CLAIMED, &page->private);
>                          z3fold_page_unlock(zhdr);
> +                }
>                  return;
>          }
>
>          z3fold_compact_page(zhdr);
>          add_to_unbuddied(pool, zhdr);
> +        clear_bit(PAGE_CLAIMED, &page->private);
>          z3fold_page_unlock(zhdr);
>  }
>
> @@ -1109,17 +1123,8 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
>  retry:
>          zhdr = __z3fold_alloc(pool, size, can_sleep);
>          if (zhdr) {
> -                if (zhdr->first_chunks == 0) {
> -                        if (zhdr->middle_chunks != 0 &&
> -                            chunks >= zhdr->start_middle)
> -                                bud = LAST;
> -                        else
> -                                bud = FIRST;
> -                } else if (zhdr->last_chunks == 0)
> -                        bud = LAST;
> -                else if (zhdr->middle_chunks == 0)
> -                        bud = MIDDLE;
> -                else {
> +                bud = get_free_buddy(zhdr, chunks);
> +                if (bud == HEADLESS) {
>                          if (kref_put(&zhdr->refcount,
>                                       release_z3fold_page_locked))
>                                  atomic64_dec(&pool->pages_nr);
> @@ -1265,7 +1270,6 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>                  pr_err("%s: unknown bud %d\n", __func__, bud);
>                  WARN_ON(1);
>                  put_z3fold_header(zhdr);
> -                clear_bit(PAGE_CLAIMED, &page->private);
>                  return;
>          }
>
> @@ -1280,8 +1284,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>                  z3fold_page_unlock(zhdr);
>                  return;
>          }
> -        if (unlikely(PageIsolated(page)) ||
> -            test_and_set_bit(NEEDS_COMPACTING, &page->private)) {
> +        if (test_and_set_bit(NEEDS_COMPACTING, &page->private)) {
>                  put_z3fold_header(zhdr);
>                  clear_bit(PAGE_CLAIMED, &page->private);
>                  return;
> @@ -1345,6 +1348,10 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
>          struct page *page = NULL;
>          struct list_head *pos;
>          unsigned long first_handle = 0, middle_handle = 0, last_handle = 0;
> +        struct z3fold_buddy_slots slots __attribute__((aligned(SLOTS_ALIGN)));
> +
> +        rwlock_init(&slots.lock);
> +        slots.pool = (unsigned long)pool | (1 << HANDLES_NOFREE);
>
>          spin_lock(&pool->lock);
>          if (!pool->ops || !pool->ops->evict || retries == 0) {
> @@ -1359,35 +1366,36 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
>                  list_for_each_prev(pos, &pool->lru) {
>                          page = list_entry(pos, struct page, lru);
>
> -                        /* this bit could have been set by free, in which case
> -                         * we pass over to the next page in the pool.
> -                         */
> -                        if (test_and_set_bit(PAGE_CLAIMED, &page->private)) {
> -                                page = NULL;
> -                                continue;
> -                        }
> -
> -                        if (unlikely(PageIsolated(page))) {
> -                                clear_bit(PAGE_CLAIMED, &page->private);
> -                                page = NULL;
> -                                continue;
> -                        }
>                          zhdr = page_address(page);
>                          if (test_bit(PAGE_HEADLESS, &page->private))
>                                  break;
>
> +                        if (kref_get_unless_zero(&zhdr->refcount) == 0) {
> +                                zhdr = NULL;
> +                                break;
> +                        }
>                          if (!z3fold_page_trylock(zhdr)) {
> -                                clear_bit(PAGE_CLAIMED, &page->private);
> +                                if (kref_put(&zhdr->refcount,
> +                                                release_z3fold_page))
> +                                        atomic64_dec(&pool->pages_nr);
>                                  zhdr = NULL;
>                                  continue; /* can't evict at this point */
>                          }
> -                        if (zhdr->foreign_handles) {
> -                                clear_bit(PAGE_CLAIMED, &page->private);
> -                                z3fold_page_unlock(zhdr);
> +
> +                        /* test_and_set_bit is of course atomic, but we still
> +                         * need to do it under page lock, otherwise checking
> +                         * that bit in __z3fold_alloc wouldn't make sense
> +                         */
> +                        if (zhdr->foreign_handles ||
> +                            test_and_set_bit(PAGE_CLAIMED, &page->private)) {
> +                                if (kref_put(&zhdr->refcount,
> +                                                release_z3fold_page))
> +                                        atomic64_dec(&pool->pages_nr);
> +                                else
> +                                        z3fold_page_unlock(zhdr);
>                                  zhdr = NULL;
>                                  continue; /* can't evict such page */
>                          }
> -                        kref_get(&zhdr->refcount);
>                          list_del_init(&zhdr->buddy);
>                          zhdr->cpu = -1;
>                          break;
> @@ -1409,12 +1417,16 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
>                          first_handle = 0;
>                          last_handle = 0;
>                          middle_handle = 0;
> +                        memset(slots.slot, 0, sizeof(slots.slot));
>                          if (zhdr->first_chunks)
> -                                first_handle = encode_handle(zhdr, FIRST);
> +                                first_handle = __encode_handle(zhdr, &slots,
> +                                                                FIRST);
>                          if (zhdr->middle_chunks)
> -                                middle_handle = encode_handle(zhdr, MIDDLE);
> +                                middle_handle = __encode_handle(zhdr, &slots,
> +                                                                MIDDLE);
>                          if (zhdr->last_chunks)
> -                                last_handle = encode_handle(zhdr, LAST);
> +                                last_handle = __encode_handle(zhdr, &slots,
> +                                                                LAST);
>                          /*
>                           * it's safe to unlock here because we hold a
>                           * reference to this page
> @@ -1429,19 +1441,16 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
>                          ret = pool->ops->evict(pool, middle_handle);
>                          if (ret)
>                                  goto next;
> -                        free_handle(middle_handle);
>                  }
>                  if (first_handle) {
>                          ret = pool->ops->evict(pool, first_handle);
>                          if (ret)
>                                  goto next;
> -                        free_handle(first_handle);
>                  }
>                  if (last_handle) {
>                          ret = pool->ops->evict(pool, last_handle);
>                          if (ret)
>                                  goto next;
> -                        free_handle(last_handle);
>                  }
>  next:
>                  if (test_bit(PAGE_HEADLESS, &page->private)) {
> @@ -1455,9 +1464,11 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
>                          spin_unlock(&pool->lock);
>                          clear_bit(PAGE_CLAIMED, &page->private);
>                  } else {
> +                        struct z3fold_buddy_slots *slots = zhdr->slots;
>                          z3fold_page_lock(zhdr);
>                          if (kref_put(&zhdr->refcount,
>                                          release_z3fold_page_locked)) {
> +                                kmem_cache_free(pool->c_handle, slots);
>                                  atomic64_dec(&pool->pages_nr);
>                                  return 0;
>                          }
> @@ -1573,8 +1584,7 @@ static bool z3fold_page_isolate(struct page *page, isolate_mode_t mode)
>          VM_BUG_ON_PAGE(!PageMovable(page), page);
>          VM_BUG_ON_PAGE(PageIsolated(page), page);
>
> -        if (test_bit(PAGE_HEADLESS, &page->private) ||
> -            test_bit(PAGE_CLAIMED, &page->private))
> +        if (test_bit(PAGE_HEADLESS, &page->private))
>                  return false;
>
>          zhdr = page_address(page);
> @@ -1586,6 +1596,8 @@ static bool z3fold_page_isolate(struct page *page, isolate_mode_t mode)
>          if (zhdr->mapped_count != 0 || zhdr->foreign_handles != 0)
>                  goto out;
>
> +        if (test_and_set_bit(PAGE_CLAIMED, &page->private))
> +                goto out;
>          pool = zhdr_to_pool(zhdr);
>          spin_lock(&pool->lock);
>          if (!list_empty(&zhdr->buddy))
> @@ -1612,16 +1624,17 @@ static int z3fold_page_migrate(struct address_space *mapping, struct page *newpa
>
>          VM_BUG_ON_PAGE(!PageMovable(page), page);
>          VM_BUG_ON_PAGE(!PageIsolated(page), page);
> +        VM_BUG_ON_PAGE(!test_bit(PAGE_CLAIMED, &page->private), page);
>          VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
>
>          zhdr = page_address(page);
>          pool = zhdr_to_pool(zhdr);
>
> -        if (!z3fold_page_trylock(zhdr)) {
> +        if (!z3fold_page_trylock(zhdr))
>                  return -EAGAIN;
> -        }
>          if (zhdr->mapped_count != 0 || zhdr->foreign_handles != 0) {
>                  z3fold_page_unlock(zhdr);
> +                clear_bit(PAGE_CLAIMED, &page->private);
>                  return -EBUSY;
>          }
>          if (work_pending(&zhdr->work)) {
> @@ -1663,6 +1676,7 @@ static int z3fold_page_migrate(struct address_space *mapping, struct page *newpa
>          queue_work_on(new_zhdr->cpu, pool->compact_wq, &new_zhdr->work);
>
>          page_mapcount_reset(page);
> +        clear_bit(PAGE_CLAIMED, &page->private);
>          put_page(page);
>          return 0;
>  }
> @@ -1686,6 +1700,7 @@ static void z3fold_page_putback(struct page *page)
>          spin_lock(&pool->lock);
>          list_add(&page->lru, &pool->lru);
>          spin_unlock(&pool->lock);
> +        clear_bit(PAGE_CLAIMED, &page->private);
>          z3fold_page_unlock(zhdr);
>  }
>
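As an aside for anyone following the thread, the line in the patch that is
easiest to misread is

        slots.pool = (unsigned long)pool | (1 << HANDLES_NOFREE);

Because struct z3fold_buddy_slots is allocated (and here stack-declared)
with at least SLOTS_ALIGN alignment, the low bits of the pool pointer kept
in ->pool are guaranteed to be zero, so HANDLES_NOFREE can ride in one of
them and free_handle() can test_bit() it to skip freeing the temporary
on-stack slots that z3fold_reclaim_page() now hands to __encode_handle().
Below is a standalone userspace sketch of that tagged-pointer idiom; the
names (struct pool, slots_init, NOFREE_BIT) are invented for illustration
and are not the z3fold ones:

#include <assert.h>
#include <stdio.h>

#define NOFREE_BIT      1                       /* plays the role of HANDLES_NOFREE */
#define FLAG_MASK       ((1UL << 3) - 1)        /* low bits freed up by alignment */

/* 8-byte alignment guarantees the three low address bits are zero */
struct pool {
        const char *name;
} __attribute__((aligned(8)));

struct slots {
        unsigned long pool;     /* pool pointer | flag bits */
};

static void slots_init(struct slots *s, struct pool *p, int nofree)
{
        /* store the pointer with the flag folded into a low bit */
        s->pool = (unsigned long)p | ((unsigned long)nofree << NOFREE_BIT);
}

static struct pool *slots_pool(const struct slots *s)
{
        /* mask the flag bits back off to recover the real pointer */
        return (struct pool *)(s->pool & ~FLAG_MASK);
}

static int slots_nofree(const struct slots *s)
{
        return (int)((s->pool >> NOFREE_BIT) & 1);
}

int main(void)
{
        struct pool p = { .name = "demo" };
        struct slots on_stack;  /* like the stack slots in z3fold_reclaim_page() */

        slots_init(&on_stack, &p, 1);
        assert(slots_pool(&on_stack) == &p);    /* pointer survives the tagging */
        assert(slots_nofree(&on_stack) == 1);   /* ...and the flag is readable */
        printf("pool=%s, nofree=%d\n",
               slots_pool(&on_stack)->name, slots_nofree(&on_stack));
        return 0;
}

The patch's test_bit(HANDLES_NOFREE, &slots->pool) in free_handle() is the
kernel-side equivalent of slots_nofree() here; tagging the existing pool
word rather than adding a flags field keeps the slots structure the same
size, which is what makes the aligned on-stack copy cheap.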