All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <efault@gmx.de>
To: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Oleksandr Natalenko <oleksandr@natalenko.name>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-rt-users@vger.kernel.org
Subject: Re: scheduling while atomic in z3fold
Date: Wed, 09 Dec 2020 07:13:31 +0100	[thread overview]
Message-ID: <35863f8cbdb99b2a7eeac3bca13ae6962d6a98c0.camel@gmx.de> (raw)
In-Reply-To: <159ce32b-a7c4-08be-9283-9e38a7c27848@konsulko.com>

On Wed, 2020-12-09 at 00:26 +0100, Vitaly Wool wrote:
> Hi Mike,
>
> On 2020-12-07 16:41, Mike Galbraith wrote:
> > On Mon, 2020-12-07 at 16:21 +0100, Vitaly Wool wrote:
> >> On Mon, Dec 7, 2020 at 1:34 PM Mike Galbraith <efault@gmx.de> wrote:
> >>>
> >>
> >>> Unfortunately, that made zero difference.
> >>
> >> Okay, I suggest that you submit the patch that changes read_lock() to
> >> write_lock() in __release_z3fold_page() and I'll ack it then.
> >> I would like to rewrite the code so that write_lock is not necessary
> >> there but I don't want to hold you back and it isn't likely that I'll
> >> complete this today.
> >
> > Nah, I'm in no rush... especially not to sign off on "Because the
> > little voices in my head said this bit should look like that bit over
> > yonder, and testing _seems_ to indicate they're right about that" :)
> >
> > 	-Mike
> >
>
> okay, thanks. Would this make things better:

Yup, z3fold became RT tolerant with this (un-munged and) applied.

BTW, I don't think my write_lock() hacklet actually plugged the hole
that leads to the below.  I think it just reduced the odds of the two
meeting enough to make it look ~solid in fairly limited test time.

[ 3369.373023] kworker/-7413      4.....12 3368809247us : do_compact_page: zhdr: ffff95d93abd8000 zhdr->slots: ffff95d951f5df80 zhdr->slots->slot[0]: 0
[ 3369.373025] kworker/-7413      4.....12 3368809248us : do_compact_page: old_handle ffff95d951f5df98 *old_handle was ffff95d93abd804f now is ffff95da3ab8104c
[ 3369.373027] kworker/-7413      4.....11 3368809249us : __release_z3fold_page.constprop.25: freed ffff95d951f5df80
[ 3369.373028] ---------------------------------
[ 3369.373029] CR2: 0000000000000018
crash> p debug_handle | grep '\[2'
  [2]: ffff95dc1ecac0d0
crash> rd ffff95dc1ecac0d0
ffff95dc1ecac0d0:  ffff95d951f5df98                    ...Q....
crash> p debug_zhdr | grep '\[2'
  [2]: ffff95dc1ecac0c8
crash> rd ffff95dc1ecac0c8
ffff95dc1ecac0c8:  ffff95da3ab81000                    ...:....  <== kworker is not working on same zhdr as free_handle()..
crash> p debug_slots | grep '\[2'
  [2]: ffff95dc1ecac0c0
crash> rd ffff95dc1ecac0c0
ffff95dc1ecac0c0:  ffff95d951f5df80                    ...Q....  <== ..but is the same slots, and frees it under free_handle().
crash> bt -sx                                                          leading to use after free+corruption+explosion 1us later.
PID: 9334   TASK: ffff95d95a1eb3c0  CPU: 2   COMMAND: "swapoff"
 #0 [ffffb4248847f8f0] machine_kexec+0x16e at ffffffffa605f8ce
 #1 [ffffb4248847f938] __crash_kexec+0xd2 at ffffffffa614c162
 #2 [ffffb4248847f9f8] crash_kexec+0x30 at ffffffffa614d350
 #3 [ffffb4248847fa08] oops_end+0xca at ffffffffa602680a
 #4 [ffffb4248847fa28] no_context+0x14d at ffffffffa606d7cd
 #5 [ffffb4248847fa88] exc_page_fault+0x2b8 at ffffffffa68bdb88
 #6 [ffffb4248847fae0] asm_exc_page_fault+0x1e at ffffffffa6a00ace
 #7 [ffffb4248847fb68] mark_wakeup_next_waiter+0x51 at ffffffffa60ea121
 #8 [ffffb4248847fbd0] rt_mutex_futex_unlock+0x4f at ffffffffa68c93ef
 #9 [ffffb4248847fc10] z3fold_zpool_free+0x593 at ffffffffc0ecb663 [z3fold]
#10 [ffffb4248847fc78] zswap_free_entry+0x43 at ffffffffa627c823
#11 [ffffb4248847fc88] zswap_frontswap_invalidate_page+0x8a at ffffffffa627c92a
#12 [ffffb4248847fcb0] __frontswap_invalidate_page+0x48 at ffffffffa627c018
#13 [ffffb4248847fcd8] swapcache_free_entries+0x1ee at ffffffffa6276f5e
#14 [ffffb4248847fd20] free_swap_slot+0x9f at ffffffffa627b8ff
#15 [ffffb4248847fd40] delete_from_swap_cache+0x61 at ffffffffa6274621
#16 [ffffb4248847fd60] try_to_free_swap+0x70 at ffffffffa6277520
#17 [ffffb4248847fd70] unuse_vma+0x55c at ffffffffa627869c
#18 [ffffb4248847fe90] try_to_unuse+0x139 at ffffffffa6278e89
#19 [ffffb4248847fee8] __x64_sys_swapoff+0x1eb at ffffffffa62798cb
#20 [ffffb4248847ff40] do_syscall_64+0x33 at ffffffffa68b9ab3
#21 [ffffb4248847ff50] entry_SYSCALL_64_after_hwframe+0x44 at ffffffffa6a0007c
    RIP: 00007fbd835a5d17  RSP: 00007ffd60634458  RFLAGS: 00000202
    RAX: ffffffffffffffda  RBX: 0000559540e34b60  RCX: 00007fbd835a5d17
    RDX: 0000000000000001  RSI: 0000000000000001  RDI: 0000559540e34b60
    RBP: 0000000000000001   R8: 00007ffd606344c0   R9: 0000000000000003
    R10: 0000559540e34721  R11: 0000000000000202  R12: 0000000000000001
    R13: 0000000000000000  R14: 00007ffd606344c0  R15: 0000000000000000
    ORIG_RAX: 00000000000000a8  CS: 0033  SS: 002b
crash>

>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 18feaa0bc537..340c38a5ffac 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -303,10 +303,9 @@ static inline void put_z3fold_header(struct
> z3fold_header *zhdr)
>   		z3fold_page_unlock(zhdr);
>   }
>
> -static inline void free_handle(unsigned long handle)
> +static inline void free_handle(unsigned long handle, struct
> z3fold_header *zhdr)
>   {
>   	struct z3fold_buddy_slots *slots;
> -	struct z3fold_header *zhdr;
>   	int i;
>   	bool is_free;
>
> @@ -316,22 +315,13 @@ static inline void free_handle(unsigned long handle)
>   	if (WARN_ON(*(unsigned long *)handle == 0))
>   		return;
>
> -	zhdr = handle_to_z3fold_header(handle);
>   	slots = handle_to_slots(handle);
>   	write_lock(&slots->lock);
>   	*(unsigned long *)handle = 0;
> -	if (zhdr->slots == slots) {
> -		write_unlock(&slots->lock);
> -		return; /* simple case, nothing else to do */
> -	}
> +	if (zhdr->slots != slots)
> +		zhdr->foreign_handles--;
>
> -	/* we are freeing a foreign handle if we are here */
> -	zhdr->foreign_handles--;
>   	is_free = true;
> -	if (!test_bit(HANDLES_ORPHANED, &slots->pool)) {
> -		write_unlock(&slots->lock);
> -		return;
> -	}
>   	for (i = 0; i <= BUDDY_MASK; i++) {
>   		if (slots->slot[i]) {
>   			is_free = false;
> @@ -343,6 +333,8 @@ static inline void free_handle(unsigned long handle)
>   	if (is_free) {
>   		struct z3fold_pool *pool = slots_to_pool(slots);
>
> +		if (zhdr->slots == slots)
> +			zhdr->slots = NULL;
>   		kmem_cache_free(pool->c_handle, slots);
>   	}
>   }
> @@ -525,8 +517,6 @@ static void __release_z3fold_page(struct
> z3fold_header *zhdr, bool locked)
>   {
>   	struct page *page = virt_to_page(zhdr);
>   	struct z3fold_pool *pool = zhdr_to_pool(zhdr);
> -	bool is_free = true;
> -	int i;
>
>   	WARN_ON(!list_empty(&zhdr->buddy));
>   	set_bit(PAGE_STALE, &page->private);
> @@ -536,21 +526,6 @@ static void __release_z3fold_page(struct
> z3fold_header *zhdr, bool locked)
>   		list_del_init(&page->lru);
>   	spin_unlock(&pool->lock);
>
> -	/* If there are no foreign handles, free the handles array */
> -	read_lock(&zhdr->slots->lock);
> -	for (i = 0; i <= BUDDY_MASK; i++) {
> -		if (zhdr->slots->slot[i]) {
> -			is_free = false;
> -			break;
> -		}
> -	}
> -	if (!is_free)
> -		set_bit(HANDLES_ORPHANED, &zhdr->slots->pool);
> -	read_unlock(&zhdr->slots->lock);
> -
> -	if (is_free)
> -		kmem_cache_free(pool->c_handle, zhdr->slots);
> -
>   	if (locked)
>   		z3fold_page_unlock(zhdr);
>
> @@ -973,6 +948,9 @@ static inline struct z3fold_header
> *__z3fold_alloc(struct z3fold_pool *pool,
>   		}
>   	}
>
> +	if (zhdr && !zhdr->slots)
> +		zhdr->slots = alloc_slots(pool,
> +					can_sleep ? GFP_NOIO : GFP_ATOMIC);
>   	return zhdr;
>   }
>
> @@ -1270,7 +1248,7 @@ static void z3fold_free(struct z3fold_pool *pool,
> unsigned long handle)
>   	}
>
>   	if (!page_claimed)
> -		free_handle(handle);
> +		free_handle(handle, zhdr);
>   	if (kref_put(&zhdr->refcount, release_z3fold_page_locked_list)) {
>   		atomic64_dec(&pool->pages_nr);
>   		return;
> @@ -1429,19 +1407,19 @@ static int z3fold_reclaim_page(struct
> z3fold_pool *pool, unsigned int retries)
>   			ret = pool->ops->evict(pool, middle_handle);
>   			if (ret)
>   				goto next;
> -			free_handle(middle_handle);
> +			free_handle(middle_handle, zhdr);
>   		}
>   		if (first_handle) {
>   			ret = pool->ops->evict(pool, first_handle);
>   			if (ret)
>   				goto next;
> -			free_handle(first_handle);
> +			free_handle(first_handle, zhdr);
>   		}
>   		if (last_handle) {
>   			ret = pool->ops->evict(pool, last_handle);
>   			if (ret)
>   				goto next;
> -			free_handle(last_handle);
> +			free_handle(last_handle, zhdr);
>   		}
>   next:
>   		if (test_bit(PAGE_HEADLESS, &page->private)) {
>
> --
>
> Best regards,
>     Vitaly


  reply	other threads:[~2020-12-09  6:15 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-28 14:05 scheduling while atomic in z3fold Oleksandr Natalenko
2020-11-28 14:09 ` Oleksandr Natalenko
2020-11-28 14:27   ` Oleksandr Natalenko
2020-11-29  6:41     ` Mike Galbraith
2020-11-29  6:41       ` Mike Galbraith
2020-11-29  7:48       ` Mike Galbraith
2020-11-29  9:21         ` Mike Galbraith
2020-11-29  9:21           ` Mike Galbraith
2020-11-29 10:56           ` Mike Galbraith
2020-11-29 10:56             ` Mike Galbraith
2020-11-29 11:29             ` Oleksandr Natalenko
2020-11-29 11:41               ` Mike Galbraith
2020-11-29 11:41                 ` Mike Galbraith
2020-11-30 13:20                 ` Sebastian Andrzej Siewior
2020-11-30 13:53                   ` Oleksandr Natalenko
2020-11-30 14:28                     ` Sebastian Andrzej Siewior
2020-11-30 14:42                   ` Mike Galbraith
2020-11-30 14:42                     ` Mike Galbraith
2020-11-30 14:52                     ` Sebastian Andrzej Siewior
2020-11-30 15:01                       ` Mike Galbraith
2020-11-30 15:01                         ` Mike Galbraith
2020-11-30 15:03                         ` Mike Galbraith
2020-11-30 15:03                           ` Mike Galbraith
2020-11-30 16:03                         ` Sebastian Andrzej Siewior
2020-11-30 16:27                           ` Mike Galbraith
2020-11-30 16:27                             ` Mike Galbraith
2020-11-30 16:32                             ` Sebastian Andrzej Siewior
2020-11-30 16:36                               ` Mike Galbraith
2020-11-30 16:36                                 ` Mike Galbraith
2020-11-30 19:09                               ` Mike Galbraith
2020-11-30 19:09                                 ` Mike Galbraith
2020-11-30 16:53                             ` Mike Galbraith
2020-11-30 16:53                               ` Mike Galbraith
2020-12-02  2:30                           ` Mike Galbraith
2020-12-02  2:30                             ` Mike Galbraith
2020-12-02 22:08                             ` Sebastian Andrzej Siewior
2020-12-03  2:16                               ` Mike Galbraith
2020-12-03  2:16                                 ` Mike Galbraith
2020-12-03  8:18                                 ` Mike Galbraith
2020-12-03  8:18                                   ` Mike Galbraith
2020-12-03  9:40                                   ` zswap explosion when using zsmalloc pool compression Mike Galbraith
2020-12-03  9:48                                     ` Sebastian Andrzej Siewior
2020-12-03 11:03                                       ` Mike Galbraith
2020-12-03 12:51                                         ` Mike Galbraith
2020-12-03 13:07                                           ` Mike Galbraith
2020-12-03 13:43                                             ` Mike Galbraith
2020-12-03 14:04                                               ` Mike Galbraith
2020-12-04  5:22                                                 ` [patch] mm,zswap: fix zswap::zswap_comp.lock vs zsmalloc::zs_map_area.lock deadlock Mike Galbraith
2020-12-03 13:39                                   ` scheduling while atomic in z3fold Sebastian Andrzej Siewior
2020-12-03 14:07                                     ` Vitaly Wool
2020-12-03 14:07                                       ` Vitaly Wool
2020-12-06  9:18                                     ` Mike Galbraith
2020-12-06  9:18                                       ` Mike Galbraith
2020-12-07  1:05                                       ` Vitaly Wool
2020-12-07  2:18                                         ` Mike Galbraith
2020-12-07  2:18                                           ` Mike Galbraith
2020-12-07 11:52                                           ` Vitaly Wool
2020-12-07 11:52                                             ` Vitaly Wool
2020-12-07 12:34                                             ` Mike Galbraith
2020-12-07 12:34                                               ` Mike Galbraith
2020-12-07 15:21                                               ` Vitaly Wool
2020-12-07 15:21                                                 ` Vitaly Wool
2020-12-07 15:41                                                 ` Sebastian Andrzej Siewior
2020-12-07 15:41                                                 ` Mike Galbraith
2020-12-07 15:41                                                   ` Mike Galbraith
2020-12-08 23:26                                                   ` Vitaly Wool
2020-12-09  6:13                                                     ` Mike Galbraith [this message]
2020-12-09  6:13                                                       ` Mike Galbraith
2020-12-09  6:31                                                       ` Mike Galbraith
2020-12-09  6:31                                                         ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35863f8cbdb99b2a7eeac3bca13ae6962d6a98c0.camel@gmx.de \
    --to=efault@gmx.de \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=oleksandr@natalenko.name \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=vitaly.wool@konsulko.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.