On 2020-12-06 10:18, Mike Galbraith wrote:
On Thu, 2020-12-03 at 14:39 +0100, Sebastian Andrzej Siewior wrote:
On 2020-12-03 09:18:21 [+0100], Mike Galbraith wrote:
On Thu, 2020-12-03 at 03:16 +0100, Mike Galbraith wrote:
On Wed, 2020-12-02 at 23:08 +0100, Sebastian Andrzej Siewior wrote:
Looks like...

d8f117abb380 z3fold: fix use-after-free when freeing handles

...wasn't completely effective...

The top two hunks seem to have rendered the thing RT tolerant.

Yes, it appears to. I have no idea if this is a proper fix or not.
Without your write lock, after a few attempts, KASAN says:

| BUG: KASAN: use-after-free in __pv_queued_spin_lock_slowpath+0x293/0x770
| Write of size 2 at addr ffff88800e0e10aa by task kworker/u16:3/237

Things that make ya go hmmm...

I started poking at it thinking ok, given write_lock() fixes it, bad
juju must be happening under another read_lock(), so I poisoned the
structure about to be freed by __release_z3fold_page() under lock, and
opened a delay window for bad juju to materialize, but it didn't, so I
just poisoned instead, and sprinkled landmines all over the place.

My landmines are not triggering but the detonator is materializing
inside the structure containing the lock we explode on.  Somebody
blasts a recycled z3fold_buddy_slots into ram we were working on?

Could you please try the following patch in your setup:

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 18feaa0bc537..efe9a012643d 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -544,12 +544,17 @@ static void __release_z3fold_page(struct z3fold_header *zhdr, bool locked)
 			break;
 		}
 	}
-	if (!is_free)
+	if (!is_free) {
 		set_bit(HANDLES_ORPHANED, &zhdr->slots->pool);
-	read_unlock(&zhdr->slots->lock);
-
-	if (is_free)
+		read_unlock(&zhdr->slots->lock);
+	} else {
+		zhdr->slots->slot[0] =
+			zhdr->slots->slot[1] =
+			zhdr->slots->slot[2] =
+			zhdr->slots->slot[3] = 0xdeadbeef;
+		read_unlock(&zhdr->slots->lock);
 		kmem_cache_free(pool->c_handle, zhdr->slots);
+	}
 
 	if (locked)
 		z3fold_page_unlock(zhdr);
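
The debugging patch above writes 0xdeadbeef into all four slot entries just before the slots structure is handed back to the cache, so that anyone still holding a stale pointer reads an obvious marker rather than plausible data. As a rough userspace sketch of that poison-before-free idea (not the z3fold code itself): struct toy_slots, toy_pool, toy_alloc(), toy_free() and toy_is_poisoned() are all hypothetical names made up for illustration, and a static array stands in for the kmem cache so that looking at a "freed" object stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOT_COUNT  4
#define POOL_SIZE   8
#define SLOT_POISON 0xdeadbeefUL	/* same marker value as the patch */

/* Hypothetical stand-in for the slots structure; not the kernel layout. */
struct toy_slots {
	unsigned long slot[SLOT_COUNT];
	bool in_use;
};

/* Static backing store, so a stale pointer never leaves valid memory. */
struct toy_slots toy_pool[POOL_SIZE];

/* Hand out the first free object with its slots cleared, or NULL. */
struct toy_slots *toy_alloc(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!toy_pool[i].in_use) {
			for (size_t j = 0; j < SLOT_COUNT; j++)
				toy_pool[i].slot[j] = 0;
			toy_pool[i].in_use = true;
			return &toy_pool[i];
		}
	}
	return NULL;
}

/* Poison every slot first, then mark the object free for reuse. */
void toy_free(struct toy_slots *s)
{
	assert(s->in_use);	/* catches a double free in this toy model */
	for (size_t j = 0; j < SLOT_COUNT; j++)
		s->slot[j] = SLOT_POISON;
	s->in_use = false;
}

/* True if any slot still carries the poison marker. */
bool toy_is_poisoned(const struct toy_slots *s)
{
	for (size_t j = 0; j < SLOT_COUNT; j++)
		if (s->slot[j] == SLOT_POISON)
			return true;
	return false;
}
```

In this toy model, a holder of a stale pointer can call toy_is_poisoned() to notice that the object was freed, but only until the object is recycled: toy_alloc() clears the slots again, after which the stale pointer sees ordinary-looking data. The sketch does not model the locking or the race under discussion.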