linux-kselftest.vger.kernel.org archive mirror
* [PATCH v4 1/3] kunit: make test->lock irq safe
@ 2021-04-13 10:07 glittao
  2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: glittao @ 2021-04-13 10:07 UTC
  To: brendanhiggins, cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka
  Cc: linux-kernel, linux-kselftest, kunit-dev, linux-mm, elver,
	dlatypov, Oliver Glitta

From: Vlastimil Babka <vbabka@suse.cz>

The upcoming SLUB kunit test will be calling kunit_find_named_resource() from
a context with disabled interrupts. That means kunit's test->lock needs to be
IRQ safe to avoid potential deadlocks and lockdep splats.

This patch therefore changes the test->lock usage to spin_lock_irqsave()
and spin_unlock_irqrestore().

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Oliver Glitta <glittao@gmail.com>
---
 include/kunit/test.h |  5 +++--
 lib/kunit/test.c     | 18 +++++++++++-------
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 49601c4b98b8..524d4789af22 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -515,8 +515,9 @@ kunit_find_resource(struct kunit *test,
 		    void *match_data)
 {
 	struct kunit_resource *res, *found = NULL;
+	unsigned long flags;
 
-	spin_lock(&test->lock);
+	spin_lock_irqsave(&test->lock, flags);
 
 	list_for_each_entry_reverse(res, &test->resources, node) {
 		if (match(test, res, (void *)match_data)) {
@@ -526,7 +527,7 @@ kunit_find_resource(struct kunit *test,
 		}
 	}
 
-	spin_unlock(&test->lock);
+	spin_unlock_irqrestore(&test->lock, flags);
 
 	return found;
 }
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index ec9494e914ef..2c62eeb45b82 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -442,6 +442,7 @@ int kunit_add_resource(struct kunit *test,
 		       void *data)
 {
 	int ret = 0;
+	unsigned long flags;
 
 	res->free = free;
 	kref_init(&res->refcount);
@@ -454,10 +455,10 @@ int kunit_add_resource(struct kunit *test,
 		res->data = data;
 	}
 
-	spin_lock(&test->lock);
+	spin_lock_irqsave(&test->lock, flags);
 	list_add_tail(&res->node, &test->resources);
 	/* refcount for list is established by kref_init() */
-	spin_unlock(&test->lock);
+	spin_unlock_irqrestore(&test->lock, flags);
 
 	return ret;
 }
@@ -515,9 +516,11 @@ EXPORT_SYMBOL_GPL(kunit_alloc_and_get_resource);
 
 void kunit_remove_resource(struct kunit *test, struct kunit_resource *res)
 {
-	spin_lock(&test->lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&test->lock, flags);
 	list_del(&res->node);
-	spin_unlock(&test->lock);
+	spin_unlock_irqrestore(&test->lock, flags);
 	kunit_put_resource(res);
 }
 EXPORT_SYMBOL_GPL(kunit_remove_resource);
@@ -597,6 +600,7 @@ EXPORT_SYMBOL_GPL(kunit_kfree);
 void kunit_cleanup(struct kunit *test)
 {
 	struct kunit_resource *res;
+	unsigned long flags;
 
 	/*
 	 * test->resources is a stack - each allocation must be freed in the
@@ -608,9 +612,9 @@ void kunit_cleanup(struct kunit *test)
 	 * protect against the current node being deleted, not the next.
 	 */
 	while (true) {
-		spin_lock(&test->lock);
+		spin_lock_irqsave(&test->lock, flags);
 		if (list_empty(&test->resources)) {
-			spin_unlock(&test->lock);
+			spin_unlock_irqrestore(&test->lock, flags);
 			break;
 		}
 		res = list_last_entry(&test->resources,
@@ -621,7 +625,7 @@ void kunit_cleanup(struct kunit *test)
 		 * resource, and this can't happen if the test->lock
 		 * is held.
 		 */
-		spin_unlock(&test->lock);
+		spin_unlock_irqrestore(&test->lock, flags);
 		kunit_remove_resource(test, res);
 	}
 #if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT))
-- 
2.31.1.272.g89b43f80a5
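
The choice of the _irqsave/_irqrestore pair, rather than a variant that
unconditionally re-enables interrupts on unlock, can be illustrated with a
toy userspace model. This is only a sketch under stated assumptions: a single
boolean stands in for the CPU's interrupt-enable state, and every toy_* name
is hypothetical and not part of the kernel or KUnit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag stands in for the CPU's interrupt-enable state. */
static bool toy_irqs_enabled = true;

struct toy_lock {
	bool held;
};

/* Models spin_lock_irqsave(): save the current IRQ state, then disable. */
static void toy_lock_irqsave(struct toy_lock *lock, bool *flags)
{
	*flags = toy_irqs_enabled;
	toy_irqs_enabled = false;
	assert(!lock->held);	/* a real spinlock would spin (deadlock) here */
	lock->held = true;
}

/* Models spin_unlock_irqrestore(): restore whatever state was saved. */
static void toy_unlock_irqrestore(struct toy_lock *lock, bool flags)
{
	lock->held = false;
	toy_irqs_enabled = flags;
}

/* Hypothetical stand-in for a lookup that takes the lock internally. */
static void toy_find_resource(struct toy_lock *lock)
{
	bool flags;

	toy_lock_irqsave(lock, &flags);
	/* ... walk the resource list here ... */
	toy_unlock_irqrestore(lock, flags);
}

/*
 * Models a caller that already runs with interrupts disabled.
 * Returns the IRQ state observed right after the lookup returns.
 */
static bool toy_lookup_from_irqs_off_context(struct toy_lock *lock)
{
	bool outer_flags, state_after_lookup;

	outer_flags = toy_irqs_enabled;
	toy_irqs_enabled = false;	/* caller disables interrupts */

	toy_find_resource(lock);
	state_after_lookup = toy_irqs_enabled;

	toy_irqs_enabled = outer_flags;	/* caller restores its own state */
	return state_after_lookup;
}
```

In this model, toy_lookup_from_irqs_off_context() still sees interrupts
disabled right after the lookup returns, because the unlock restores the
saved state instead of forcing interrupts back on. That is the property a
caller holding another lock taken with spin_lock_irqsave() relies on.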



* [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-13 10:07 [PATCH v4 1/3] kunit: make test->lock irq safe glittao
@ 2021-04-13 10:07 ` glittao
  2021-04-13 13:54   ` Marco Elver
                     ` (2 more replies)
  2021-04-13 10:07 ` [PATCH v4 3/3] slub: remove resiliency_test() function glittao
  2021-04-13 13:38 ` [PATCH v4 1/3] kunit: make test->lock irq safe Brendan Higgins
  2 siblings, 3 replies; 11+ messages in thread
From: glittao @ 2021-04-13 10:07 UTC
  To: brendanhiggins, cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka
  Cc: linux-kernel, linux-kselftest, kunit-dev, linux-mm, elver,
	dlatypov, Oliver Glitta

From: Oliver Glitta <glittao@gmail.com>

SLUB has a resiliency_test() function that is hidden behind #ifdef
SLUB_RESILIENCY_TEST, which is not part of Kconfig, so nobody
runs it. A KUnit test is a proper replacement for it.

Corrupt a redzone byte after allocation and, after free, corrupt the
pointer to the next free object, the first byte, the 50th byte and a
redzone byte. Check that validation finds the errors.

There are several differences from the original resiliency test:

Tests create their own caches with a known state instead of corrupting
the shared kmalloc caches.

The corruption of the freepointer uses the correct offset; the original
resiliency test was broken by the freepointer changes.

Drop the random-byte corruption test, because it is not meaningful
in this form, where we need deterministic results.

Add a new option CONFIG_SLUB_KUNIT_TEST to Kconfig.
Because the test deliberately modifies non-allocated objects, it depends on
!KASAN, which would otherwise have prevented that.

Use kunit_resource to count errors in the cache and silence bug reports.
Count an error whenever slab_bug() or slab_fix() is called or when
the count of pages is wrong.

Signed-off-by: Oliver Glitta <glittao@gmail.com>
---
Changes since v3

Use kunit_resource to silence bug reports and count errors, as suggested by
Marco Elver.
Make the test depend on !KASAN, thanks to a report from the kernel test robot.

Changes since v2

Use bit operation & instead of logical && as reported by kernel test
robot and Dan Carpenter

Changes since v1

Conversion from kselftest to KUnit test suggested by Marco Elver.
Error silencing.
Error counting improvements.
 lib/Kconfig.debug |  12 ++++
 lib/Makefile      |   1 +
 lib/slub_kunit.c  | 150 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/slab.h         |   1 +
 mm/slub.c         |  50 ++++++++++++++--
 5 files changed, 209 insertions(+), 5 deletions(-)
 create mode 100644 lib/slub_kunit.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2779c29d9981..9b8a0d754278 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2371,6 +2371,18 @@ config BITS_TEST
 
 	  If unsure, say N.
 
+config SLUB_KUNIT_TEST
+	tristate "KUnit test for SLUB cache error detection" if !KUNIT_ALL_TESTS
+	depends on SLUB_DEBUG && KUNIT && !KASAN
+	default KUNIT_ALL_TESTS
+	help
+	  This builds SLUB allocator unit test.
+	  Tests SLUB cache debugging functionality.
+	  For more information on KUnit and unit tests in general please refer
+	  to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+	  If unsure, say N.
+
 config TEST_UDELAY
 	tristate "udelay test driver"
 	help
diff --git a/lib/Makefile b/lib/Makefile
index b5307d3eec1a..1e59c6714ed8 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -352,5 +352,6 @@ obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
 obj-$(CONFIG_LINEAR_RANGES_TEST) += test_linear_ranges.o
 obj-$(CONFIG_BITS_TEST) += test_bits.o
 obj-$(CONFIG_CMDLINE_KUNIT_TEST) += cmdline_kunit.o
+obj-$(CONFIG_SLUB_KUNIT_TEST) += slub_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
new file mode 100644
index 000000000000..cb9ae9f7e8a6
--- /dev/null
+++ b/lib/slub_kunit.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <kunit/test.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include "../mm/slab.h"
+
+static struct kunit_resource resource;
+static int slab_errors;
+
+static void test_clobber_zone(struct kunit *test)
+{
+	struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_alloc", 64, 0,
+				SLAB_RED_ZONE, NULL);
+	u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
+
+	p[64] = 0x12;
+
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 2, slab_errors);
+
+	kmem_cache_free(s, p);
+	kmem_cache_destroy(s);
+}
+
+static void test_next_pointer(struct kunit *test)
+{
+	struct kmem_cache *s = kmem_cache_create("TestSlub_next_ptr_free", 64, 0,
+				SLAB_POISON, NULL);
+	u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
+	unsigned long tmp;
+	unsigned long *ptr_addr;
+
+	kmem_cache_free(s, p);
+
+	ptr_addr = (unsigned long *)(p + s->offset);
+	tmp = *ptr_addr;
+	p[s->offset] = 0x12;
+
+	/*
+	 * Expecting three errors.
+	 * One for the corrupted freechain, one for the wrong count of
+	 * objects in use, and one for fixing the broken cache.
+	 */
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 3, slab_errors);
+
+	/*
+	 * Repair the corrupted freepointer.
+	 * Still expecting two errors.
+	 * One for the wrong count of objects in use,
+	 * and one for fixing the broken cache.
+	 */
+	*ptr_addr = tmp;
+	slab_errors = 0;
+
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 2, slab_errors);
+
+	/*
+	 * Previous validation repaired the count of objects in use.
+	 * Now expecting no error.
+	 */
+	slab_errors = 0;
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 0, slab_errors);
+
+	kmem_cache_destroy(s);
+}
+
+static void test_first_word(struct kunit *test)
+{
+	struct kmem_cache *s = kmem_cache_create("TestSlub_1th_word_free", 64, 0,
+				SLAB_POISON, NULL);
+	u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
+
+	kmem_cache_free(s, p);
+	*p = 0x78;
+
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 2, slab_errors);
+
+	kmem_cache_destroy(s);
+}
+
+static void test_clobber_50th_byte(struct kunit *test)
+{
+	struct kmem_cache *s = kmem_cache_create("TestSlub_50th_word_free", 64, 0,
+				SLAB_POISON, NULL);
+	u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
+
+	kmem_cache_free(s, p);
+	p[50] = 0x9a;
+
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 2, slab_errors);
+	kmem_cache_destroy(s);
+}
+
+static void test_clobber_redzone_free(struct kunit *test)
+{
+	struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_free", 64, 0,
+				SLAB_RED_ZONE, NULL);
+	u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
+
+	kmem_cache_free(s, p);
+	p[64] = 0xab;
+
+	validate_slab_cache(s);
+	KUNIT_EXPECT_EQ(test, 2, slab_errors);
+	kmem_cache_destroy(s);
+}
+
+static int test_init(struct kunit *test)
+{
+	slab_errors = 0;
+
+	/* FIXME: remove when CONFIG_KASAN requirement is dropped. */
+	current->kunit_test = test;
+
+	kunit_add_named_resource(test, NULL, NULL, &resource,
+					"slab_errors", &slab_errors);
+	return 0;
+}
+
+static void test_exit(struct kunit *test)
+{
+	/* FIXME: remove when CONFIG_KASAN requirement is dropped. */
+	current->kunit_test = NULL;
+}
+
+static struct kunit_case test_cases[] = {
+	KUNIT_CASE(test_clobber_zone),
+	KUNIT_CASE(test_next_pointer),
+	KUNIT_CASE(test_first_word),
+	KUNIT_CASE(test_clobber_50th_byte),
+	KUNIT_CASE(test_clobber_redzone_free),
+	{}
+};
+
+static struct kunit_suite test_suite = {
+	.name = "slub_test",
+	.init = test_init,
+	.exit = test_exit,
+	.test_cases = test_cases,
+};
+kunit_test_suite(test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/mm/slab.h b/mm/slab.h
index 076582f58f68..95cf42eb8396 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -215,6 +215,7 @@ DECLARE_STATIC_KEY_TRUE(slub_debug_enabled);
 DECLARE_STATIC_KEY_FALSE(slub_debug_enabled);
 #endif
 extern void print_tracking(struct kmem_cache *s, void *object);
+long validate_slab_cache(struct kmem_cache *s);
 #else
 static inline void print_tracking(struct kmem_cache *s, void *object)
 {
diff --git a/mm/slub.c b/mm/slub.c
index 3021ce9bf1b3..d7df8841d90a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -35,6 +35,7 @@
 #include <linux/prefetch.h>
 #include <linux/memcontrol.h>
 #include <linux/random.h>
+#include <kunit/test.h>
 
 #include <trace/events/kmem.h>
 
@@ -447,6 +448,26 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
 static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
 static DEFINE_SPINLOCK(object_map_lock);
 
+#if IS_ENABLED(CONFIG_KUNIT)
+static bool slab_add_kunit_errors(void)
+{
+	struct kunit_resource *resource;
+
+	if (likely(!current->kunit_test))
+		return false;
+
+	resource = kunit_find_named_resource(current->kunit_test, "slab_errors");
+	if (!resource)
+		return false;
+
+	(*(int *)resource->data)++;
+	kunit_put_resource(resource);
+	return true;
+}
+#else
+static inline bool slab_add_kunit_errors(void) { return false; }
+#endif
+
 /*
  * Determine a map of object in use on a page.
  *
@@ -676,6 +697,9 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...)
 	struct va_format vaf;
 	va_list args;
 
+	if (slab_add_kunit_errors())
+		return;
+
 	va_start(args, fmt);
 	vaf.fmt = fmt;
 	vaf.va = &args;
@@ -739,6 +763,9 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
 void object_err(struct kmem_cache *s, struct page *page,
 			u8 *object, char *reason)
 {
+	if (slab_add_kunit_errors())
+		return;
+
 	slab_bug(s, "%s", reason);
 	print_trailer(s, page, object);
 }
@@ -749,6 +776,9 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct page *page,
 	va_list args;
 	char buf[100];
 
+	if (slab_add_kunit_errors())
+		return;
+
 	va_start(args, fmt);
 	vsnprintf(buf, sizeof(buf), fmt, args);
 	va_end(args);
@@ -798,12 +828,16 @@ static int check_bytes_and_report(struct kmem_cache *s, struct page *page,
 	while (end > fault && end[-1] == value)
 		end--;
 
+	if (slab_add_kunit_errors())
+		goto skip_bug_print;
+
 	slab_bug(s, "%s overwritten", what);
 	pr_err("INFO: 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
-					fault, end - 1, fault - addr,
-					fault[0], value);
+				fault, end - 1, fault - addr,
+				fault[0], value);
 	print_trailer(s, page, object);
 
+skip_bug_print:
 	restore_bytes(s, what, value, fault, end);
 	return 0;
 }
@@ -4650,9 +4684,11 @@ static int validate_slab_node(struct kmem_cache *s,
 		validate_slab(s, page);
 		count++;
 	}
-	if (count != n->nr_partial)
+	if (count != n->nr_partial) {
 		pr_err("SLUB %s: %ld partial slabs counted but counter=%ld\n",
 		       s->name, count, n->nr_partial);
+		slab_add_kunit_errors();
+	}
 
 	if (!(s->flags & SLAB_STORE_USER))
 		goto out;
@@ -4661,16 +4697,18 @@ static int validate_slab_node(struct kmem_cache *s,
 		validate_slab(s, page);
 		count++;
 	}
-	if (count != atomic_long_read(&n->nr_slabs))
+	if (count != atomic_long_read(&n->nr_slabs)) {
 		pr_err("SLUB: %s %ld slabs counted but counter=%ld\n",
 		       s->name, count, atomic_long_read(&n->nr_slabs));
+		slab_add_kunit_errors();
+	}
 
 out:
 	spin_unlock_irqrestore(&n->list_lock, flags);
 	return count;
 }
 
-static long validate_slab_cache(struct kmem_cache *s)
+long validate_slab_cache(struct kmem_cache *s)
 {
 	int node;
 	unsigned long count = 0;
@@ -4682,6 +4720,8 @@ static long validate_slab_cache(struct kmem_cache *s)
 
 	return count;
 }
+EXPORT_SYMBOL(validate_slab_cache);
+
 /*
  * Generate lists of code addresses where slabcache objects are allocated
  * and freed.
-- 
2.31.1.272.g89b43f80a5
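
The count-and-silence idea used by slab_add_kunit_errors() can be sketched
in plain userspace C. This is a minimal model under stated assumptions, not
the SLUB implementation: the object layout, the guard value and all toy_*
names are hypothetical, and a plain pointer stands in for the "slab_errors"
named resource. It assumes, as the diff above suggests, that one error is
counted when corruption is detected and a second when the bytes are restored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_OBJ_SIZE	8
#define TOY_REDZONE	0xbb	/* arbitrary guard value for this sketch */

struct toy_object {
	unsigned char payload[TOY_OBJ_SIZE];
	unsigned char redzone;	/* one guard byte right after the payload */
};

/* Stand-in for the named resource: NULL when no test is running. */
static int *toy_error_counter;

/* Count an error and report "handled" only if a test registered a counter. */
static int toy_add_test_error(void)
{
	if (!toy_error_counter)
		return 0;
	(*toy_error_counter)++;
	return 1;
}

/* Silent and counted under test, printed otherwise. */
static void toy_report(const char *what)
{
	if (toy_add_test_error())
		return;
	fprintf(stderr, "BUG toy cache: %s\n", what);
}

static void toy_init_object(struct toy_object *obj)
{
	memset(obj->payload, 0, sizeof(obj->payload));
	obj->redzone = TOY_REDZONE;
}

/* Returns 1 if the guard byte is intact; otherwise reports and repairs it. */
static int toy_validate(struct toy_object *obj)
{
	assert(obj != NULL);
	if (obj->redzone == TOY_REDZONE)
		return 1;

	toy_report("Redzone overwritten");	/* detection */
	obj->redzone = TOY_REDZONE;
	toy_report("Restoring Redzone");	/* repair */
	return 0;
}
```

With a counter registered, clobbering obj.redzone and calling toy_validate()
once bumps the counter twice (detection plus repair) and prints nothing; a
second call finds nothing to report. That corresponds, in spirit, to the
count of 2 that the redzone cases expect, although the real counts depend on
SLUB internals that are not modelled here.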



* [PATCH v4 3/3] slub: remove resiliency_test() function
  2021-04-13 10:07 [PATCH v4 1/3] kunit: make test->lock irq safe glittao
  2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
@ 2021-04-13 10:07 ` glittao
  2021-04-13 13:38 ` [PATCH v4 1/3] kunit: make test->lock irq safe Brendan Higgins
  2 siblings, 0 replies; 11+ messages in thread
From: glittao @ 2021-04-13 10:07 UTC
  To: brendanhiggins, cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka
  Cc: linux-kernel, linux-kselftest, kunit-dev, linux-mm, elver,
	dlatypov, Oliver Glitta

From: Oliver Glitta <glittao@gmail.com>

The resiliency_test() function is hidden behind #ifdef
SLUB_RESILIENCY_TEST, which is not part of Kconfig, so nobody
runs it.

It is replaced by the KUnit test for SLUB added by the previous
patch, "mm/slub, kunit: add a KUnit test for SLUB debugging
functionality".

Signed-off-by: Oliver Glitta <glittao@gmail.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
---
 mm/slub.c | 64 -------------------------------------------------------
 1 file changed, 64 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d7df8841d90a..c65e2c471a13 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -154,9 +154,6 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
  * - Variable sizing of the per node arrays
  */
 
-/* Enable to test recovery from slab corruption on boot */
-#undef SLUB_RESILIENCY_TEST
-
 /* Enable to log cmpxchg failures */
 #undef SLUB_DEBUG_CMPXCHG
 
@@ -4939,66 +4936,6 @@ static int list_locations(struct kmem_cache *s, char *buf,
 }
 #endif	/* CONFIG_SLUB_DEBUG */
 
-#ifdef SLUB_RESILIENCY_TEST
-static void __init resiliency_test(void)
-{
-	u8 *p;
-	int type = KMALLOC_NORMAL;
-
-	BUILD_BUG_ON(KMALLOC_MIN_SIZE > 16 || KMALLOC_SHIFT_HIGH < 10);
-
-	pr_err("SLUB resiliency testing\n");
-	pr_err("-----------------------\n");
-	pr_err("A. Corruption after allocation\n");
-
-	p = kzalloc(16, GFP_KERNEL);
-	p[16] = 0x12;
-	pr_err("\n1. kmalloc-16: Clobber Redzone/next pointer 0x12->0x%p\n\n",
-	       p + 16);
-
-	validate_slab_cache(kmalloc_caches[type][4]);
-
-	/* Hmmm... The next two are dangerous */
-	p = kzalloc(32, GFP_KERNEL);
-	p[32 + sizeof(void *)] = 0x34;
-	pr_err("\n2. kmalloc-32: Clobber next pointer/next slab 0x34 -> -0x%p\n",
-	       p);
-	pr_err("If allocated object is overwritten then not detectable\n\n");
-
-	validate_slab_cache(kmalloc_caches[type][5]);
-	p = kzalloc(64, GFP_KERNEL);
-	p += 64 + (get_cycles() & 0xff) * sizeof(void *);
-	*p = 0x56;
-	pr_err("\n3. kmalloc-64: corrupting random byte 0x56->0x%p\n",
-	       p);
-	pr_err("If allocated object is overwritten then not detectable\n\n");
-	validate_slab_cache(kmalloc_caches[type][6]);
-
-	pr_err("\nB. Corruption after free\n");
-	p = kzalloc(128, GFP_KERNEL);
-	kfree(p);
-	*p = 0x78;
-	pr_err("1. kmalloc-128: Clobber first word 0x78->0x%p\n\n", p);
-	validate_slab_cache(kmalloc_caches[type][7]);
-
-	p = kzalloc(256, GFP_KERNEL);
-	kfree(p);
-	p[50] = 0x9a;
-	pr_err("\n2. kmalloc-256: Clobber 50th byte 0x9a->0x%p\n\n", p);
-	validate_slab_cache(kmalloc_caches[type][8]);
-
-	p = kzalloc(512, GFP_KERNEL);
-	kfree(p);
-	p[512] = 0xab;
-	pr_err("\n3. kmalloc-512: Clobber redzone 0xab->0x%p\n\n", p);
-	validate_slab_cache(kmalloc_caches[type][9]);
-}
-#else
-#ifdef CONFIG_SYSFS
-static void resiliency_test(void) {};
-#endif
-#endif	/* SLUB_RESILIENCY_TEST */
-
 #ifdef CONFIG_SYSFS
 enum slab_stat_type {
 	SL_ALL,			/* All slabs */
@@ -5847,7 +5784,6 @@ static int __init slab_sysfs_init(void)
 	}
 
 	mutex_unlock(&slab_mutex);
-	resiliency_test();
 	return 0;
 }
 
-- 
2.31.1.272.g89b43f80a5



* Re: [PATCH v4 1/3] kunit: make test->lock irq safe
  2021-04-13 10:07 [PATCH v4 1/3] kunit: make test->lock irq safe glittao
  2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
  2021-04-13 10:07 ` [PATCH v4 3/3] slub: remove resiliency_test() function glittao
@ 2021-04-13 13:38 ` Brendan Higgins
  2 siblings, 0 replies; 11+ messages in thread
From: Brendan Higgins @ 2021-04-13 13:38 UTC
  To: glittao
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Vlastimil Babka, Linux Kernel Mailing List,
	open list:KERNEL SELFTEST FRAMEWORK, KUnit Development,
	Linux Memory Management List, Marco Elver, Daniel Latypov

On Tue, Apr 13, 2021 at 3:07 AM <glittao@gmail.com> wrote:
>
> From: Vlastimil Babka <vbabka@suse.cz>
>
> The upcoming SLUB kunit test will be calling kunit_find_named_resource() from
> a context with disabled interrupts. That means kunit's test->lock needs to be
> IRQ safe to avoid potential deadlocks and lockdep splats.
>
> This patch therefore changes the test->lock usage to spin_lock_irqsave()
> and spin_unlock_irqrestore().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Oliver Glitta <glittao@gmail.com>

Reviewed-by: Brendan Higgins <brendanhiggins@google.com>

Thanks!


* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
@ 2021-04-13 13:54   ` Marco Elver
  2021-04-15 10:10     ` Oliver Glitta
  2021-04-13 21:33   ` Daniel Latypov
  2021-04-15 10:30   ` Vlastimil Babka
  2 siblings, 1 reply; 11+ messages in thread
From: Marco Elver @ 2021-04-13 13:54 UTC
  To: glittao
  Cc: Brendan Higgins, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka, LKML,
	open list:KERNEL SELFTEST FRAMEWORK, KUnit Development,
	Linux Memory Management List, Daniel Latypov

On Tue, 13 Apr 2021 at 12:07, <glittao@gmail.com> wrote:
> From: Oliver Glitta <glittao@gmail.com>
>
> SLUB has a resiliency_test() function that is hidden behind #ifdef
> SLUB_RESILIENCY_TEST, which is not part of Kconfig, so nobody
> runs it. A KUnit test is a proper replacement for it.
>
> Corrupt a redzone byte after allocation and, after free, corrupt the
> pointer to the next free object, the first byte, the 50th byte and a
> redzone byte. Check that validation finds the errors.
>
> There are several differences from the original resiliency test:
>
> Tests create their own caches with a known state instead of corrupting
> the shared kmalloc caches.
>
> The corruption of the freepointer uses the correct offset; the original
> resiliency test was broken by the freepointer changes.
>
> Drop the random-byte corruption test, because it is not meaningful
> in this form, where we need deterministic results.
>
> Add a new option CONFIG_SLUB_KUNIT_TEST to Kconfig.
> Because the test deliberately modifies non-allocated objects, it depends on
> !KASAN, which would otherwise have prevented that.

Hmm, did the test fail with KASAN? Is it possible to skip the tests
and still run a subset of tests with KASAN? It'd be nice if we could
run some of these tests with KASAN as well.

> Use kunit_resource to count errors in the cache and silence bug reports.
> Count an error whenever slab_bug() or slab_fix() is called or when
> the count of pages is wrong.
>
> Signed-off-by: Oliver Glitta <glittao@gmail.com>

Reviewed-by: Marco Elver <elver@google.com>

Thanks, this all looks good to me. But perhaps do test what works with
KASAN, to see if you need the !KASAN constraint for all cases.

> ---
> Changes since v3
>
> Use kunit_resource to silence bug reports and count errors, as suggested by
> Marco Elver.
> Make the test depend on !KASAN, thanks to a report from the kernel test robot.
>
> Changes since v2
>
> Use bit operation & instead of logical && as reported by kernel test
> robot and Dan Carpenter
>
> Changes since v1
>
> Conversion from kselftest to KUnit test suggested by Marco Elver.
> Error silencing.
> Error counting improvements.
>  lib/Kconfig.debug |  12 ++++
>  lib/Makefile      |   1 +
>  lib/slub_kunit.c  | 150 ++++++++++++++++++++++++++++++++++++++++++++++
>  mm/slab.h         |   1 +
>  mm/slub.c         |  50 ++++++++++++++--
>  5 files changed, 209 insertions(+), 5 deletions(-)
>  create mode 100644 lib/slub_kunit.c
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 2779c29d9981..9b8a0d754278 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2371,6 +2371,18 @@ config BITS_TEST
>
>           If unsure, say N.
>
> +config SLUB_KUNIT_TEST
> +       tristate "KUnit test for SLUB cache error detection" if !KUNIT_ALL_TESTS
> +       depends on SLUB_DEBUG && KUNIT && !KASAN
> +       default KUNIT_ALL_TESTS
> +       help
> +         This builds SLUB allocator unit test.
> +         Tests SLUB cache debugging functionality.
> +         For more information on KUnit and unit tests in general please refer
> +         to the KUnit documentation in Documentation/dev-tools/kunit/.
> +
> +         If unsure, say N.
> +
>  config TEST_UDELAY
>         tristate "udelay test driver"
>         help
> diff --git a/lib/Makefile b/lib/Makefile
> index b5307d3eec1a..1e59c6714ed8 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -352,5 +352,6 @@ obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
>  obj-$(CONFIG_LINEAR_RANGES_TEST) += test_linear_ranges.o
>  obj-$(CONFIG_BITS_TEST) += test_bits.o
>  obj-$(CONFIG_CMDLINE_KUNIT_TEST) += cmdline_kunit.o
> +obj-$(CONFIG_SLUB_KUNIT_TEST) += slub_kunit.o
>
>  obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
> diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
> new file mode 100644
> index 000000000000..cb9ae9f7e8a6
> --- /dev/null
> +++ b/lib/slub_kunit.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <kunit/test.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include "../mm/slab.h"
> +
> +static struct kunit_resource resource;
> +static int slab_errors;
> +
> +static void test_clobber_zone(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_alloc", 64, 0,
> +                               SLAB_RED_ZONE, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       p[64] = 0x12;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +
> +       kmem_cache_free(s, p);
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_next_pointer(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_next_ptr_free", 64, 0,
> +                               SLAB_POISON, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +       unsigned long tmp;
> +       unsigned long *ptr_addr;
> +
> +       kmem_cache_free(s, p);
> +
> +       ptr_addr = (unsigned long *)(p + s->offset);
> +       tmp = *ptr_addr;
> +       p[s->offset] = 0x12;
> +
> +       /*
> +        * Expecting three errors.
> +        * One for the corrupted freechain, one for the wrong count of
> +        * objects in use, and one for fixing the broken cache.
> +        */
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 3, slab_errors);
> +
> +       /*
> +        * Repair the corrupted freepointer.
> +        * Still expecting two errors.
> +        * One for the wrong count of objects in use,
> +        * and one for fixing the broken cache.
> +        */
> +       *ptr_addr = tmp;
> +       slab_errors = 0;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +
> +       /*
> +        * Previous validation repaired the count of objects in use.
> +        * Now expecting no error.
> +        */
> +       slab_errors = 0;
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 0, slab_errors);
> +
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_first_word(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_1th_word_free", 64, 0,
> +                               SLAB_POISON, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       kmem_cache_free(s, p);
> +       *p = 0x78;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_clobber_50th_byte(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_50th_word_free", 64, 0,
> +                               SLAB_POISON, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       kmem_cache_free(s, p);
> +       p[50] = 0x9a;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_clobber_redzone_free(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_free", 64, 0,
> +                               SLAB_RED_ZONE, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       kmem_cache_free(s, p);
> +       p[64] = 0xab;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +       kmem_cache_destroy(s);
> +}
> +
> +static int test_init(struct kunit *test)
> +{
> +       slab_errors = 0;
> +
> +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> +       current->kunit_test = test;

Note, the patch "kunit: support failure from dynamic analysis tools"
is already in -next. It's probably safe to leave this, and send a
follow-up patch later once that kunit patch is in mainline.

> +       kunit_add_named_resource(test, NULL, NULL, &resource,
> +                                       "slab_errors", &slab_errors);
> +       return 0;
> +}
> +
> +static void test_exit(struct kunit *test)
> +{
> +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> +       current->kunit_test = NULL;
> +}
> +
> +static struct kunit_case test_cases[] = {
> +       KUNIT_CASE(test_clobber_zone),
> +       KUNIT_CASE(test_next_pointer),
> +       KUNIT_CASE(test_first_word),
> +       KUNIT_CASE(test_clobber_50th_byte),
> +       KUNIT_CASE(test_clobber_redzone_free),
> +       {}
> +};
> +
> +static struct kunit_suite test_suite = {
> +       .name = "slub_test",
> +       .init = test_init,
> +       .exit = test_exit,
> +       .test_cases = test_cases,
> +};
> +kunit_test_suite(test_suite);
> +
> +MODULE_LICENSE("GPL");
> diff --git a/mm/slab.h b/mm/slab.h
> index 076582f58f68..95cf42eb8396 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -215,6 +215,7 @@ DECLARE_STATIC_KEY_TRUE(slub_debug_enabled);
>  DECLARE_STATIC_KEY_FALSE(slub_debug_enabled);
>  #endif
>  extern void print_tracking(struct kmem_cache *s, void *object);
> +long validate_slab_cache(struct kmem_cache *s);
>  #else
>  static inline void print_tracking(struct kmem_cache *s, void *object)
>  {
> diff --git a/mm/slub.c b/mm/slub.c
> index 3021ce9bf1b3..d7df8841d90a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -35,6 +35,7 @@
>  #include <linux/prefetch.h>
>  #include <linux/memcontrol.h>
>  #include <linux/random.h>
> +#include <kunit/test.h>
>
>  #include <trace/events/kmem.h>
>
> @@ -447,6 +448,26 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
>  static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
>  static DEFINE_SPINLOCK(object_map_lock);
>
> +#if IS_ENABLED(CONFIG_KUNIT)
> +static bool slab_add_kunit_errors(void)
> +{
> +       struct kunit_resource *resource;
> +
> +       if (likely(!current->kunit_test))
> +               return false;
> +
> +       resource = kunit_find_named_resource(current->kunit_test, "slab_errors");
> +       if (!resource)
> +               return false;
> +
> +       (*(int *)resource->data)++;
> +       kunit_put_resource(resource);
> +       return true;
> +}
> +#else
> +static inline bool slab_add_kunit_errors(void) { return false; }
> +#endif
> +
>  /*
>   * Determine a map of object in use on a page.
>   *
> @@ -676,6 +697,9 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...)
>         struct va_format vaf;
>         va_list args;
>
> +       if (slab_add_kunit_errors())
> +               return;
> +
>         va_start(args, fmt);
>         vaf.fmt = fmt;
>         vaf.va = &args;
> @@ -739,6 +763,9 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
>  void object_err(struct kmem_cache *s, struct page *page,
>                         u8 *object, char *reason)
>  {
> +       if (slab_add_kunit_errors())
> +               return;
> +
>         slab_bug(s, "%s", reason);
>         print_trailer(s, page, object);
>  }
> @@ -749,6 +776,9 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct page *page,
>         va_list args;
>         char buf[100];
>
> +       if (slab_add_kunit_errors())
> +               return;
> +
>         va_start(args, fmt);
>         vsnprintf(buf, sizeof(buf), fmt, args);
>         va_end(args);
> @@ -798,12 +828,16 @@ static int check_bytes_and_report(struct kmem_cache *s, struct page *page,
>         while (end > fault && end[-1] == value)
>                 end--;
>
> +       if (slab_add_kunit_errors())
> +               goto skip_bug_print;
> +
>         slab_bug(s, "%s overwritten", what);
>         pr_err("INFO: 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> -                                       fault, end - 1, fault - addr,
> -                                       fault[0], value);
> +                               fault, end - 1, fault - addr,
> +                               fault[0], value);
>         print_trailer(s, page, object);
>
> +skip_bug_print:
>         restore_bytes(s, what, value, fault, end);
>         return 0;
>  }
> @@ -4650,9 +4684,11 @@ static int validate_slab_node(struct kmem_cache *s,
>                 validate_slab(s, page);
>                 count++;
>         }
> -       if (count != n->nr_partial)
> +       if (count != n->nr_partial) {
>                 pr_err("SLUB %s: %ld partial slabs counted but counter=%ld\n",
>                        s->name, count, n->nr_partial);
> +               slab_add_kunit_errors();
> +       }
>
>         if (!(s->flags & SLAB_STORE_USER))
>                 goto out;
> @@ -4661,16 +4697,18 @@ static int validate_slab_node(struct kmem_cache *s,
>                 validate_slab(s, page);
>                 count++;
>         }
> -       if (count != atomic_long_read(&n->nr_slabs))
> +       if (count != atomic_long_read(&n->nr_slabs)) {
>                 pr_err("SLUB: %s %ld slabs counted but counter=%ld\n",
>                        s->name, count, atomic_long_read(&n->nr_slabs));
> +               slab_add_kunit_errors();
> +       }
>
>  out:
>         spin_unlock_irqrestore(&n->list_lock, flags);
>         return count;
>  }
>
> -static long validate_slab_cache(struct kmem_cache *s)
> +long validate_slab_cache(struct kmem_cache *s)
>  {
>         int node;
>         unsigned long count = 0;
> @@ -4682,6 +4720,8 @@ static long validate_slab_cache(struct kmem_cache *s)
>
>         return count;
>  }
> +EXPORT_SYMBOL(validate_slab_cache);
> +
>  /*
>   * Generate lists of code addresses where slabcache objects are allocated
>   * and freed.
> --
> 2.31.1.272.g89b43f80a5
>


* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
  2021-04-13 13:54   ` Marco Elver
@ 2021-04-13 21:33   ` Daniel Latypov
  2021-04-15 10:11     ` Oliver Glitta
  2021-04-15 10:30   ` Vlastimil Babka
  2 siblings, 1 reply; 11+ messages in thread
From: Daniel Latypov @ 2021-04-13 21:33 UTC (permalink / raw)
  To: glittao
  Cc: Brendan Higgins, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka,
	Linux Kernel Mailing List, open list:KERNEL SELFTEST FRAMEWORK,
	KUnit Development, Linux Memory Management List, Marco Elver

On Tue, Apr 13, 2021 at 3:07 AM <glittao@gmail.com> wrote:
>
> From: Oliver Glitta <glittao@gmail.com>
>
> SLUB has resiliency_test() function which is hidden behind #ifdef
> SLUB_RESILIENCY_TEST that is not part of Kconfig, so nobody
> runs it. KUnit should be a proper replacement for it.
>
> Try changing byte in redzone after allocation and changing
> pointer to next free node, first byte, 50th byte and redzone
> byte. Check if validation finds errors.
>
> There are several differences from the original resiliency test:
> Tests create own caches with known state instead of corrupting
> shared kmalloc caches.
>
> The corruption of freepointer uses correct offset, the original
> resiliency test got broken with freepointer changes.
>
> Scratch changing random byte test, because it does not have
> meaning in this form where we need deterministic results.
>
> Add new option CONFIG_SLUB_KUNIT_TEST in Kconfig.
> Because the test deliberatly modifies non-allocated objects, it depends on

nit: *deliberately

> !KASAN which would have otherwise prevented that.
>
> Use kunit_resource to count errors in cache and silence bug reports.
> Count error whenever slab_bug() or slab_fix() is called or when
> the count of pages is wrong.
>
> Signed-off-by: Oliver Glitta <glittao@gmail.com>

Acked-by: Daniel Latypov <dlatypov@google.com>

Looks good to me!
My one minor suggestion: perhaps let's log a summary of the error or
the func name in slab_add_kunit_errors().

> ---
> Changes since v3
>
> Use kunit_resource to silence bug reports and count errors suggested by
> Marco Elver.
> Make the test depends on !KASAN thanks to report from the kernel test robot.
>
> Changes since v2
>
> Use bit operation & instead of logical && as reported by kernel test
> robot and Dan Carpenter
>
> Changes since v1
>
> Conversion from kselftest to KUnit test suggested by Marco Elver.
> Error silencing.
> Error counting improvements.
>  lib/Kconfig.debug |  12 ++++
>  lib/Makefile      |   1 +
>  lib/slub_kunit.c  | 150 ++++++++++++++++++++++++++++++++++++++++++++++
>  mm/slab.h         |   1 +
>  mm/slub.c         |  50 ++++++++++++++--
>  5 files changed, 209 insertions(+), 5 deletions(-)
>  create mode 100644 lib/slub_kunit.c
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 2779c29d9981..9b8a0d754278 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2371,6 +2371,18 @@ config BITS_TEST
>
>           If unsure, say N.
>
> +config SLUB_KUNIT_TEST
> +       tristate "KUnit test for SLUB cache error detection" if !KUNIT_ALL_TESTS
> +       depends on SLUB_DEBUG && KUNIT && !KASAN
> +       default KUNIT_ALL_TESTS
> +       help
> +         This builds SLUB allocator unit test.
> +         Tests SLUB cache debugging functionality.
> +         For more information on KUnit and unit tests in general please refer
> +         to the KUnit documentation in Documentation/dev-tools/kunit/.
> +
> +         If unsure, say N.
> +
>  config TEST_UDELAY
>         tristate "udelay test driver"
>         help
> diff --git a/lib/Makefile b/lib/Makefile
> index b5307d3eec1a..1e59c6714ed8 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -352,5 +352,6 @@ obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
>  obj-$(CONFIG_LINEAR_RANGES_TEST) += test_linear_ranges.o
>  obj-$(CONFIG_BITS_TEST) += test_bits.o
>  obj-$(CONFIG_CMDLINE_KUNIT_TEST) += cmdline_kunit.o
> +obj-$(CONFIG_SLUB_KUNIT_TEST) += slub_kunit.o
>
>  obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
> diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
> new file mode 100644
> index 000000000000..cb9ae9f7e8a6
> --- /dev/null
> +++ b/lib/slub_kunit.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <kunit/test.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include "../mm/slab.h"
> +
> +static struct kunit_resource resource;
> +static int slab_errors;
> +
> +static void test_clobber_zone(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_alloc", 64, 0,
> +                               SLAB_RED_ZONE, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       p[64] = 0x12;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +
> +       kmem_cache_free(s, p);
> +       kmem_cache_destroy(s);

Might not be worth doing for now:
I see kmem_cache_destroy() has an `if (err) { pr_err(...); dump_stack(); }` call.
Does it make sense to cause that to fail the test?

I see it's defined in mm/slab_common.c, so we might not want to touch
that in this patch.
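
For illustration only, a rough userspace mock (not kernel code) of what
"failing the test" on that path could look like: the destroy error branch
bumps the same counter the test already inspects. struct mock_cache,
mock_cache_destroy() and the plain slab_errors global are hypothetical
stand-ins, not anything that exists in mm/slab_common.c.

```c
#include <stdio.h>

/* Hypothetical stand-in for the test's "slab_errors" counter. */
static int slab_errors;

/* Minimal mock cache: only tracks how many objects are still allocated. */
struct mock_cache {
	const char *name;
	int objects_in_use;
};

/*
 * Mock of the kmem_cache_destroy() error path: if objects are still
 * live, print the message and also bump the counter the test checks.
 */
static void mock_cache_destroy(struct mock_cache *s)
{
	if (s->objects_in_use > 0) {
		fprintf(stderr, "destroy %s: cache still has objects\n",
			s->name);
		slab_errors++;	/* the extra step being discussed */
	}
}
```

A test that forgot to free an object before destroying its cache would then
see an unexpected non-zero count instead of only a message in the log.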

> +}
> +
> +static void test_next_pointer(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_next_ptr_free", 64, 0,
> +                               SLAB_POISON, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +       unsigned long tmp;
> +       unsigned long *ptr_addr;
> +
> +       kmem_cache_free(s, p);
> +
> +       ptr_addr = (unsigned long *)(p + s->offset);
> +       tmp = *ptr_addr;
> +       p[s->offset] = 0x12;
> +

I really like this test!
I think it'll be a good example to point to wrt. how to handle mutating
state in tests (clear comments, using whitespace to break up steps,
setting up clear EXPECT calls, etc.)

> +       /*
> +        * Expecting three errors.
> +        * One for the corrupted freechain and the other one for the wrong
> +        * count of objects in use. The third error is fixing broken cache.
> +        */
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 3, slab_errors);
> +
> +       /*
> +        * Try to repair corrupted freepointer.
> +        * Still expecting two errors. The first for the wrong count
> +        * of objects in use.
> +        * The second error is for fixing broken cache.
> +        */
> +       *ptr_addr = tmp;
> +       slab_errors = 0;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +
> +       /*
> +        * Previous validation repaired the count of objects in use.
> +        * Now expecting no error.
> +        */
> +       slab_errors = 0;
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 0, slab_errors);
> +
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_first_word(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_1th_word_free", 64, 0,
> +                               SLAB_POISON, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       kmem_cache_free(s, p);
> +       *p = 0x78;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_clobber_50th_byte(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_50th_word_free", 64, 0,
> +                               SLAB_POISON, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       kmem_cache_free(s, p);
> +       p[50] = 0x9a;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +       kmem_cache_destroy(s);
> +}
> +
> +static void test_clobber_redzone_free(struct kunit *test)
> +{
> +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_free", 64, 0,
> +                               SLAB_RED_ZONE, NULL);
> +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> +
> +       kmem_cache_free(s, p);
> +       p[64] = 0xab;
> +
> +       validate_slab_cache(s);
> +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> +       kmem_cache_destroy(s);
> +}
> +
> +static int test_init(struct kunit *test)
> +{
> +       slab_errors = 0;
> +
> +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> +       current->kunit_test = test;
> +
> +       kunit_add_named_resource(test, NULL, NULL, &resource,
> +                                       "slab_errors", &slab_errors);
> +       return 0;
> +}
> +
> +static void test_exit(struct kunit *test)
> +{
> +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> +       current->kunit_test = NULL;
> +}
> +
> +static struct kunit_case test_cases[] = {
> +       KUNIT_CASE(test_clobber_zone),
> +       KUNIT_CASE(test_next_pointer),
> +       KUNIT_CASE(test_first_word),
> +       KUNIT_CASE(test_clobber_50th_byte),
> +       KUNIT_CASE(test_clobber_redzone_free),
> +       {}
> +};
> +
> +static struct kunit_suite test_suite = {
> +       .name = "slub_test",
> +       .init = test_init,
> +       .exit = test_exit,
> +       .test_cases = test_cases,
> +};
> +kunit_test_suite(test_suite);
> +
> +MODULE_LICENSE("GPL");
> diff --git a/mm/slab.h b/mm/slab.h
> index 076582f58f68..95cf42eb8396 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -215,6 +215,7 @@ DECLARE_STATIC_KEY_TRUE(slub_debug_enabled);
>  DECLARE_STATIC_KEY_FALSE(slub_debug_enabled);
>  #endif
>  extern void print_tracking(struct kmem_cache *s, void *object);
> +long validate_slab_cache(struct kmem_cache *s);
>  #else
>  static inline void print_tracking(struct kmem_cache *s, void *object)
>  {
> diff --git a/mm/slub.c b/mm/slub.c
> index 3021ce9bf1b3..d7df8841d90a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -35,6 +35,7 @@
>  #include <linux/prefetch.h>
>  #include <linux/memcontrol.h>
>  #include <linux/random.h>
> +#include <kunit/test.h>
>
>  #include <trace/events/kmem.h>
>
> @@ -447,6 +448,26 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
>  static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
>  static DEFINE_SPINLOCK(object_map_lock);
>
> +#if IS_ENABLED(CONFIG_KUNIT)
> +static bool slab_add_kunit_errors(void)
> +{
> +       struct kunit_resource *resource;
> +
> +       if (likely(!current->kunit_test))
> +               return false;
> +
> +       resource = kunit_find_named_resource(current->kunit_test, "slab_errors");
> +       if (!resource)
> +               return false;
> +
> +       (*(int *)resource->data)++;
> +       kunit_put_resource(resource);
> +       return true;
> +}
> +#else
> +static inline bool slab_add_kunit_errors(void) { return false; }
> +#endif
> +
>  /*
>   * Determine a map of object in use on a page.
>   *
> @@ -676,6 +697,9 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...)
>         struct va_format vaf;
>         va_list args;
>
> +       if (slab_add_kunit_errors())
> +               return;
> +
>         va_start(args, fmt);
>         vaf.fmt = fmt;
>         vaf.va = &args;
> @@ -739,6 +763,9 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
>  void object_err(struct kmem_cache *s, struct page *page,
>                         u8 *object, char *reason)
>  {
> +       if (slab_add_kunit_errors())
> +               return;
> +
>         slab_bug(s, "%s", reason);

Would it be a good idea for us to log the error text in slab_add_kunit_errors()?
Otherwise we could end up getting the same error count but from
an unexpected set of code paths.

Boils down to using this macro
  kunit_info(current->kunit_test, "%s", reason);

Perhaps we could get away with just duplicating (a subset of) what
we'd pass into slab_bug(), i.e.
  if (slab_add_kunit_errors("%s", reason)) return;
  if (slab_add_kunit_errors("%s overwritten", what)) return;
If the string we're printing is too annoying to duplicate, we can get
away with just using __func__ as well.

Or a more messy alternative would be to add some #if'd code in
slab_bug(), but I don't know if that's as good of an idea.
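
To make the first option concrete, here is a rough userspace sketch (not
kernel code) of a variadic slab_add_kunit_errors() that records the reason
while counting. The globals in_kunit_test and last_reason and the caller
report_overwrite() are hypothetical stand-ins for current->kunit_test, a
kunit_info() log line and check_bytes_and_report(); only the control flow is
meant to carry over.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the "slab_errors" kunit_resource data. */
static int slab_errors;
/* Last formatted reason, kept so a test can see which path fired. */
static char last_reason[100];
/* Stand-in for current->kunit_test being non-NULL. */
static bool in_kunit_test;

/*
 * Mock of a variadic slab_add_kunit_errors(): count the error and
 * record the formatted reason. In the kernel the recording step would
 * be a kunit_info() call rather than vsnprintf() into a buffer.
 */
static bool slab_add_kunit_errors(const char *fmt, ...)
{
	va_list args;

	if (!in_kunit_test)
		return false;

	va_start(args, fmt);
	vsnprintf(last_reason, sizeof(last_reason), fmt, args);
	va_end(args);

	slab_errors++;
	return true;
}

/* Example caller shaped like check_bytes_and_report(). */
static int report_overwrite(const char *what)
{
	if (slab_add_kunit_errors("%s overwritten", what))
		return 0;	/* suppressed: counted and recorded only */

	fprintf(stderr, "BUG: %s overwritten\n", what);
	return 1;		/* full report was printed */
}
```

With the reason recorded, a test could check which path fired rather than
only how many did.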

>         print_trailer(s, page, object);
>  }
> @@ -749,6 +776,9 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct page *page,
>         va_list args;
>         char buf[100];
>
> +       if (slab_add_kunit_errors())
> +               return;
> +
>         va_start(args, fmt);
>         vsnprintf(buf, sizeof(buf), fmt, args);
>         va_end(args);
> @@ -798,12 +828,16 @@ static int check_bytes_and_report(struct kmem_cache *s, struct page *page,
>         while (end > fault && end[-1] == value)
>                 end--;
>
> +       if (slab_add_kunit_errors())
> +               goto skip_bug_print;
> +
>         slab_bug(s, "%s overwritten", what);
>         pr_err("INFO: 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> -                                       fault, end - 1, fault - addr,
> -                                       fault[0], value);
> +                               fault, end - 1, fault - addr,
> +                               fault[0], value);
>         print_trailer(s, page, object);
>
> +skip_bug_print:
>         restore_bytes(s, what, value, fault, end);
>         return 0;
>  }
> @@ -4650,9 +4684,11 @@ static int validate_slab_node(struct kmem_cache *s,
>                 validate_slab(s, page);
>                 count++;
>         }
> -       if (count != n->nr_partial)
> +       if (count != n->nr_partial) {
>                 pr_err("SLUB %s: %ld partial slabs counted but counter=%ld\n",
>                        s->name, count, n->nr_partial);
> +               slab_add_kunit_errors();
> +       }
>
>         if (!(s->flags & SLAB_STORE_USER))
>                 goto out;
> @@ -4661,16 +4697,18 @@ static int validate_slab_node(struct kmem_cache *s,
>                 validate_slab(s, page);
>                 count++;
>         }
> -       if (count != atomic_long_read(&n->nr_slabs))
> +       if (count != atomic_long_read(&n->nr_slabs)) {
>                 pr_err("SLUB: %s %ld slabs counted but counter=%ld\n",
>                        s->name, count, atomic_long_read(&n->nr_slabs));
> +               slab_add_kunit_errors();
> +       }
>
>  out:
>         spin_unlock_irqrestore(&n->list_lock, flags);
>         return count;
>  }
>
> -static long validate_slab_cache(struct kmem_cache *s)
> +long validate_slab_cache(struct kmem_cache *s)
>  {
>         int node;
>         unsigned long count = 0;
> @@ -4682,6 +4720,8 @@ static long validate_slab_cache(struct kmem_cache *s)
>
>         return count;
>  }
> +EXPORT_SYMBOL(validate_slab_cache);
> +
>  /*
>   * Generate lists of code addresses where slabcache objects are allocated
>   * and freed.
> --
> 2.31.1.272.g89b43f80a5
>


* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-13 13:54   ` Marco Elver
@ 2021-04-15 10:10     ` Oliver Glitta
  2021-04-15 10:38       ` Vlastimil Babka
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Glitta @ 2021-04-15 10:10 UTC (permalink / raw)
  To: Marco Elver
  Cc: Brendan Higgins, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka, LKML,
	open list:KERNEL SELFTEST FRAMEWORK, KUnit Development,
	Linux Memory Management List, Daniel Latypov

On Tue, 13 Apr 2021 at 15:54, Marco Elver <elver@google.com> wrote:
>
> On Tue, 13 Apr 2021 at 12:07, <glittao@gmail.com> wrote:
> > From: Oliver Glitta <glittao@gmail.com>
> >
> > SLUB has resiliency_test() function which is hidden behind #ifdef
> > SLUB_RESILIENCY_TEST that is not part of Kconfig, so nobody
> > runs it. KUnit should be a proper replacement for it.
> >
> > Try changing byte in redzone after allocation and changing
> > pointer to next free node, first byte, 50th byte and redzone
> > byte. Check if validation finds errors.
> >
> > There are several differences from the original resiliency test:
> > Tests create own caches with known state instead of corrupting
> > shared kmalloc caches.
> >
> > The corruption of freepointer uses correct offset, the original
> > resiliency test got broken with freepointer changes.
> >
> > Scratch changing random byte test, because it does not have
> > meaning in this form where we need deterministic results.
> >
> > Add new option CONFIG_SLUB_KUNIT_TEST in Kconfig.
> > Because the test deliberatly modifies non-allocated objects, it depends on
> > !KASAN which would have otherwise prevented that.
>
> Hmm, did the test fail with KASAN? Is it possible to skip the tests
> and still run a subset of tests with KASAN? It'd be nice if we could
> run some of these tests with KASAN as well.
>
> > Use kunit_resource to count errors in cache and silence bug reports.
> > Count error whenever slab_bug() or slab_fix() is called or when
> > the count of pages is wrong.
> >
> > Signed-off-by: Oliver Glitta <glittao@gmail.com>
>
> Reviewed-by: Marco Elver <elver@google.com>
>

Thank you.

> Thanks, this all looks good to me. But perhaps do test what works with
> KASAN, to see if you need the !KASAN constraint for all cases.

I tried running the tests with KASAN reporting disabled via
kasan_disable_current(), and three of the tests failed with wrong
error counts.
So I added the !KASAN constraint for all tests: the merge window is
coming, and we want to know whether this version is stable and free of
other mistakes.
We will take a closer look at this in a follow-up patch.

>
> > ---
> > Changes since v3
> >
> > Use kunit_resource to silence bug reports and count errors suggested by
> > Marco Elver.
> > Make the test depends on !KASAN thanks to report from the kernel test robot.
> >
> > Changes since v2
> >
> > Use bit operation & instead of logical && as reported by kernel test
> > robot and Dan Carpenter
> >
> > Changes since v1
> >
> > Conversion from kselftest to KUnit test suggested by Marco Elver.
> > Error silencing.
> > Error counting improvements.
> >  lib/Kconfig.debug |  12 ++++
> >  lib/Makefile      |   1 +
> >  lib/slub_kunit.c  | 150 ++++++++++++++++++++++++++++++++++++++++++++++
> >  mm/slab.h         |   1 +
> >  mm/slub.c         |  50 ++++++++++++++--
> >  5 files changed, 209 insertions(+), 5 deletions(-)
> >  create mode 100644 lib/slub_kunit.c
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 2779c29d9981..9b8a0d754278 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -2371,6 +2371,18 @@ config BITS_TEST
> >
> >           If unsure, say N.
> >
> > +config SLUB_KUNIT_TEST
> > +       tristate "KUnit test for SLUB cache error detection" if !KUNIT_ALL_TESTS
> > +       depends on SLUB_DEBUG && KUNIT && !KASAN
> > +       default KUNIT_ALL_TESTS
> > +       help
> > +         This builds SLUB allocator unit test.
> > +         Tests SLUB cache debugging functionality.
> > +         For more information on KUnit and unit tests in general please refer
> > +         to the KUnit documentation in Documentation/dev-tools/kunit/.
> > +
> > +         If unsure, say N.
> > +
> >  config TEST_UDELAY
> >         tristate "udelay test driver"
> >         help
> > diff --git a/lib/Makefile b/lib/Makefile
> > index b5307d3eec1a..1e59c6714ed8 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -352,5 +352,6 @@ obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
> >  obj-$(CONFIG_LINEAR_RANGES_TEST) += test_linear_ranges.o
> >  obj-$(CONFIG_BITS_TEST) += test_bits.o
> >  obj-$(CONFIG_CMDLINE_KUNIT_TEST) += cmdline_kunit.o
> > +obj-$(CONFIG_SLUB_KUNIT_TEST) += slub_kunit.o
> >
> >  obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
> > diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
> > new file mode 100644
> > index 000000000000..cb9ae9f7e8a6
> > --- /dev/null
> > +++ b/lib/slub_kunit.c
> > @@ -0,0 +1,150 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <kunit/test.h>
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +#include <linux/module.h>
> > +#include <linux/kernel.h>
> > +#include "../mm/slab.h"
> > +
> > +static struct kunit_resource resource;
> > +static int slab_errors;
> > +
> > +static void test_clobber_zone(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_alloc", 64, 0,
> > +                               SLAB_RED_ZONE, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       p[64] = 0x12;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +
> > +       kmem_cache_free(s, p);
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_next_pointer(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_next_ptr_free", 64, 0,
> > +                               SLAB_POISON, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +       unsigned long tmp;
> > +       unsigned long *ptr_addr;
> > +
> > +       kmem_cache_free(s, p);
> > +
> > +       ptr_addr = (unsigned long *)(p + s->offset);
> > +       tmp = *ptr_addr;
> > +       p[s->offset] = 0x12;
> > +
> > +       /*
> > +        * Expecting three errors.
> > +        * One for the corrupted freechain and the other one for the wrong
> > +        * count of objects in use. The third error is fixing broken cache.
> > +        */
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 3, slab_errors);
> > +
> > +       /*
> > +        * Try to repair corrupted freepointer.
> > +        * Still expecting two errors. The first for the wrong count
> > +        * of objects in use.
> > +        * The second error is for fixing broken cache.
> > +        */
> > +       *ptr_addr = tmp;
> > +       slab_errors = 0;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +
> > +       /*
> > +        * Previous validation repaired the count of objects in use.
> > +        * Now expecting no error.
> > +        */
> > +       slab_errors = 0;
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 0, slab_errors);
> > +
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_first_word(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_1th_word_free", 64, 0,
> > +                               SLAB_POISON, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       kmem_cache_free(s, p);
> > +       *p = 0x78;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_clobber_50th_byte(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_50th_word_free", 64, 0,
> > +                               SLAB_POISON, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       kmem_cache_free(s, p);
> > +       p[50] = 0x9a;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_clobber_redzone_free(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_free", 64, 0,
> > +                               SLAB_RED_ZONE, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       kmem_cache_free(s, p);
> > +       p[64] = 0xab;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static int test_init(struct kunit *test)
> > +{
> > +       slab_errors = 0;
> > +
> > +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> > +       current->kunit_test = test;
>
> Note, the patch "kunit: support failure from dynamic analysis tools"
> is already in -next. It's probably safe to leave this, and send a
> follow-up patch later once that kunit patch is in mainline.
>
> > +       kunit_add_named_resource(test, NULL, NULL, &resource,
> > +                                       "slab_errors", &slab_errors);
> > +       return 0;
> > +}
> > +
> > +static void test_exit(struct kunit *test)
> > +{
> > +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> > +       current->kunit_test = NULL;
> > +}
> > +
> > +static struct kunit_case test_cases[] = {
> > +       KUNIT_CASE(test_clobber_zone),
> > +       KUNIT_CASE(test_next_pointer),
> > +       KUNIT_CASE(test_first_word),
> > +       KUNIT_CASE(test_clobber_50th_byte),
> > +       KUNIT_CASE(test_clobber_redzone_free),
> > +       {}
> > +};
> > +
> > +static struct kunit_suite test_suite = {
> > +       .name = "slub_test",
> > +       .init = test_init,
> > +       .exit = test_exit,
> > +       .test_cases = test_cases,
> > +};
> > +kunit_test_suite(test_suite);
> > +
> > +MODULE_LICENSE("GPL");
> > diff --git a/mm/slab.h b/mm/slab.h
> > index 076582f58f68..95cf42eb8396 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -215,6 +215,7 @@ DECLARE_STATIC_KEY_TRUE(slub_debug_enabled);
> >  DECLARE_STATIC_KEY_FALSE(slub_debug_enabled);
> >  #endif
> >  extern void print_tracking(struct kmem_cache *s, void *object);
> > +long validate_slab_cache(struct kmem_cache *s);
> >  #else
> >  static inline void print_tracking(struct kmem_cache *s, void *object)
> >  {
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 3021ce9bf1b3..d7df8841d90a 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -35,6 +35,7 @@
> >  #include <linux/prefetch.h>
> >  #include <linux/memcontrol.h>
> >  #include <linux/random.h>
> > +#include <kunit/test.h>
> >
> >  #include <trace/events/kmem.h>
> >
> > @@ -447,6 +448,26 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
> >  static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
> >  static DEFINE_SPINLOCK(object_map_lock);
> >
> > +#if IS_ENABLED(CONFIG_KUNIT)
> > +static bool slab_add_kunit_errors(void)
> > +{
> > +       struct kunit_resource *resource;
> > +
> > +       if (likely(!current->kunit_test))
> > +               return false;
> > +
> > +       resource = kunit_find_named_resource(current->kunit_test, "slab_errors");
> > +       if (!resource)
> > +               return false;
> > +
> > +       (*(int *)resource->data)++;
> > +       kunit_put_resource(resource);
> > +       return true;
> > +}
> > +#else
> > +static inline bool slab_add_kunit_errors(void) { return false; }
> > +#endif
> > +
> >  /*
> >   * Determine a map of object in use on a page.
> >   *
> > @@ -676,6 +697,9 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...)
> >         struct va_format vaf;
> >         va_list args;
> >
> > +       if (slab_add_kunit_errors())
> > +               return;
> > +
> >         va_start(args, fmt);
> >         vaf.fmt = fmt;
> >         vaf.va = &args;
> > @@ -739,6 +763,9 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
> >  void object_err(struct kmem_cache *s, struct page *page,
> >                         u8 *object, char *reason)
> >  {
> > +       if (slab_add_kunit_errors())
> > +               return;
> > +
> >         slab_bug(s, "%s", reason);
> >         print_trailer(s, page, object);
> >  }
> > @@ -749,6 +776,9 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct page *page,
> >         va_list args;
> >         char buf[100];
> >
> > +       if (slab_add_kunit_errors())
> > +               return;
> > +
> >         va_start(args, fmt);
> >         vsnprintf(buf, sizeof(buf), fmt, args);
> >         va_end(args);
> > @@ -798,12 +828,16 @@ static int check_bytes_and_report(struct kmem_cache *s, struct page *page,
> >         while (end > fault && end[-1] == value)
> >                 end--;
> >
> > +       if (slab_add_kunit_errors())
> > +               goto skip_bug_print;
> > +
> >         slab_bug(s, "%s overwritten", what);
> >         pr_err("INFO: 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> > -                                       fault, end - 1, fault - addr,
> > -                                       fault[0], value);
> > +                               fault, end - 1, fault - addr,
> > +                               fault[0], value);
> >         print_trailer(s, page, object);
> >
> > +skip_bug_print:
> >         restore_bytes(s, what, value, fault, end);
> >         return 0;
> >  }
> > @@ -4650,9 +4684,11 @@ static int validate_slab_node(struct kmem_cache *s,
> >                 validate_slab(s, page);
> >                 count++;
> >         }
> > -       if (count != n->nr_partial)
> > +       if (count != n->nr_partial) {
> >                 pr_err("SLUB %s: %ld partial slabs counted but counter=%ld\n",
> >                        s->name, count, n->nr_partial);
> > +               slab_add_kunit_errors();
> > +       }
> >
> >         if (!(s->flags & SLAB_STORE_USER))
> >                 goto out;
> > @@ -4661,16 +4697,18 @@ static int validate_slab_node(struct kmem_cache *s,
> >                 validate_slab(s, page);
> >                 count++;
> >         }
> > -       if (count != atomic_long_read(&n->nr_slabs))
> > +       if (count != atomic_long_read(&n->nr_slabs)) {
> >                 pr_err("SLUB: %s %ld slabs counted but counter=%ld\n",
> >                        s->name, count, atomic_long_read(&n->nr_slabs));
> > +               slab_add_kunit_errors();
> > +       }
> >
> >  out:
> >         spin_unlock_irqrestore(&n->list_lock, flags);
> >         return count;
> >  }
> >
> > -static long validate_slab_cache(struct kmem_cache *s)
> > +long validate_slab_cache(struct kmem_cache *s)
> >  {
> >         int node;
> >         unsigned long count = 0;
> > @@ -4682,6 +4720,8 @@ static long validate_slab_cache(struct kmem_cache *s)
> >
> >         return count;
> >  }
> > +EXPORT_SYMBOL(validate_slab_cache);
> > +
> >  /*
> >   * Generate lists of code addresses where slabcache objects are allocated
> >   * and freed.
> > --
> > 2.31.1.272.g89b43f80a5
> >

* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-13 21:33   ` Daniel Latypov
@ 2021-04-15 10:11     ` Oliver Glitta
  0 siblings, 0 replies; 11+ messages in thread
From: Oliver Glitta @ 2021-04-15 10:11 UTC (permalink / raw)
  To: Daniel Latypov
  Cc: Brendan Higgins, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka,
	Linux Kernel Mailing List, open list:KERNEL SELFTEST FRAMEWORK,
	KUnit Development, Linux Memory Management List, Marco Elver

On Tue, 13 Apr 2021 at 23:33, Daniel Latypov <dlatypov@google.com> wrote:
>
> On Tue, Apr 13, 2021 at 3:07 AM <glittao@gmail.com> wrote:
> >
> > From: Oliver Glitta <glittao@gmail.com>
> >
> > SLUB has resiliency_test() function which is hidden behind #ifdef
> > SLUB_RESILIENCY_TEST that is not part of Kconfig, so nobody
> > runs it. KUnit should be a proper replacement for it.
> >
> > Try changing byte in redzone after allocation and changing
> > pointer to next free node, first byte, 50th byte and redzone
> > byte. Check if validation finds errors.
> >
> > There are several differences from the original resiliency test:
> > Tests create own caches with known state instead of corrupting
> > shared kmalloc caches.
> >
> > The corruption of freepointer uses correct offset, the original
> > resiliency test got broken with freepointer changes.
> >
> > Scratch changing random byte test, because it does not have
> > meaning in this form where we need deterministic results.
> >
> > Add new option CONFIG_SLUB_KUNIT_TEST in Kconfig.
> > Because the test deliberatly modifies non-allocated objects, it depends on
>
> nit: *deliberately
>
> > !KASAN which would have otherwise prevented that.
> >
> > Use kunit_resource to count errors in cache and silence bug reports.
> > Count error whenever slab_bug() or slab_fix() is called or when
> > the count of pages is wrong.
> >
> > Signed-off-by: Oliver Glitta <glittao@gmail.com>
>
> Acked-by: Daniel Latypov <dlatypov@google.com>
>

Thank you.

> Looks good to me!
> My one minor suggestion: perhaps let's log a summary of the error or
> the func name in slab_add_kunit_errors().
>

That is a good suggestion, but for now we want to confirm that this version is
stable. We will look into it in a follow-up patch.
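
For reference, the suggestion could look roughly like the sketch below. It is a
userspace model only, not the kernel code: struct error_log, active_log and
add_error() are hypothetical names standing in for the "slab_errors" resource,
current->kunit_test and slab_add_kunit_errors(). A kernel version would
presumably log through kunit_info() as suggested above instead of storing the
string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-test "slab_errors" resource. */
struct error_log {
	int count;      /* number of reports intercepted so far */
	char last[64];  /* most recent reason, truncated to fit */
};

/* NULL models "no KUnit test is running" (current->kunit_test == NULL). */
static struct error_log *active_log;

/*
 * Count one error and remember why it was raised.
 * Returns true when the caller should skip its normal bug report.
 */
static bool add_error(const char *reason)
{
	if (!active_log)
		return false;

	active_log->count++;
	snprintf(active_log->last, sizeof(active_log->last), "%s", reason);
	return true;
}
```

A caller would then use "if (add_error(reason)) return;" in place of the
argument-less check, so a test can compare both the count and the reason and
notice when the same count comes from an unexpected code path.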

> > ---
> > Changes since v3
> >
> > Use kunit_resource to silence bug reports and count errors suggested by
> > Marco Elver.
> > Make the test depends on !KASAN thanks to report from the kernel test robot.
> >
> > Changes since v2
> >
> > Use bit operation & instead of logical && as reported by kernel test
> > robot and Dan Carpenter
> >
> > Changes since v1
> >
> > Conversion from kselftest to KUnit test suggested by Marco Elver.
> > Error silencing.
> > Error counting improvements.
> >  lib/Kconfig.debug |  12 ++++
> >  lib/Makefile      |   1 +
> >  lib/slub_kunit.c  | 150 ++++++++++++++++++++++++++++++++++++++++++++++
> >  mm/slab.h         |   1 +
> >  mm/slub.c         |  50 ++++++++++++++--
> >  5 files changed, 209 insertions(+), 5 deletions(-)
> >  create mode 100644 lib/slub_kunit.c
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 2779c29d9981..9b8a0d754278 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -2371,6 +2371,18 @@ config BITS_TEST
> >
> >           If unsure, say N.
> >
> > +config SLUB_KUNIT_TEST
> > +       tristate "KUnit test for SLUB cache error detection" if !KUNIT_ALL_TESTS
> > +       depends on SLUB_DEBUG && KUNIT && !KASAN
> > +       default KUNIT_ALL_TESTS
> > +       help
> > +         This builds SLUB allocator unit test.
> > +         Tests SLUB cache debugging functionality.
> > +         For more information on KUnit and unit tests in general please refer
> > +         to the KUnit documentation in Documentation/dev-tools/kunit/.
> > +
> > +         If unsure, say N.
> > +
> >  config TEST_UDELAY
> >         tristate "udelay test driver"
> >         help
> > diff --git a/lib/Makefile b/lib/Makefile
> > index b5307d3eec1a..1e59c6714ed8 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -352,5 +352,6 @@ obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
> >  obj-$(CONFIG_LINEAR_RANGES_TEST) += test_linear_ranges.o
> >  obj-$(CONFIG_BITS_TEST) += test_bits.o
> >  obj-$(CONFIG_CMDLINE_KUNIT_TEST) += cmdline_kunit.o
> > +obj-$(CONFIG_SLUB_KUNIT_TEST) += slub_kunit.o
> >
> >  obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
> > diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
> > new file mode 100644
> > index 000000000000..cb9ae9f7e8a6
> > --- /dev/null
> > +++ b/lib/slub_kunit.c
> > @@ -0,0 +1,150 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <kunit/test.h>
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +#include <linux/module.h>
> > +#include <linux/kernel.h>
> > +#include "../mm/slab.h"
> > +
> > +static struct kunit_resource resource;
> > +static int slab_errors;
> > +
> > +static void test_clobber_zone(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_alloc", 64, 0,
> > +                               SLAB_RED_ZONE, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       p[64] = 0x12;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +
> > +       kmem_cache_free(s, p);
> > +       kmem_cache_destroy(s);
>
> Might not be worth doing for now:
> I see kmem_cache_destroy() has a `if (err) { pr_err(...); dump_stack(); }` call.
> Does it make sense to cause that to fail the test?
>
> I see it's defined in mm/slab_common.c, so we might not want to touch
> that in this patch.
>
> > +}
> > +
> > +static void test_next_pointer(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_next_ptr_free", 64, 0,
> > +                               SLAB_POISON, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +       unsigned long tmp;
> > +       unsigned long *ptr_addr;
> > +
> > +       kmem_cache_free(s, p);
> > +
> > +       ptr_addr = (unsigned long *)(p + s->offset);
> > +       tmp = *ptr_addr;
> > +       p[s->offset] = 0x12;
> > +
>
> I really like this test!
> I think it'll be a good example to point to wrt handle mutating state
> in tests (clear comments, using whitespace to break up steps, setting
> up clear EXPECT calls, etc.)
>
> > +       /*
> > +        * Expecting three errors.
> > +        * One for the corrupted freechain and the other one for the wrong
> > +        * count of objects in use. The third error is fixing broken cache.
> > +        */
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 3, slab_errors);
> > +
> > +       /*
> > +        * Try to repair corrupted freepointer.
> > +        * Still expecting two errors. The first for the wrong count
> > +        * of objects in use.
> > +        * The second error is for fixing broken cache.
> > +        */
> > +       *ptr_addr = tmp;
> > +       slab_errors = 0;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +
> > +       /*
> > +        * Previous validation repaired the count of objects in use.
> > +        * Now expecting no error.
> > +        */
> > +       slab_errors = 0;
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 0, slab_errors);
> > +
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_first_word(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_1th_word_free", 64, 0,
> > +                               SLAB_POISON, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       kmem_cache_free(s, p);
> > +       *p = 0x78;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_clobber_50th_byte(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_50th_word_free", 64, 0,
> > +                               SLAB_POISON, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       kmem_cache_free(s, p);
> > +       p[50] = 0x9a;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static void test_clobber_redzone_free(struct kunit *test)
> > +{
> > +       struct kmem_cache *s = kmem_cache_create("TestSlub_RZ_free", 64, 0,
> > +                               SLAB_RED_ZONE, NULL);
> > +       u8 *p = kmem_cache_alloc(s, GFP_KERNEL);
> > +
> > +       kmem_cache_free(s, p);
> > +       p[64] = 0xab;
> > +
> > +       validate_slab_cache(s);
> > +       KUNIT_EXPECT_EQ(test, 2, slab_errors);
> > +       kmem_cache_destroy(s);
> > +}
> > +
> > +static int test_init(struct kunit *test)
> > +{
> > +       slab_errors = 0;
> > +
> > +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> > +       current->kunit_test = test;
> > +
> > +       kunit_add_named_resource(test, NULL, NULL, &resource,
> > +                                       "slab_errors", &slab_errors);
> > +       return 0;
> > +}
> > +
> > +static void test_exit(struct kunit *test)
> > +{
> > +       /* FIXME: remove when CONFIG_KASAN requirement is dropped. */
> > +       current->kunit_test = NULL;
> > +}
> > +
> > +static struct kunit_case test_cases[] = {
> > +       KUNIT_CASE(test_clobber_zone),
> > +       KUNIT_CASE(test_next_pointer),
> > +       KUNIT_CASE(test_first_word),
> > +       KUNIT_CASE(test_clobber_50th_byte),
> > +       KUNIT_CASE(test_clobber_redzone_free),
> > +       {}
> > +};
> > +
> > +static struct kunit_suite test_suite = {
> > +       .name = "slub_test",
> > +       .init = test_init,
> > +       .exit = test_exit,
> > +       .test_cases = test_cases,
> > +};
> > +kunit_test_suite(test_suite);
> > +
> > +MODULE_LICENSE("GPL");
> > diff --git a/mm/slab.h b/mm/slab.h
> > index 076582f58f68..95cf42eb8396 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -215,6 +215,7 @@ DECLARE_STATIC_KEY_TRUE(slub_debug_enabled);
> >  DECLARE_STATIC_KEY_FALSE(slub_debug_enabled);
> >  #endif
> >  extern void print_tracking(struct kmem_cache *s, void *object);
> > +long validate_slab_cache(struct kmem_cache *s);
> >  #else
> >  static inline void print_tracking(struct kmem_cache *s, void *object)
> >  {
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 3021ce9bf1b3..d7df8841d90a 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -35,6 +35,7 @@
> >  #include <linux/prefetch.h>
> >  #include <linux/memcontrol.h>
> >  #include <linux/random.h>
> > +#include <kunit/test.h>
> >
> >  #include <trace/events/kmem.h>
> >
> > @@ -447,6 +448,26 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
> >  static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
> >  static DEFINE_SPINLOCK(object_map_lock);
> >
> > +#if IS_ENABLED(CONFIG_KUNIT)
> > +static bool slab_add_kunit_errors(void)
> > +{
> > +       struct kunit_resource *resource;
> > +
> > +       if (likely(!current->kunit_test))
> > +               return false;
> > +
> > +       resource = kunit_find_named_resource(current->kunit_test, "slab_errors");
> > +       if (!resource)
> > +               return false;
> > +
> > +       (*(int *)resource->data)++;
> > +       kunit_put_resource(resource);
> > +       return true;
> > +}
> > +#else
> > +static inline bool slab_add_kunit_errors(void) { return false; }
> > +#endif
> > +
> >  /*
> >   * Determine a map of object in use on a page.
> >   *
> > @@ -676,6 +697,9 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...)
> >         struct va_format vaf;
> >         va_list args;
> >
> > +       if (slab_add_kunit_errors())
> > +               return;
> > +
> >         va_start(args, fmt);
> >         vaf.fmt = fmt;
> >         vaf.va = &args;
> > @@ -739,6 +763,9 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
> >  void object_err(struct kmem_cache *s, struct page *page,
> >                         u8 *object, char *reason)
> >  {
> > +       if (slab_add_kunit_errors())
> > +               return;
> > +
> >         slab_bug(s, "%s", reason);
>
> Would it be a good idea for us to log the error text in slab_add_kunit_errors()?
> Otherwise we could end up with getting the same error count but from
> an unexpected set of code paths.
>
> Boils down to using this macro
>   kunit_info(current->kunit_test, "%s", reason);
>
> Perhaps we could get away with just duplicating (a subset of) what
> we'd pass into slab_bug(), i.e.
>   if (slab_add_kunit_errors("%s", reason)) return;
> >   if (slab_add_kunit_errors("%s overwritten", what)) return;
> If the string we're printing is too annoying to duplicate, can get
> away with just using __func__ as well.
>
> Or a more messy alternative would be to add some #if'd code in
> slab_bug(), but I don't know if that's as good of an idea.
>
> >         print_trailer(s, page, object);
> >  }
> > @@ -749,6 +776,9 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct page *page,
> >         va_list args;
> >         char buf[100];
> >
> > +       if (slab_add_kunit_errors())
> > +               return;
> > +
> >         va_start(args, fmt);
> >         vsnprintf(buf, sizeof(buf), fmt, args);
> >         va_end(args);
> > @@ -798,12 +828,16 @@ static int check_bytes_and_report(struct kmem_cache *s, struct page *page,
> >         while (end > fault && end[-1] == value)
> >                 end--;
> >
> > +       if (slab_add_kunit_errors())
> > +               goto skip_bug_print;
> > +
> >         slab_bug(s, "%s overwritten", what);
> >         pr_err("INFO: 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> > -                                       fault, end - 1, fault - addr,
> > -                                       fault[0], value);
> > +                               fault, end - 1, fault - addr,
> > +                               fault[0], value);
> >         print_trailer(s, page, object);
> >
> > +skip_bug_print:
> >         restore_bytes(s, what, value, fault, end);
> >         return 0;
> >  }
> > @@ -4650,9 +4684,11 @@ static int validate_slab_node(struct kmem_cache *s,
> >                 validate_slab(s, page);
> >                 count++;
> >         }
> > -       if (count != n->nr_partial)
> > +       if (count != n->nr_partial) {
> >                 pr_err("SLUB %s: %ld partial slabs counted but counter=%ld\n",
> >                        s->name, count, n->nr_partial);
> > +               slab_add_kunit_errors();
> > +       }
> >
> >         if (!(s->flags & SLAB_STORE_USER))
> >                 goto out;
> > @@ -4661,16 +4697,18 @@ static int validate_slab_node(struct kmem_cache *s,
> >                 validate_slab(s, page);
> >                 count++;
> >         }
> > -       if (count != atomic_long_read(&n->nr_slabs))
> > +       if (count != atomic_long_read(&n->nr_slabs)) {
> >                 pr_err("SLUB: %s %ld slabs counted but counter=%ld\n",
> >                        s->name, count, atomic_long_read(&n->nr_slabs));
> > +               slab_add_kunit_errors();
> > +       }
> >
> >  out:
> >         spin_unlock_irqrestore(&n->list_lock, flags);
> >         return count;
> >  }
> >
> > -static long validate_slab_cache(struct kmem_cache *s)
> > +long validate_slab_cache(struct kmem_cache *s)
> >  {
> >         int node;
> >         unsigned long count = 0;
> > @@ -4682,6 +4720,8 @@ static long validate_slab_cache(struct kmem_cache *s)
> >
> >         return count;
> >  }
> > +EXPORT_SYMBOL(validate_slab_cache);
> > +
> >  /*
> >   * Generate lists of code addresses where slabcache objects are allocated
> >   * and freed.
> > --
> > 2.31.1.272.g89b43f80a5
> >

* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
  2021-04-13 13:54   ` Marco Elver
  2021-04-13 21:33   ` Daniel Latypov
@ 2021-04-15 10:30   ` Vlastimil Babka
  2 siblings, 0 replies; 11+ messages in thread
From: Vlastimil Babka @ 2021-04-15 10:30 UTC (permalink / raw)
  To: glittao, brendanhiggins, cl, penberg, rientjes, iamjoonsoo.kim, akpm
  Cc: linux-kernel, linux-kselftest, kunit-dev, linux-mm, elver, dlatypov

On 4/13/21 12:07 PM, glittao@gmail.com wrote:
> From: Oliver Glitta <glittao@gmail.com>
> 
> SLUB has resiliency_test() function which is hidden behind #ifdef
> SLUB_RESILIENCY_TEST that is not part of Kconfig, so nobody
> runs it. KUnit should be a proper replacement for it.
> 
> Try changing byte in redzone after allocation and changing
> pointer to next free node, first byte, 50th byte and redzone
> byte. Check if validation finds errors.
> 
> There are several differences from the original resiliency test:
> Tests create own caches with known state instead of corrupting
> shared kmalloc caches.
> 
> The corruption of freepointer uses correct offset, the original
> resiliency test got broken with freepointer changes.
> 
> Scratch changing random byte test, because it does not have
> meaning in this form where we need deterministic results.
> 
> Add new option CONFIG_SLUB_KUNIT_TEST in Kconfig.
> Because the test deliberatly modifies non-allocated objects, it depends on
> !KASAN which would have otherwise prevented that.
> 
> Use kunit_resource to count errors in cache and silence bug reports.
> Count error whenever slab_bug() or slab_fix() is called or when
> the count of pages is wrong.
> 
> Signed-off-by: Oliver Glitta <glittao@gmail.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

(again with a disclaimer that I'm the advisor of Oliver's student project)

* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-15 10:10     ` Oliver Glitta
@ 2021-04-15 10:38       ` Vlastimil Babka
  2021-04-15 11:01         ` Marco Elver
  0 siblings, 1 reply; 11+ messages in thread
From: Vlastimil Babka @ 2021-04-15 10:38 UTC (permalink / raw)
  To: Oliver Glitta, Marco Elver
  Cc: Brendan Higgins, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, LKML,
	open list:KERNEL SELFTEST FRAMEWORK, KUnit Development,
	Linux Memory Management List, Daniel Latypov

On 4/15/21 12:10 PM, Oliver Glitta wrote:
> On Tue, 13 Apr 2021 at 15:54, Marco Elver <elver@google.com> wrote:
>>
>> On Tue, 13 Apr 2021 at 12:07, <glittao@gmail.com> wrote:
>> > From: Oliver Glitta <glittao@gmail.com>
>> >
>> > SLUB has resiliency_test() function which is hidden behind #ifdef
>> > SLUB_RESILIENCY_TEST that is not part of Kconfig, so nobody
>> > runs it. KUnit should be a proper replacement for it.
>> >
>> > Try changing byte in redzone after allocation and changing
>> > pointer to next free node, first byte, 50th byte and redzone
>> > byte. Check if validation finds errors.
>> >
>> > There are several differences from the original resiliency test:
>> > Tests create own caches with known state instead of corrupting
>> > shared kmalloc caches.
>> >
>> > The corruption of freepointer uses correct offset, the original
>> > resiliency test got broken with freepointer changes.
>> >
>> > Scratch changing random byte test, because it does not have
>> > meaning in this form where we need deterministic results.
>> >
>> > Add new option CONFIG_SLUB_KUNIT_TEST in Kconfig.
>> > Because the test deliberatly modifies non-allocated objects, it depends on
>> > !KASAN which would have otherwise prevented that.
>>
>> Hmm, did the test fail with KASAN? Is it possible to skip the tests
>> and still run a subset of tests with KASAN? It'd be nice if we could
>> run some of these tests with KASAN as well.
>>
>> > Use kunit_resource to count errors in cache and silence bug reports.
>> > Count error whenever slab_bug() or slab_fix() is called or when
>> > the count of pages is wrong.
>> >
>> > Signed-off-by: Oliver Glitta <glittao@gmail.com>
>>
>> Reviewed-by: Marco Elver <elver@google.com>
>>
> 
> Thank you.
> 
>> Thanks, this all looks good to me. But perhaps do test what works with
>> KASAN, to see if you need the !KASAN constraint for all cases.
> 
> I tried to run tests with KASAN functionality disabled with function
> kasan_disable_current() and three of the tests failed with wrong
> errors counts.
> So I add the !KASAN constraint for all tests, because the merge window
> is coming, we want to know if this version is stable and without other
> mistakes.
> We will take a closer look at that in the follow-up patch.

Agreed. In this context, KASAN is essentially a different implementation of the
same checks that SLUB_DEBUG offers (and also does other checks) and we exercise
these SLUB_DEBUG checks by deliberately causing the corruption that they detect
- so instead, KASAN detects it, as it should. I assume that once somebody opts
for a full KASAN kernel build, they don't need the SLUB_DEBUG functionality at
that point, as KASAN is more extensive. (On the other hand, SLUB_DEBUG kernels can
be (and are) shipped as production distro kernels where specific targeted
debugging can be enabled to help find bugs in production with minimal disruption.)
So trying to make both cooperate can work only to some extent and for now we've
chosen the safer way.
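
To illustrate what "deliberately causing the corruption" means, here is a
minimal userspace model of the red zone case. It is a sketch under assumptions,
not SLUB code: struct model_slot, slot_init(), slot_validate() and RZ_SIZE are
hypothetical, and only the 64-byte object size, the write to p[64] and the
0xcc value (SLUB_RED_ACTIVE) come from the kernel. The return value of 2
mirrors the patch, where check_bytes_and_report() counts one error and the
slab_fix() call made while restoring the bytes counts a second one.

```c
#include <stddef.h>
#include <string.h>

#define OBJ_SIZE   64    /* object size used by the tests in this thread */
#define RZ_SIZE    8     /* hypothetical red zone length for this model */
#define RED_ACTIVE 0xcc  /* same value as the kernel's SLUB_RED_ACTIVE */

/* One object followed by its red zone, kept in a flat buffer. */
struct model_slot {
	unsigned char bytes[OBJ_SIZE + RZ_SIZE];
};

/* Zero the object and fill the red zone with the expected pattern. */
static void slot_init(struct model_slot *slot)
{
	memset(slot->bytes, 0, OBJ_SIZE);
	memset(slot->bytes + OBJ_SIZE, RED_ACTIVE, RZ_SIZE);
}

/*
 * Check the red zone. On corruption, restore it and return 2:
 * one error for the report and one for the fix. Return 0 if intact.
 */
static int slot_validate(struct model_slot *slot)
{
	unsigned char *rz = slot->bytes + OBJ_SIZE;
	size_t i;

	for (i = 0; i < RZ_SIZE; i++) {
		if (rz[i] != RED_ACTIVE) {
			memset(rz, RED_ACTIVE, RZ_SIZE);
			return 2;
		}
	}
	return 0;
}
```

In this model the out-of-bounds write slot.bytes[OBJ_SIZE] = 0x12 goes
unnoticed until slot_validate() runs. Under KASAN the write itself would be
reported first, which is the conflict described above.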

* Re: [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality
  2021-04-15 10:38       ` Vlastimil Babka
@ 2021-04-15 11:01         ` Marco Elver
  0 siblings, 0 replies; 11+ messages in thread
From: Marco Elver @ 2021-04-15 11:01 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Oliver Glitta, Brendan Higgins, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, LKML,
	open list:KERNEL SELFTEST FRAMEWORK, KUnit Development,
	Linux Memory Management List, Daniel Latypov

On Thu, 15 Apr 2021 at 12:38, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 4/15/21 12:10 PM, Oliver Glitta wrote:
> > On Tue, 13 Apr 2021 at 15:54, Marco Elver <elver@google.com> wrote:
> >>
> >> On Tue, 13 Apr 2021 at 12:07, <glittao@gmail.com> wrote:
> >> > From: Oliver Glitta <glittao@gmail.com>
> >> >
> >> > SLUB has resiliency_test() function which is hidden behind #ifdef
> >> > SLUB_RESILIENCY_TEST that is not part of Kconfig, so nobody
> >> > runs it. KUnit should be a proper replacement for it.
> >> >
> >> > Try changing byte in redzone after allocation and changing
> >> > pointer to next free node, first byte, 50th byte and redzone
> >> > byte. Check if validation finds errors.
> >> >
> >> > There are several differences from the original resiliency test:
> >> > Tests create own caches with known state instead of corrupting
> >> > shared kmalloc caches.
> >> >
> >> > The corruption of freepointer uses correct offset, the original
> >> > resiliency test got broken with freepointer changes.
> >> >
> >> > Scratch changing random byte test, because it does not have
> >> > meaning in this form where we need deterministic results.
> >> >
> >> > Add new option CONFIG_SLUB_KUNIT_TEST in Kconfig.
> >> > Because the test deliberatly modifies non-allocated objects, it depends on
> >> > !KASAN which would have otherwise prevented that.
> >>
> >> Hmm, did the test fail with KASAN? Is it possible to skip the tests
> >> and still run a subset of tests with KASAN? It'd be nice if we could
> >> run some of these tests with KASAN as well.
> >>
> >> > Use kunit_resource to count errors in cache and silence bug reports.
> >> > Count error whenever slab_bug() or slab_fix() is called or when
> >> > the count of pages is wrong.
> >> >
> >> > Signed-off-by: Oliver Glitta <glittao@gmail.com>
> >>
> >> Reviewed-by: Marco Elver <elver@google.com>
> >>
> >
> > Thank you.
> >
> >> Thanks, this all looks good to me. But perhaps do test what works with
> >> KASAN, to see if you need the !KASAN constraint for all cases.
> >
> > I tried to run tests with KASAN functionality disabled with function
> > kasan_disable_current() and three of the tests failed with wrong
> > errors counts.
> > So I add the !KASAN constraint for all tests, because the merge window
> > is coming, we want to know if this version is stable and without other
> > mistakes.
> > We will take a closer look at that in the follow-up patch.
>
> Agreed. In this context, KASAN is essentially a different implementation of the
> same checks that SLUB_DEBUG offers (and also does other checks) and we exercise
> these SLUB_DEBUG checks by deliberately causing the corruption that they detect
> - so instead, KASAN detects it, as it should. I assume that once somebody opts
> for a full KASAN kernel build, they don't need the SLUB_DEBUG functionality at
> that point, as KASAN is more extensive (On the other hand SLUB_DEBUG kernels can
> be (and are) shipped as production distro kernels where specific targeted
> debugging can be enabled to help find bugs in production with minimal disruption).
> So trying to make both cooperate can work only to some extent and for now we've
> chosen the safer way.

Sounds reasonable. In any case, I'm fine with this version to land and
my Reviewed-by above remains valid. :-)

Thanks,
-- Marco

Thread overview: 11+ messages
2021-04-13 10:07 [PATCH v4 1/3] kunit: make test->lock irq safe glittao
2021-04-13 10:07 ` [PATCH v4 2/3] mm/slub, kunit: add a KUnit test for SLUB debugging functionality glittao
2021-04-13 13:54   ` Marco Elver
2021-04-15 10:10     ` Oliver Glitta
2021-04-15 10:38       ` Vlastimil Babka
2021-04-15 11:01         ` Marco Elver
2021-04-13 21:33   ` Daniel Latypov
2021-04-15 10:11     ` Oliver Glitta
2021-04-15 10:30   ` Vlastimil Babka
2021-04-13 10:07 ` [PATCH v4 3/3] slub: remove resiliency_test() function glittao
2021-04-13 13:38 ` [PATCH v4 1/3] kunit: make test->lock irq safe Brendan Higgins
