linux-mm.kvack.org archive mirror
* [PATCH v8] mm: cma: support sysfs
@ 2021-03-24 23:07 Minchan Kim
  2021-03-26 13:59 ` Anders Roxell
  0 siblings, 1 reply; 4+ messages in thread
From: Minchan Kim @ 2021-03-24 23:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, gregkh, surenb, joaodias, jhubbard, willy,
	digetx, Minchan Kim, Colin Ian King

As CMA is used more widely, monitoring CMA statistics becomes more
important for system health, because CMA allocation behavior is
directly related to user experience.

This patch introduces sysfs statistics for CMA, in order to provide
some basic monitoring of the CMA allocator.

 * the number of CMA page successful allocations
 * the number of CMA page allocation failures

These two values allow the user to calculate the allocation
failure rate for each CMA area.

e.g.)
  /sys/kernel/mm/cma/WIFI/alloc_pages_[success|fail]
  /sys/kernel/mm/cma/SENSOR/alloc_pages_[success|fail]
  /sys/kernel/mm/cma/BLUETOOTH/alloc_pages_[success|fail]
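
As a rough illustration (not part of this patch), the failure rate
calculation could be sketched in userspace as below. The helper names
read_counter(), cma_fail_rate() and cma_area_fail_rate() are made up
for this example; only the two sysfs file names come from the patch.
Both counters are in pages, so the resulting rate is weighted by
request size rather than by the number of cma_alloc() calls.

```c
#include <stdio.h>

/* Read one unsigned decimal counter from a text file; 0 on success, -1 on error. */
int read_counter(const char *path, unsigned long long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%llu", out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Fraction of requested pages that could not be allocated, in [0, 1]. */
double cma_fail_rate(unsigned long long success, unsigned long long fail)
{
	unsigned long long total = success + fail;

	/* No allocation attempts yet: report 0 instead of dividing by zero. */
	return total ? (double)fail / (double)total : 0.0;
}

/*
 * Compute the failure rate of one CMA area from its two sysfs files.
 * "area" is the directory name under /sys/kernel/mm/cma/ (for example
 * "WIFI", which is only a placeholder name here).
 */
int cma_area_fail_rate(const char *area, double *rate)
{
	unsigned long long success, fail;
	char path[256];

	snprintf(path, sizeof(path),
		 "/sys/kernel/mm/cma/%s/alloc_pages_success", area);
	if (read_counter(path, &success))
		return -1;

	snprintf(path, sizeof(path),
		 "/sys/kernel/mm/cma/%s/alloc_pages_fail", area);
	if (read_counter(path, &fail))
		return -1;

	*rate = cma_fail_rate(success, fail);
	return 0;
}
```

A monitoring daemon would typically sample both files periodically and
compute the rate over the delta between two samples, since the counters
only ever grow.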

The cma_kobject is intentionally allocated dynamically so that its
lifetime follows kobject lifetime management.
https://lore.kernel.org/linux-mm/YCOAmXqt6dZkCQYs@kroah.com/
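
To make the lifetime rule above concrete, here is a simplified
userspace model of the pattern (an assumption-laden sketch, not kernel
code). struct obj stands in for struct kobject, struct area for the
statically allocated struct cma, and struct area_obj for struct
cma_kobject; obj_put(), area_attach() and area_obj_release() are
hypothetical names. The point it shows is that the refcounted object is
heap-allocated and freed only from its release callback, while the
static area merely holds a pointer to it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for struct kobject: a refcount plus a release hook. */
struct obj {
	int refcount;
	void (*release)(struct obj *obj);
};

/* Same idea as the kernel macro: recover the enclosing structure. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct area_obj;

/* Stands in for the statically allocated struct cma. */
struct area {
	struct area_obj *aobj;
};

/* Stands in for struct cma_kobject: refcounted part plus a back pointer. */
struct area_obj {
	struct obj obj;
	struct area *area;
};

/* Drop one reference; the last put invokes the release callback. */
void obj_put(struct obj *obj)
{
	if (--obj->refcount == 0)
		obj->release(obj);
}

/* Release callback: the only place where the dynamic object is freed. */
void area_obj_release(struct obj *obj)
{
	struct area_obj *aobj = container_of(obj, struct area_obj, obj);

	aobj->area->aobj = NULL;
	free(aobj);
}

/* Allocate the refcounted object and link it with its static area. */
int area_attach(struct area *area)
{
	struct area_obj *aobj = calloc(1, sizeof(*aobj));

	if (!aobj)
		return -1;
	aobj->obj.refcount = 1;
	aobj->obj.release = area_obj_release;
	aobj->area = area;
	area->aobj = aobj;
	return 0;
}
```

Embedding the refcounted object directly in the static array element
would leave nothing valid for the release callback to free, which is
why a separate dynamic allocation is used.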

Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: https://lore.kernel.org/linux-mm/20210316100433.17665-1-colin.king@canonical.com/
Addresses-Coverity: ("Dereference after null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
Andrew, you could apply this patch after reverting these patches:

mm: cma: fix potential null dereference on pointer cma
mm: cma: support sysfs

From v7 - https://lore.kernel.org/linux-mm/20210324205503.2132082-1-minchan@kernel.org/
From v6 - https://lore.kernel.org/linux-mm/20210324010547.4134370-1-minchan@kernel.org/
From v5 - https://lore.kernel.org/linux-mm/20210323195050.2577017-1-minchan@kernel.org/
From v4 - https://lore.kernel.org/linux-mm/20210309062333.3216138-1-minchan@kernel.org/
 * fix corruption - digetx@
 * refactoring - digetx@, jhubbard@, willy@

From v4 - https://lore.kernel.org/linux-mm/20210309062333.3216138-1-minchan@kernel.org/
 * fix corruption - digetx@

From v3 - https://lore.kernel.org/linux-mm/20210303205053.2906924-1-minchan@kernel.org/
 * fix ZERO_OR_NULL_PTR - kernel test robot
 * remove prefix cma - david@
 * resolve conflict with vmstat cma in mmotm - akpm@
 * rename stat name with success|fail

From v2 - https://lore.kernel.org/linux-mm/20210208180142.2765456-1-minchan@kernel.org/
 * sysfs doc and description modification - jhubbard

From v1 - https://lore.kernel.org/linux-mm/20210203155001.4121868-1-minchan@kernel.org/
 * fix sysfs build and refactoring - willy
 * rename and drop some attributes - jhubbard

 Documentation/ABI/testing/sysfs-kernel-mm-cma |  25 ++++
 mm/Kconfig                                    |   7 ++
 mm/Makefile                                   |   1 +
 mm/cma.c                                      |   8 +-
 mm/cma.h                                      |  23 ++++
 mm/cma_sysfs.c                                | 112 ++++++++++++++++++
 6 files changed, 174 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-cma
 create mode 100644 mm/cma_sysfs.c

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cma b/Documentation/ABI/testing/sysfs-kernel-mm-cma
new file mode 100644
index 000000000000..02b2bb60c296
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-cma
@@ -0,0 +1,25 @@
+What:		/sys/kernel/mm/cma/
+Date:		Feb 2021
+Contact:	Minchan Kim <minchan@kernel.org>
+Description:
+		/sys/kernel/mm/cma/ contains a subdirectory for each CMA
+		heap name (also sometimes called CMA areas).
+
+		Each CMA heap subdirectory (that is, each
+		/sys/kernel/mm/cma/<cma-heap-name> directory) contains the
+		following items:
+
+			alloc_pages_success
+			alloc_pages_fail
+
+What:		/sys/kernel/mm/cma/<cma-heap-name>/alloc_pages_success
+Date:		Feb 2021
+Contact:	Minchan Kim <minchan@kernel.org>
+Description:
+		the number of pages the CMA API succeeded in allocating
+
+What:		/sys/kernel/mm/cma/<cma-heap-name>/alloc_pages_fail
+Date:		Feb 2021
+Contact:	Minchan Kim <minchan@kernel.org>
+Description:
+		the number of pages the CMA API failed to allocate
diff --git a/mm/Kconfig b/mm/Kconfig
index 23a0e3c98ff0..3823a2314256 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -524,6 +524,13 @@ config CMA_DEBUGFS
 	help
 	  Turns on the DebugFS interface for CMA.
 
+config CMA_SYSFS
+	bool "CMA information through sysfs interface"
+	depends on CMA && SYSFS
+	help
+	  This option exposes some sysfs attributes to get information
+	  from CMA.
+
 config CMA_AREAS
 	int "Maximum count of the CMA areas"
 	depends on CMA
diff --git a/mm/Makefile b/mm/Makefile
index 9e284dba50ef..788c5ce5c0ef 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -112,6 +112,7 @@ obj-$(CONFIG_CMA)	+= cma.o
 obj-$(CONFIG_MEMORY_BALLOON) += balloon_compaction.o
 obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o
 obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o
+obj-$(CONFIG_CMA_SYSFS) += cma_sysfs.o
 obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
 obj-$(CONFIG_IDLE_PAGE_TRACKING) += page_idle.o
 obj-$(CONFIG_DEBUG_PAGE_REF) += debug_page_ref.o
diff --git a/mm/cma.c b/mm/cma.c
index 0361e289c31a..08c45157911a 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -507,10 +507,14 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 
 	pr_debug("%s(): returned %p\n", __func__, page);
 out:
-	if (page)
+	if (page) {
 		count_vm_event(CMA_ALLOC_SUCCESS);
-	else
+		cma_sysfs_account_success_pages(cma, count);
+	} else {
 		count_vm_event(CMA_ALLOC_FAIL);
+		if (cma)
+			cma_sysfs_account_fail_pages(cma, count);
+	}
 
 	return page;
 }
diff --git a/mm/cma.h b/mm/cma.h
index 42ae082cb067..68ffad4e430d 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -3,6 +3,12 @@
 #define __MM_CMA_H__
 
 #include <linux/debugfs.h>
+#include <linux/kobject.h>
+
+struct cma_kobject {
+	struct kobject kobj;
+	struct cma *cma;
+};
 
 struct cma {
 	unsigned long   base_pfn;
@@ -16,6 +22,14 @@ struct cma {
 	struct debugfs_u32_array dfs_bitmap;
 #endif
 	char name[CMA_MAX_NAME];
+#ifdef CONFIG_CMA_SYSFS
+	/* the number of CMA page successful allocations */
+	atomic64_t nr_pages_succeeded;
+	/* the number of CMA page allocation failures */
+	atomic64_t nr_pages_failed;
+	/* kobject requires dynamic object */
+	struct cma_kobject *cma_kobj;
+#endif
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];
@@ -26,4 +40,13 @@ static inline unsigned long cma_bitmap_maxno(struct cma *cma)
 	return cma->count >> cma->order_per_bit;
 }
 
+#ifdef CONFIG_CMA_SYSFS
+void cma_sysfs_account_success_pages(struct cma *cma, unsigned long nr_pages);
+void cma_sysfs_account_fail_pages(struct cma *cma, unsigned long nr_pages);
+#else
+static inline void cma_sysfs_account_success_pages(struct cma *cma,
+						   unsigned long nr_pages) {}
+static inline void cma_sysfs_account_fail_pages(struct cma *cma,
+						unsigned long nr_pages) {}
+#endif
 #endif
diff --git a/mm/cma_sysfs.c b/mm/cma_sysfs.c
new file mode 100644
index 000000000000..eb2f39caff59
--- /dev/null
+++ b/mm/cma_sysfs.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * CMA SysFS Interface
+ *
+ * Copyright (c) 2021 Minchan Kim <minchan@kernel.org>
+ */
+
+#include <linux/cma.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+
+#include "cma.h"
+
+#define CMA_ATTR_RO(_name) \
+	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+
+void cma_sysfs_account_success_pages(struct cma *cma, unsigned long nr_pages)
+{
+	atomic64_add(nr_pages, &cma->nr_pages_succeeded);
+}
+
+void cma_sysfs_account_fail_pages(struct cma *cma, unsigned long nr_pages)
+{
+	atomic64_add(nr_pages, &cma->nr_pages_failed);
+}
+
+static inline struct cma *cma_from_kobj(struct kobject *kobj)
+{
+	return container_of(kobj, struct cma_kobject, kobj)->cma;
+}
+
+static ssize_t alloc_pages_success_show(struct kobject *kobj,
+					struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%llu\n",
+			  atomic64_read(&cma->nr_pages_succeeded));
+}
+CMA_ATTR_RO(alloc_pages_success);
+
+static ssize_t alloc_pages_fail_show(struct kobject *kobj,
+				     struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%llu\n", atomic64_read(&cma->nr_pages_failed));
+}
+CMA_ATTR_RO(alloc_pages_fail);
+
+static void cma_kobj_release(struct kobject *kobj)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+	struct cma_kobject *cma_kobj = cma->cma_kobj;
+
+	kfree(cma_kobj);
+	cma->cma_kobj = NULL;
+}
+
+static struct attribute *cma_attrs[] = {
+	&alloc_pages_success_attr.attr,
+	&alloc_pages_fail_attr.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(cma);
+
+static struct kobj_type cma_ktype = {
+	.release = cma_kobj_release,
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = cma_groups,
+};
+
+static int __init cma_sysfs_init(void)
+{
+	struct kobject *cma_kobj_root;
+	struct cma_kobject *cma_kobj;
+	struct cma *cma;
+	int i, err;
+
+	cma_kobj_root = kobject_create_and_add("cma", mm_kobj);
+	if (!cma_kobj_root)
+		return -ENOMEM;
+
+	for (i = 0; i < cma_area_count; i++) {
+		cma_kobj = kzalloc(sizeof(*cma_kobj), GFP_KERNEL);
+		if (!cma_kobj) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		cma = &cma_areas[i];
+		cma->cma_kobj = cma_kobj;
+		cma_kobj->cma = cma;
+		err = kobject_init_and_add(&cma_kobj->kobj, &cma_ktype,
+					   cma_kobj_root, "%s", cma->name);
+		if (err) {
+			kobject_put(&cma_kobj->kobj);
+			goto out;
+		}
+	}
+
+	return 0;
+out:
+	while (--i >= 0) {
+		cma = &cma_areas[i];
+		kobject_put(&cma->cma_kobj->kobj);
+	}
+	kobject_put(cma_kobj_root);
+
+	return err;
+}
+subsys_initcall(cma_sysfs_init);
-- 
2.31.0.291.g576ba9dcdaf-goog




* Re: [PATCH v8] mm: cma: support sysfs
  2021-03-24 23:07 [PATCH v8] mm: cma: support sysfs Minchan Kim
@ 2021-03-26 13:59 ` Anders Roxell
  2021-03-26 15:51   ` Minchan Kim
  0 siblings, 1 reply; 4+ messages in thread
From: Anders Roxell @ 2021-03-26 13:59 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, Greg Kroah-Hartman, surenb,
	joaodias, jhubbard, Matthew Wilcox, digetx, Colin Ian King

On Thu, 25 Mar 2021 at 00:09, Minchan Kim <minchan@kernel.org> wrote:
> This patch introduces sysfs statistics for CMA, in order to provide
> some basic monitoring of the CMA allocator.

When I build an arm64 kernel (allmodconfig - boot selftest) on today's
next tag: next-20210326, I see this issue when I'm booting in qemu.

[    0.985891][    T9] Callback from call_rcu_tasks() invoked.
[    1.008860][    T1] smp: Bringing up secondary CPUs ...
[    1.012655][    T1] smp: Brought up 1 node, 1 CPU
[    1.015194][    T1] SMP: Total of 1 processors activated.
[    1.018987][    T1] CPU features: detected: 32-bit EL0 Support
[    1.021995][    T1] CPU features: detected: CRC32 instructions
[    1.026047][    T1] CPU features: detected: 32-bit EL1 Support
[    1.033728][    T1] CPU features: emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching
[    2.140828][    T1] CPU: All CPU(s) started at EL1
[    2.144773][   T17] alternatives: patching kernel code
[  132.866390][    C0] watchdog: BUG: soft lockup - CPU#0 stuck for 25s! [pgdatinit0:20]
[  132.870865][    C0] Modules linked in:
[  132.873037][    C0] irq event stamp: 739758
[  132.875308][    C0] hardirqs last  enabled at (739757): [<ffff8000126fb3d0>] _raw_spin_unlock_irqrestore+0x90/0x100
[  132.880740][    C0] hardirqs last disabled at (739758): [<ffff8000126e30a4>] enter_el1_irq_or_nmi+0xa4/0xc0
[  132.885801][    C0] softirqs last  enabled at (739056): [<ffff800010010f98>] __do_softirq+0x8b8/0x9ac
[  132.890571][    C0] softirqs last disabled at (739051): [<ffff80001013742c>] __irq_exit_rcu+0x1ac/0x240
[  132.895560][    C0] CPU: 0 PID: 20 Comm: pgdatinit0 Not tainted 5.12.0-rc4-next-20210326-00008-g23921ff47279 #1
[  132.900759][    C0] Hardware name: linux,dummy-virt (DT)
[  132.903637][    C0] pstate: 40400005 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[  132.907212][    C0] pc : _raw_spin_unlock_irqrestore+0xa4/0x100
[  132.910432][    C0] lr : _raw_spin_unlock_irqrestore+0x90/0x100
[  132.913647][    C0] sp : ffff000007b9f640
[  132.915832][    C0] x29: ffff000007b9f640 x28: ffff800016954518
[  132.919237][    C0] x27: 000000000000000e x26: dead000000000100
[  132.922689][    C0] x25: dead000000000122 x24: 00000000000559b0
[  132.926098][    C0] x23: ffff80001550e000 x22: ffff800016954530
[  132.929479][    C0] x21: ffff800016954518 x20: 0000000000000000
[  132.932901][    C0] x19: ffff800010f662f4 x18: 0000000000001530
[  132.936312][    C0] x17: 0000000000001470 x16: 0000000000005518
[  132.939723][    C0] x15: 0000000000001578 x14: ffff800010189520
[  132.943107][    C0] x13: ffff8000107592e0 x12: ffff600000f73eb1
[  132.946520][    C0] x11: 1fffe00000f73eb0 x10: ffff600000f73eb0
[  132.949914][    C0] x9 : dfff800000000000 x8 : ffff000007b9f587
[  132.953312][    C0] x7 : 0000000000000001 x6 : 00009fffff08c150
[  132.956713][    C0] x5 : 0000000000000000 x4 : 0000000000000000
[  132.960117][    C0] x3 : ffff000007b90040 x2 : 000000000005e6fd
[  132.963521][    C0] x1 : 00000000000000c0 x0 : 0000000000000080
[  132.966889][    C0] Call trace:
[  132.968667][    C0]  _raw_spin_unlock_irqrestore+0xa4/0x100
[  132.971754][    C0]  __debug_check_no_obj_freed+0x1d4/0x2a0
[  132.974890][    C0]  debug_check_no_obj_freed+0x20/0x80
[  132.977813][    C0]  __free_pages_ok+0x5a0/0x740
[  132.980384][    C0]  __free_pages_core+0x24c/0x280
[  132.983091][    C0]  deferred_free_range+0x6c/0xbc
[  132.985826][    C0]  deferred_init_maxorder+0x2d0/0x350
[  132.988735][    C0]  deferred_init_memmap_chunk+0xc8/0x124
[  132.991784][    C0]  padata_do_multithreaded+0x15c/0x578
[  132.994723][    C0]  deferred_init_memmap+0x26c/0x364
[  132.997560][    C0]  kthread+0x23c/0x260
[  132.999851][    C0]  ret_from_fork+0x10/0x18
[  133.002324][    C0] Kernel panic - not syncing: softlockup: hung tasks
[  133.005767][    C0] CPU: 0 PID: 20 Comm: pgdatinit0 Tainted: G       L    5.12.0-rc4-next-20210326-00008-g23921ff47279 #1
[  133.011613][    C0] Hardware name: linux,dummy-virt (DT)
[  133.014435][    C0] Call trace:
[  133.016143][    C0]  dump_backtrace+0x0/0x420
[  133.018617][    C0]  show_stack+0x38/0x60
[  133.020882][    C0]  dump_stack+0x1fc/0x2c8
[  133.023343][    C0]  panic+0x304/0x5d8
[  133.025567][    C0]  watchdog_timer_fn+0x4ac/0x500
[  133.028209][    C0]  __run_hrtimer+0x770/0xba0
[  133.030734][    C0]  __hrtimer_run_queues+0x1a0/0x220
[  133.033537][    C0]  hrtimer_run_queues+0x20c/0x240
[  133.036202][    C0]  update_process_times+0xbc/0x1a0
[  133.038997][    C0]  tick_periodic+0x27c/0x2c0
[  133.041510][    C0]  tick_handle_periodic+0x44/0x120
[  133.044267][    C0]  arch_timer_handler_virt+0x68/0xa0
[  133.047226][    C0]  handle_percpu_devid_irq+0x118/0x2a0
[  133.050229][    C0]  __handle_domain_irq+0x150/0x1c0
[  133.052959][    C0]  gic_handle_irq+0x130/0x180
[  133.055505][    C0]  el1_irq+0xc0/0x15c
[  133.057723][    C0]  _raw_spin_unlock_irqrestore+0xa4/0x100
[  133.060792][    C0]  __debug_check_no_obj_freed+0x1d4/0x2a0
[  133.063869][    C0]  debug_check_no_obj_freed+0x20/0x80
[  133.066813][    C0]  __free_pages_ok+0x5a0/0x740
[  133.069409][    C0]  __free_pages_core+0x24c/0x280
[  133.072127][    C0]  deferred_free_range+0x6c/0xbc
[  133.074847][    C0]  deferred_init_maxorder+0x2d0/0x350
[  133.077803][    C0]  deferred_init_memmap_chunk+0xc8/0x124
[  133.080834][    C0]  padata_do_multithreaded+0x15c/0x578
[  133.083791][    C0]  deferred_init_memmap+0x26c/0x364
[  133.086614][    C0]  kthread+0x23c/0x260
[  133.088879][    C0]  ret_from_fork+0x10/0x18
[  133.092092][    C0] ---[ end Kernel panic - not syncing: softlockup: hung tasks ]---

Full log [1], and my .config [2].

I bisected down to patch 799815f497e2 ("mm: cma: support sysfs").

When I revert
799815f497e2 ("mm: cma: support sysfs")
7af97692f30d ("mm: cma: fix potential null dereference on pointer cma")

The kernel boots fine.

Any idea what's happening?

Cheers,
Anders
[1] http://ix.io/2U9S
[2] http://ix.io/2Ua3



* Re: [PATCH v8] mm: cma: support sysfs
  2021-03-26 13:59 ` Anders Roxell
@ 2021-03-26 15:51   ` Minchan Kim
  2021-03-26 17:27     ` Anders Roxell
  0 siblings, 1 reply; 4+ messages in thread
From: Minchan Kim @ 2021-03-26 15:51 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Andrew Morton, linux-mm, LKML, Greg Kroah-Hartman, surenb,
	joaodias, jhubbard, Matthew Wilcox, digetx, Colin Ian King

On Fri, Mar 26, 2021 at 02:59:30PM +0100, Anders Roxell wrote:
> Any idea whats happening?

Hi Anders,

Dmitry reported a crash, and a fix for it was posted. (However, your
call stack is not the same and doesn't show any CMA code, so I am not
sure it's the same crash.)

https://lore.kernel.org/linux-mm/20210324192044.1505747-1-minchan@kernel.org/

However, in the end, the fix was folded into the original patch to
replace it. The result is this v8 patch. So, could you try it?

1. revert 7af97692f30d ("mm: cma: fix potential null dereference on pointer cma")
2. revert 799815f497e2 ("mm: cma: support sysfs")
3. apply this v8 patch.

Thank you.







* Re: [PATCH v8] mm: cma: support sysfs
  2021-03-26 15:51   ` Minchan Kim
@ 2021-03-26 17:27     ` Anders Roxell
  0 siblings, 0 replies; 4+ messages in thread
From: Anders Roxell @ 2021-03-26 17:27 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, Greg Kroah-Hartman,
	Suren Baghdasaryan, John Dias, jhubbard, Matthew Wilcox, digetx,
	Colin Ian King

On Fri, 26 Mar 2021 at 16:51, Minchan Kim <minchan@kernel.org> wrote:
> Hi Anders,
>
> Dmitry reported the crash(However, your callstack is not the same
> and didn't show any CMA stuffs so I am not sure it's same crash)
> and posted the fix.
>
> https://lore.kernel.org/linux-mm/20210324192044.1505747-1-minchan@kernel.org/
>
> However, in the end, it was folded into original patchset to replace it.

Oh, thank you for letting me know, sorry that I reported on the wrong one.

> That is an this v8 patch. So, could you try it?

I was able to boot the kernel.

Cheers,
Anders

