linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* incoming
@ 2022-01-29  2:13 Andrew Morton
  2022-01-29  2:14 ` [patch 1/7] include/linux/sysctl.h: fix register_sysctl_mount_point() return type Andrew Morton
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

12 patches, based on 169387e2aa291a4e3cb856053730fe99d6cec06f.

Subsystems affected by this patch series:

  sysctl
  binfmt
  ia64
  mm/memory-failure
  mm/folios
  selftests
  mm/kasan
  mm/psi
  ocfs2

Subsystem: sysctl

    Andrew Morton <akpm@linux-foundation.org>:
      include/linux/sysctl.h: fix register_sysctl_mount_point() return type

Subsystem: binfmt

    Tong Zhang <ztong0001@gmail.com>:
      binfmt_misc: fix crash when load/unload module

Subsystem: ia64

    Randy Dunlap <rdunlap@infradead.org>:
      ia64: make IA64_MCA_RECOVERY bool instead of tristate

Subsystem: mm/memory-failure

    Joao Martins <joao.m.martins@oracle.com>:
      memory-failure: fetch compound_head after pgmap_pfn_valid()

Subsystem: mm/folios

    Wei Yang <richard.weiyang@gmail.com>:
      mm: page->mapping folio->mapping should have the same offset

Subsystem: selftests

    Maor Gottlieb <maorg@nvidia.com>:
      tools/testing/scatterlist: add missing defines

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      kasan: test: fix compatibility with FORTIFY_SOURCE

    Peter Collingbourne <pcc@google.com>:
      mm, kasan: use compare-exchange operation to set KASAN page tag

Subsystem: mm/psi

    Suren Baghdasaryan <surenb@google.com>:
      psi: fix "no previous prototype" warnings when CONFIG_CGROUPS=n
      psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n

Subsystem: ocfs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
    Patch series "ocfs2: fix a deadlock case":
      jbd2: export jbd2_journal_[grab|put]_journal_head
      ocfs2: fix a deadlock when commit trans

 arch/ia64/Kconfig                    |    2 
 fs/binfmt_misc.c                     |    8 +--
 fs/jbd2/journal.c                    |    2 
 fs/ocfs2/suballoc.c                  |   25 ++++-------
 include/linux/mm.h                   |   17 +++++--
 include/linux/mm_types.h             |    1 
 include/linux/psi.h                  |   11 ++--
 include/linux/sysctl.h               |    2 
 kernel/sched/psi.c                   |   79 ++++++++++++++++++-----------------
 lib/test_kasan.c                     |    5 ++
 mm/memory-failure.c                  |    6 ++
 tools/testing/scatterlist/linux/mm.h |    3 -
 12 files changed, 91 insertions(+), 70 deletions(-)



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 1/7] include/linux/sysctl.h: fix register_sysctl_mount_point() return type
  2022-01-29  2:13 incoming Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  2:14 ` [patch 2/7] binfmt_misc: fix crash when load/unload module Andrew Morton
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: akpm, linux-mm, lkp, mcgrof, mm-commits, torvalds, ztong0001

From: Andrew Morton <akpm@linux-foundation.org>
Subject: include/linux/sysctl.h: fix register_sysctl_mount_point() return type

The CONFIG_SYSCTL=n stub returns the wrong type.

Fixes: ee9efac48a082 ("sysctl: add helper to register a sysctl mount point")
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/sysctl.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/sysctl.h~include-linux-sysctlh-fix-register_sysctl_mount_point-return-type
+++ a/include/linux/sysctl.h
@@ -265,7 +265,7 @@ static inline struct ctl_table_header *r
 	return NULL;
 }
 
-static inline struct sysctl_header *register_sysctl_mount_point(const char *path)
+static inline struct ctl_table_header *register_sysctl_mount_point(const char *path)
 {
 	return NULL;
 }
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 2/7] binfmt_misc: fix crash when load/unload module
  2022-01-29  2:13 incoming Andrew Morton
  2022-01-29  2:14 ` [patch 1/7] include/linux/sysctl.h: fix register_sysctl_mount_point() return type Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  2:14 ` [patch 3/7] memory-failure: fetch compound_head after pgmap_pfn_valid() Andrew Morton
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: akpm, brauner, ebiederm, keescook, linux-mm, mcgrof, mm-commits,
	torvalds, yzaikin, ztong0001

From: Tong Zhang <ztong0001@gmail.com>
Subject: binfmt_misc: fix crash when load/unload module

We should unregister the table upon module unload otherwise something
horrible will happen when we load binfmt_misc module again.  Also note
that we should keep value returned by register_sysctl_mount_point() and
release it later, otherwise it will leak.

Also, per Christian's comment, to fully restore the old behavior that
won't break userspace the check(binfmt_misc_header) should be eliminated.

reproduce:
modprobe binfmt_misc
modprobe -r binfmt_misc
modprobe binfmt_misc
modprobe -r binfmt_misc
modprobe binfmt_misc

[   18.032038] Call Trace:
[   18.032108]  <TASK>
[   18.032169]  dump_stack_lvl+0x34/0x44
[   18.032273]  __register_sysctl_table+0x6f4/0x720
[   18.032397]  ? preempt_count_sub+0xf/0xb0
[   18.032508]  ? 0xffffffffc0040000
[   18.032600]  init_misc_binfmt+0x2d/0x1000 [binfmt_misc]
[   18.042520] binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point
modprobe: can't load module binfmt_misc (kernel/fs/binfmt_misc.ko): Cannot allocate memory
[   18.063549] binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point
[   18.204779] BUG: unable to handle page fault for address: fffffbfff8004802

Link: https://lkml.kernel.org/r/20220124181812.1869535-2-ztong0001@gmail.com
Fixes: 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file")
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Co-developed-by: Christian Brauner<brauner@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/binfmt_misc.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/fs/binfmt_misc.c~binfmt_misc-fix-crash-when-load-unload-module
+++ a/fs/binfmt_misc.c
@@ -817,20 +817,20 @@ static struct file_system_type bm_fs_typ
 };
 MODULE_ALIAS_FS("binfmt_misc");
 
+static struct ctl_table_header *binfmt_misc_header;
+
 static int __init init_misc_binfmt(void)
 {
 	int err = register_filesystem(&bm_fs_type);
 	if (!err)
 		insert_binfmt(&misc_format);
-	if (!register_sysctl_mount_point("fs/binfmt_misc")) {
-		pr_warn("Failed to create fs/binfmt_misc sysctl mount point");
-		return -ENOMEM;
-	}
+	binfmt_misc_header = register_sysctl_mount_point("fs/binfmt_misc");
 	return 0;
 }
 
 static void __exit exit_misc_binfmt(void)
 {
+	unregister_sysctl_table(binfmt_misc_header);
 	unregister_binfmt(&misc_format);
 	unregister_filesystem(&bm_fs_type);
 }
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 3/7] memory-failure: fetch compound_head after pgmap_pfn_valid()
  2022-01-29  2:13 incoming Andrew Morton
  2022-01-29  2:14 ` [patch 1/7] include/linux/sysctl.h: fix register_sysctl_mount_point() return type Andrew Morton
  2022-01-29  2:14 ` [patch 2/7] binfmt_misc: fix crash when load/unload module Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  2:14 ` [patch 4/7] tools/testing/scatterlist: add missing defines Andrew Morton
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: akpm, dan.j.williams, jane.chu, joao.m.martins, linux-mm,
	mm-commits, naoya.horiguchi, songmuchun, torvalds

From: Joao Martins <joao.m.martins@oracle.com>
Subject: memory-failure: fetch compound_head after pgmap_pfn_valid()

memory_failure_dev_pagemap() at the moment assumes base pages (e.g. 
dax_lock_page()).  For devmap with compound pages fetch the compound_head
in case a tail page memory failure is being handled.

Currently this is a nop, but in the advent of compound pages in
dev_pagemap it allows memory_failure_dev_pagemap() to keep working.

Without this fix memory-failure handling (i.e.  MCEs on pmem) with
device-dax configured namespaces will regress (and crash).  

Link: https://lkml.kernel.org/r/20211202204422.26777-2-joao.m.martins@oracle.com
Reported-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/mm/memory-failure.c~memory-failure-fetch-compound_head-after-pgmap_pfn_valid
+++ a/mm/memory-failure.c
@@ -1596,6 +1596,12 @@ static int memory_failure_dev_pagemap(un
 	}
 
 	/*
+	 * Pages instantiated by device-dax (not filesystem-dax)
+	 * may be compound pages.
+	 */
+	page = compound_head(page);
+
+	/*
 	 * Prevent the inode from being freed while we are interrogating
 	 * the address_space, typically this would be handled by
 	 * lock_page(), but dax pages do not use the page lock. This
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 4/7] tools/testing/scatterlist: add missing defines
  2022-01-29  2:13 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2022-01-29  2:14 ` [patch 3/7] memory-failure: fetch compound_head after pgmap_pfn_valid() Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  2:14 ` [patch 5/7] mm, kasan: use compare-exchange operation to set KASAN page tag Andrew Morton
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: akpm, bigeasy, hch, linux-mm, maorg, mm-commits, tglx, torvalds

From: Maor Gottlieb <maorg@nvidia.com>
Subject: tools/testing/scatterlist: add missing defines

The cited commits replaced preemptible with pagefault_disabled and
flush_kernel_dcache_page with flush_dcache_page respectively, hence need
to update the corresponding defines in the test.

scatterlist.c: In function ‘sg_miter_stop’:
scatterlist.c:919:4: warning: implicit declaration of function ‘flush_dcache_page’ [-Wimplicit-function-declaration]
    flush_dcache_page(miter->page);
    ^~~~~~~~~~~~~~~~~
In file included from ./linux/scatterlist.h:8:0,
                 from scatterlist.c:9:
scatterlist.c:922:18: warning: implicit declaration of function ‘pagefault_disabled’ [-Wimplicit-function-declaration]
    WARN_ON_ONCE(!pagefault_disabled());
                  ^
./linux/mm.h:23:25: note: in definition of macro ‘WARN_ON_ONCE’
  int __ret_warn_on = !!(condition);                      \
                         ^~~~~~~~~

Link: https://lkml.kernel.org/r/20220118082105.1737320-1-maorg@nvidia.com
Fixes: 723aca208516 ("mm/scatterlist: replace the !preemptible warning in sg_miter_stop()")
Fixes: 0e84f5dbf8d6 ("scatterlist: replace flush_kernel_dcache_page with flush_dcache_page")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/scatterlist/linux/mm.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/tools/testing/scatterlist/linux/mm.h~tools-testing-scatterlist-add-missing-defines
+++ a/tools/testing/scatterlist/linux/mm.h
@@ -74,7 +74,7 @@ static inline unsigned long page_to_phys
 	      __UNIQUE_ID(min1_), __UNIQUE_ID(min2_),   \
 	      x, y)
 
-#define preemptible() (1)
+#define pagefault_disabled() (0)
 
 static inline void *kmap(struct page *page)
 {
@@ -127,6 +127,7 @@ kmalloc_array(unsigned int n, unsigned i
 #define kmemleak_free(a)
 
 #define PageSlab(p) (0)
+#define flush_dcache_page(p)
 
 #define MAX_ERRNO	4095
 
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 5/7] mm, kasan: use compare-exchange operation to set KASAN page tag
  2022-01-29  2:13 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2022-01-29  2:14 ` [patch 4/7] tools/testing/scatterlist: add missing defines Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  2:14 ` [patch 6/7] psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n Andrew Morton
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: akpm, andreyknvl, linux-mm, mm-commits, pcc, peterz, stable, torvalds

From: Peter Collingbourne <pcc@google.com>
Subject: mm, kasan: use compare-exchange operation to set KASAN page tag

It has been reported that the tag setting operation on newly-allocated
pages can cause the page flags to be corrupted when performed concurrently
with other flag updates as a result of the use of non-atomic operations. 
Fix the problem by using a compare-exchange loop to update the tag.

Link: https://lkml.kernel.org/r/20220120020148.1632253-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0cec63101
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |   17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

--- a/include/linux/mm.h~mm-use-compare-exchange-operation-to-set-kasan-page-tag
+++ a/include/linux/mm.h
@@ -1506,11 +1506,18 @@ static inline u8 page_kasan_tag(const st
 
 static inline void page_kasan_tag_set(struct page *page, u8 tag)
 {
-	if (kasan_enabled()) {
-		tag ^= 0xff;
-		page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
-		page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
-	}
+	unsigned long old_flags, flags;
+
+	if (!kasan_enabled())
+		return;
+
+	tag ^= 0xff;
+	old_flags = READ_ONCE(page->flags);
+	do {
+		flags = old_flags;
+		flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+		flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+	} while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
 }
 
 static inline void page_kasan_tag_reset(struct page *page)
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 6/7] psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n
  2022-01-29  2:13 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2022-01-29  2:14 ` [patch 5/7] mm, kasan: use compare-exchange operation to set KASAN page tag Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  2:14 ` [patch 7/7] ocfs2: fix a deadlock when commit trans Andrew Morton
  2022-01-29  4:25 ` incoming Matthew Wilcox
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: akpm, hannes, linux-mm, lkp, mm-commits, surenb, torvalds

From: Suren Baghdasaryan <surenb@google.com>
Subject: psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n

When CONFIG_PROC_FS is disabled psi code generates the following warnings:

kernel/sched/psi.c:1364:30: warning: 'psi_cpu_proc_ops' defined but not used [-Wunused-const-variable=]
    1364 | static const struct proc_ops psi_cpu_proc_ops = {
         |                              ^~~~~~~~~~~~~~~~
kernel/sched/psi.c:1355:30: warning: 'psi_memory_proc_ops' defined but not used [-Wunused-const-variable=]
    1355 | static const struct proc_ops psi_memory_proc_ops = {
         |                              ^~~~~~~~~~~~~~~~~~~
kernel/sched/psi.c:1346:30: warning: 'psi_io_proc_ops' defined but not used [-Wunused-const-variable=]
    1346 | static const struct proc_ops psi_io_proc_ops = {
         |                              ^~~~~~~~~~~~~~~

Make definitions of these structures and related functions conditional on
CONFIG_PROC_FS config.

Link: https://lkml.kernel.org/r/20220119223940.787748-3-surenb@google.com
Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/sched/psi.c |   79 ++++++++++++++++++++++---------------------
 1 file changed, 41 insertions(+), 38 deletions(-)

--- a/kernel/sched/psi.c~psi-fix-defined-but-not-used-warnings-when-config_proc_fs=n
+++ a/kernel/sched/psi.c
@@ -1082,44 +1082,6 @@ int psi_show(struct seq_file *m, struct
 	return 0;
 }
 
-static int psi_io_show(struct seq_file *m, void *v)
-{
-	return psi_show(m, &psi_system, PSI_IO);
-}
-
-static int psi_memory_show(struct seq_file *m, void *v)
-{
-	return psi_show(m, &psi_system, PSI_MEM);
-}
-
-static int psi_cpu_show(struct seq_file *m, void *v)
-{
-	return psi_show(m, &psi_system, PSI_CPU);
-}
-
-static int psi_open(struct file *file, int (*psi_show)(struct seq_file *, void *))
-{
-	if (file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
-		return -EPERM;
-
-	return single_open(file, psi_show, NULL);
-}
-
-static int psi_io_open(struct inode *inode, struct file *file)
-{
-	return psi_open(file, psi_io_show);
-}
-
-static int psi_memory_open(struct inode *inode, struct file *file)
-{
-	return psi_open(file, psi_memory_show);
-}
-
-static int psi_cpu_open(struct inode *inode, struct file *file)
-{
-	return psi_open(file, psi_cpu_show);
-}
-
 struct psi_trigger *psi_trigger_create(struct psi_group *group,
 			char *buf, size_t nbytes, enum psi_res res)
 {
@@ -1278,6 +1240,45 @@ __poll_t psi_trigger_poll(void **trigger
 	return ret;
 }
 
+#ifdef CONFIG_PROC_FS
+static int psi_io_show(struct seq_file *m, void *v)
+{
+	return psi_show(m, &psi_system, PSI_IO);
+}
+
+static int psi_memory_show(struct seq_file *m, void *v)
+{
+	return psi_show(m, &psi_system, PSI_MEM);
+}
+
+static int psi_cpu_show(struct seq_file *m, void *v)
+{
+	return psi_show(m, &psi_system, PSI_CPU);
+}
+
+static int psi_open(struct file *file, int (*psi_show)(struct seq_file *, void *))
+{
+	if (file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
+		return -EPERM;
+
+	return single_open(file, psi_show, NULL);
+}
+
+static int psi_io_open(struct inode *inode, struct file *file)
+{
+	return psi_open(file, psi_io_show);
+}
+
+static int psi_memory_open(struct inode *inode, struct file *file)
+{
+	return psi_open(file, psi_memory_show);
+}
+
+static int psi_cpu_open(struct inode *inode, struct file *file)
+{
+	return psi_open(file, psi_cpu_show);
+}
+
 static ssize_t psi_write(struct file *file, const char __user *user_buf,
 			 size_t nbytes, enum psi_res res)
 {
@@ -1392,3 +1393,5 @@ static int __init psi_proc_init(void)
 	return 0;
 }
 module_init(psi_proc_init);
+
+#endif /* CONFIG_PROC_FS */
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch 7/7] ocfs2: fix a deadlock when commit trans
  2022-01-29  2:13 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2022-01-29  2:14 ` [patch 6/7] psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n Andrew Morton
@ 2022-01-29  2:14 ` Andrew Morton
  2022-01-29  4:25 ` incoming Matthew Wilcox
  7 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  2:14 UTC (permalink / raw)
  To: adilger.kernel, akpm, gautham.ananthakrishna, gechangwei, ghe,
	jlbec, joseph.qi, junxiao.bi, linux-mm, mark, mm-commits,
	piaojun, saeed.mirzamohammadi, stable, torvalds, tytso

From: Joseph Qi <joseph.qi@linux.alibaba.com>
Subject: ocfs2: fix a deadlock when commit trans

commit 6f1b228529ae introduces a regression which can deadlock as follows:

Task1:                              Task2:
jbd2_journal_commit_transaction     ocfs2_test_bg_bit_allocatable
spin_lock(&jh->b_state_lock)        jbd_lock_bh_journal_head
__jbd2_journal_remove_checkpoint    spin_lock(&jh->b_state_lock)
jbd2_journal_put_journal_head
jbd_lock_bh_journal_head

Task1 and Task2 lock bh->b_state and jh->b_state_lock in different
order, which finally result in a deadlock.

So use jbd2_journal_[grab|put]_journal_head instead in
ocfs2_test_bg_bit_allocatable() to fix it.

Link: https://lkml.kernel.org/r/20220121071205.100648-3-joseph.qi@linux.alibaba.com
Fixes: 6f1b228529ae ("ocfs2: fix race between searching chunks and release journal_head from buffer_head")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Reported-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/suballoc.c |   25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

--- a/fs/ocfs2/suballoc.c~ocfs2-fix-a-deadlock-when-commit-trans
+++ a/fs/ocfs2/suballoc.c
@@ -1251,26 +1251,23 @@ static int ocfs2_test_bg_bit_allocatable
 {
 	struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data;
 	struct journal_head *jh;
-	int ret = 1;
+	int ret;
 
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
 
-	if (!buffer_jbd(bg_bh))
+	jh = jbd2_journal_grab_journal_head(bg_bh);
+	if (!jh)
 		return 1;
 
-	jbd_lock_bh_journal_head(bg_bh);
-	if (buffer_jbd(bg_bh)) {
-		jh = bh2jh(bg_bh);
-		spin_lock(&jh->b_state_lock);
-		bg = (struct ocfs2_group_desc *) jh->b_committed_data;
-		if (bg)
-			ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
-		else
-			ret = 1;
-		spin_unlock(&jh->b_state_lock);
-	}
-	jbd_unlock_bh_journal_head(bg_bh);
+	spin_lock(&jh->b_state_lock);
+	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
+	if (bg)
+		ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
+	else
+		ret = 1;
+	spin_unlock(&jh->b_state_lock);
+	jbd2_journal_put_journal_head(jh);
 
 	return ret;
 }
_


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: incoming
  2022-01-29  2:13 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2022-01-29  2:14 ` [patch 7/7] ocfs2: fix a deadlock when commit trans Andrew Morton
@ 2022-01-29  4:25 ` Matthew Wilcox
  2022-01-29  6:23   ` incoming Andrew Morton
  7 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2022-01-29  4:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, mm-commits, linux-mm

On Fri, Jan 28, 2022 at 06:13:41PM -0800, Andrew Morton wrote:
> 12 patches, based on 169387e2aa291a4e3cb856053730fe99d6cec06f.
  ^^

I see 7?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: incoming
  2022-01-29  4:25 ` incoming Matthew Wilcox
@ 2022-01-29  6:23   ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2022-01-29  6:23 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Linus Torvalds, mm-commits, linux-mm

On Sat, 29 Jan 2022 04:25:33 +0000 Matthew Wilcox <willy@infradead.org> wrote:

> On Fri, Jan 28, 2022 at 06:13:41PM -0800, Andrew Morton wrote:
> > 12 patches, based on 169387e2aa291a4e3cb856053730fe99d6cec06f.
>   ^^
> 
> I see 7?

Crap, sorry, ignore all this, shall redo tomorrow.

(It wasn't a good day over here.  The thing with disk drives is that
the bigger they are, the harder they fall).



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-01-29  6:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-29  2:13 incoming Andrew Morton
2022-01-29  2:14 ` [patch 1/7] include/linux/sysctl.h: fix register_sysctl_mount_point() return type Andrew Morton
2022-01-29  2:14 ` [patch 2/7] binfmt_misc: fix crash when load/unload module Andrew Morton
2022-01-29  2:14 ` [patch 3/7] memory-failure: fetch compound_head after pgmap_pfn_valid() Andrew Morton
2022-01-29  2:14 ` [patch 4/7] tools/testing/scatterlist: add missing defines Andrew Morton
2022-01-29  2:14 ` [patch 5/7] mm, kasan: use compare-exchange operation to set KASAN page tag Andrew Morton
2022-01-29  2:14 ` [patch 6/7] psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n Andrew Morton
2022-01-29  2:14 ` [patch 7/7] ocfs2: fix a deadlock when commit trans Andrew Morton
2022-01-29  4:25 ` incoming Matthew Wilcox
2022-01-29  6:23   ` incoming Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).