linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sven Schnelle <svens@linux.ibm.com>
To: Liam Howlett <liam.howlett@oracle.com>
Cc: Heiko Carstens <hca@linux.ibm.com>,
	Guenter Roeck <linux@roeck-us.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mapletree-vs-khugepaged
Date: Mon, 16 May 2022 19:10:22 +0200	[thread overview]
Message-ID: <yt9dwnel4d4x.fsf@linux.ibm.com> (raw)
In-Reply-To: <20220516155042.axgwex2enlf54n5m@revolver> (Liam Howlett's message of "Mon, 16 May 2022 15:50:48 +0000")

[-- Attachment #1: Type: text/plain, Size: 3143 bytes --]

Liam Howlett <liam.howlett@oracle.com> writes:

> * Sven Schnelle <svens@linux.ibm.com> [220516 11:37]:
>> Hi Liam,
>> 
>> Liam Howlett <liam.howlett@oracle.com> writes:
>> 
>> > * Sven Schnelle <svens@linux.ibm.com> [220515 16:02]:
>> >
>> > I tried the above on my qemu s390 with kernel 5.18.0-rc6-next-20220513,
>> > but it runs without issue, return code is 0.  Is there something the VM
>> > needs to have for this to trigger?
>> 
>> A coworker said the same. Reason for this seems to be that i've run the
>> code in a unittest environment which seems to make a difference. When
>> compiling the code above with gcc on my system it also doesn't crash.
>> So i have to figure out what makes this unittest binary special.
>> 
>> >> I've added a few debug statements to the maple tree code:
>> >> 
>> >> [   27.769641] mas_next_entry: offset=14
>> >> [   27.769642] mas_next_nentry: entry = 0e00000000000000, slots=0000000090249f80, mas->offset=15 count=14
>> >
>> > Where exactly are you printing this?
>> 
>> I added a lot of debug statements to the code trying to understand
>> it. I'll attach it to this mail.
>
> Thanks.  Can you check to see if that diff you sent was the correct
> file?  It appears to be the git stats and not the changes themselves.

No, it wasn't. I did git stash show with -p, which doesn't make sense.
I've attached the correct diff.

>> > If this is the case, then I think any task that has more than 14 VMAs
>> > would have issues.  I also use mas_next_entry() in mas_find() which is
>> > used for the mas_for_each() macro/iterator.  Can you please enable
>> > CONFIG_DEBUG_VM_MAPLE_TREE ?  mmap.c tests the tree after pretty much
>> > any change and will dump useful information if there is an issue -
>> > including the entire tree. See validate_mm_mt() for details.
>> >
>> > You can find CONFIG_DEBUG_VM_MAPLE_TREE in the config:
>> > kernel hacking -> Memory debugging -> Debug VM -> Debug VM maple trees
>> 
>> I have both DEBUG_MAPPLE_TREE and DEBUG_VM_MAPLE_TREE enabled, but don't
>> see anything printed.
>
> I suspect that this means the issue is actually in the mmap code and not
> the tree.  It may be sensitive to the task-specific layout.  Do you have
> randomization on as well?  I'm thinking maybe I return NULL because I'm
> asking for the next element after the last VMA and not checking that
> correctly or similar.


> Does ./scripts/faddr2line work for you?  What does it say about
> mmap_region+0x19e/0x848 ?

2629        next = mas_next(&mas, ULONG_MAX);
2630        prev = mas_prev(&mas, 0);
2631        if (vm_flags & VM_SPECIAL)
2632               goto cannot_expand;
2633
2634        /* Attempt to expand an old mapping */
2635        /* Check next */
2636        if (next && next->vm_start == end && !vma_policy(next) &&

next above is 0x0e00000000000000 and next->vm_start will trigger the crash.

2637            can_vma_merge_before(next, vm_flags, NULL, file, pgoff+pglen,
2638                              NULL_VM_UFFD_CTX, NULL)) {
2639               merge_end = next->vm_end;
2640               vma = next;
2641               vm_pgoff = next->vm_pgoff - pglen;
2642        }

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: mapple-debug.diff --]
[-- Type: text/x-diff, Size: 6469 bytes --]

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 967631055210..a621d17e872d 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -140,6 +140,7 @@ struct maple_subtree_state {
 	struct maple_big_node *bn;
 };
 
+extern int mas_debug;
 /* Functions */
 static inline struct maple_node *mt_alloc_one(gfp_t gfp)
 {
@@ -4550,6 +4551,9 @@ static inline void *mas_next_nentry(struct ma_state *mas,
 	while (mas->offset < count) {
 		pivot = pivots[mas->offset];
 		entry = mas_slot(mas, slots, mas->offset);
+		if (mas_debug)
+			pr_err("%s: entry = %px, slots=%px, mas->offset=%d\n",
+			       __func__, entry, slots, mas->offset);
 		if (ma_dead_node(node))
 			return NULL;
 
@@ -4570,6 +4574,10 @@ static inline void *mas_next_nentry(struct ma_state *mas,
 
 	pivot = mas_safe_pivot(mas, pivots, mas->offset, type);
 	entry = mas_slot(mas, slots, mas->offset);
+	if (mas_debug)
+		pr_err("%s: entry = %px, slots=%px, mas->offset=%d count=%d\n",
+		       __func__, entry, slots, mas->offset, count);
+
 	if (ma_dead_node(node))
 		return NULL;
 
@@ -4580,6 +4588,8 @@ static inline void *mas_next_nentry(struct ma_state *mas,
 		return NULL;
 
 found:
+	if (mas_debug)
+		pr_err("found pivot = %lx, entry = %px\n", pivot, entry);
 	mas->last = pivot;
 	return entry;
 }
@@ -4618,6 +4628,9 @@ static inline void *mas_next_entry(struct ma_state *mas, unsigned long limit)
 	unsigned long last;
 	enum maple_type mt;
 
+	if (mas_debug)
+		pr_err("%s: entry\n", __func__);
+
 	last = mas->last;
 retry:
 	offset = mas->offset;
@@ -4625,10 +4638,17 @@ static inline void *mas_next_entry(struct ma_state *mas, unsigned long limit)
 	node = mas_mn(mas);
 	mt = mte_node_type(mas->node);
 	mas->offset++;
-	if (unlikely(mas->offset >= mt_slots[mt]))
+	if (mas_debug)
+		pr_err("%s: offset=%d\n", __func__, offset);
+	if (unlikely(mas->offset >= mt_slots[mt])) {
+		if (mas_debug)
+			pr_err("%s: next node\n", __func__);
 		goto next_node;
+	}
 
 	while (!mas_is_none(mas)) {
+		if (mas_debug)
+			pr_err("%s: !none\n", __func__);
 		entry = mas_next_nentry(mas, node, limit, mt);
 		if (unlikely(ma_dead_node(node))) {
 			mas_rewalk(mas, last);
@@ -4656,6 +4676,8 @@ static inline void *mas_next_entry(struct ma_state *mas, unsigned long limit)
 	mas->index = mas->last = limit;
 	mas->offset = offset;
 	mas->node = prev_node;
+	if (mas_debug)
+		pr_err("%s: return NULL, mas->node = %px\n", __func__, prev_node);
 	return NULL;
 }
 
@@ -4914,6 +4936,8 @@ static inline bool mas_anode_descend(struct ma_state *mas, unsigned long size)
 void *mas_walk(struct ma_state *mas)
 {
 	void *entry;
+	if (mas_debug)
+		pr_err("%s\n", __func__);
 
 retry:
 	entry = mas_state_walk(mas);
@@ -5838,7 +5862,12 @@ EXPORT_SYMBOL_GPL(mas_pause);
  */
 void *mas_find(struct ma_state *mas, unsigned long max)
 {
+	if (mas_debug)
+		pr_err("%s: max=%lx\n", __func__, max);
+
 	if (unlikely(mas_is_paused(mas))) {
+		if (mas_debug)
+			pr_err("%s: paused\n", __func__);
 		if (unlikely(mas->last == ULONG_MAX)) {
 			mas->node = MAS_NONE;
 			return NULL;
@@ -5848,6 +5877,8 @@ void *mas_find(struct ma_state *mas, unsigned long max)
 	}
 
 	if (unlikely(mas_is_start(mas))) {
+		if (mas_debug)
+			pr_err("%s: start\n", __func__);
 		/* First run or continue */
 		void *entry;
 
@@ -5859,8 +5890,10 @@ void *mas_find(struct ma_state *mas, unsigned long max)
 			return entry;
 	}
 
-	if (unlikely(!mas_searchable(mas)))
+	if (unlikely(!mas_searchable(mas))) {
+		pr_err("%s: not searchable\n", __func__);
 		return NULL;
+	}
 
 	/* Retries on dead nodes handled by mas_next_entry */
 	return mas_next_entry(mas, max);
diff --git a/mm/mmap.c b/mm/mmap.c
index 2049500931ae..16ee834c0a4c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -57,6 +57,7 @@
 
 #include "internal.h"
 
+int mas_debug;
 #ifndef arch_mmap_check
 #define arch_mmap_check(addr, len, flags)	(0)
 #endif
@@ -2375,6 +2376,10 @@ do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
 	int count = 0;
 	int error = -ENOMEM;
 	MA_STATE(mas_detach, &mt_detach, start, end - 1);
+
+	if (mas_debug)
+		pr_err("%s:%d\n", __func__, __LINE__);
+
 	mt_init_flags(&mt_detach, MM_MT_FLAGS);
 	mt_set_external_lock(&mt_detach, &mm->mmap_lock);
 
@@ -2541,28 +2546,37 @@ do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
  *
  * Returns: -EINVAL on failure, 1 on success and unlock, 0 otherwise.
  */
+
+
+
 int do_mas_munmap(struct ma_state *mas, struct mm_struct *mm,
 		  unsigned long start, size_t len, struct list_head *uf,
 		  bool downgrade)
 {
 	unsigned long end;
 	struct vm_area_struct *vma;
+	if (mas_debug)
+		pr_err("%s: %lx %lx\n", __func__, start, len);
 
 	if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
 		return -EINVAL;
-
+	if (mas_debug)
+		pr_err("%s:%d\n", __func__, __LINE__);
 	end = start + PAGE_ALIGN(len);
 	if (end == start)
 		return -EINVAL;
-
+	if (mas_debug)
+		pr_err("%s:%d\n", __func__, __LINE__);
 	 /* arch_unmap() might do unmaps itself.  */
 	arch_unmap(mm, start, end);
-
+	if (mas_debug)
+		pr_err("%s:%d\n", __func__, __LINE__);
 	/* Find the first overlapping VMA */
 	vma = mas_find(mas, end - 1);
 	if (!vma)
 		return 0;
-
+	if (mas_debug)
+		pr_err("vma=%px\n", vma);
 	return do_mas_align_munmap(mas, vma, mm, start, end, uf, downgrade);
 }
 
@@ -2594,6 +2608,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	pgoff_t vm_pgoff;
 	int error;
 	MA_STATE(mas, &mm->mm_mt, addr, end - 1);
+	mas_debug = (addr == (1UL << 54));
 
 	/* Check against address space limit. */
 	if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT)) {
@@ -2609,23 +2624,30 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 					(len >> PAGE_SHIFT) - nr_pages))
 			return -ENOMEM;
 	}
-
 	/* Unmap any existing mapping in the area */
-	if (do_mas_munmap(&mas, mm, addr, len, uf, false))
+
+	if (do_mas_munmap(&mas, mm, addr, len, uf, false)) {
+		mas_debug = 0;
 		return -ENOMEM;
+	}
 
 	/*
 	 * Private writable mapping: check memory availability
 	 */
 	if (accountable_mapping(file, vm_flags)) {
 		charged = len >> PAGE_SHIFT;
-		if (security_vm_enough_memory_mm(mm, charged))
+		if (security_vm_enough_memory_mm(mm, charged)) {
+			mas_debug = 0;
 			return -ENOMEM;
+		}
 		vm_flags |= VM_ACCOUNT;
 	}
 
 	next = mas_next(&mas, ULONG_MAX);
 	prev = mas_prev(&mas, 0);
+	if (mas_debug)
+		pr_err("%s: next %px\n", __func__, next);
+	mas_debug = 0;
 	if (vm_flags & VM_SPECIAL)
 		goto cannot_expand;
 

  reply	other threads:[~2022-05-16 17:10 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-28 17:20 [PATCH] mapletree-vs-khugepaged Guenter Roeck
2022-04-28 19:27 ` Liam Howlett
2022-04-29 12:09 ` Heiko Carstens
2022-04-29 13:01   ` Liam Howlett
2022-04-29 13:10     ` Heiko Carstens
2022-04-29 16:18       ` Liam Howlett
2022-05-02  9:10         ` Geert Uytterhoeven
2022-05-13 14:46   ` Sven Schnelle
2022-05-13 14:51     ` Sven Schnelle
2022-05-13 16:49     ` Andrew Morton
2022-05-13 17:00     ` Liam Howlett
2022-05-15 20:02       ` Sven Schnelle
2022-05-16 14:02         ` Liam Howlett
2022-05-16 15:37           ` Sven Schnelle
2022-05-16 15:50             ` Liam Howlett
2022-05-16 17:10               ` Sven Schnelle [this message]
2022-05-17 14:52                 ` Liam Howlett
2022-05-17 11:53       ` Heiko Carstens
2022-05-17 12:26         ` Heiko Carstens
2022-05-17 13:23         ` Guenter Roeck
2022-05-17 15:03           ` Liam Howlett
2022-05-17 16:28             ` Guenter Roeck
2022-05-17 20:38               ` Liam Howlett
2022-05-17 14:32         ` Guenter Roeck
2022-05-19 14:35           ` Liam Howlett
2022-05-19 21:41             ` Guenter Roeck
2022-05-19 22:38               ` Liam Howlett
2022-05-30 17:38               ` Liam Howlett
2022-05-31 18:56                 ` Liam Howlett
2022-06-01 19:06                   ` Liam Howlett
2022-05-13 17:28     ` Guenter Roeck
2022-05-13 20:12     ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yt9dwnel4d4x.fsf@linux.ibm.com \
    --to=svens@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=hca@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@roeck-us.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).