linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split
@ 2021-08-30 21:50 Mike Kravetz
  2021-08-31 14:01 ` Guillaume Morin
  0 siblings, 1 reply; 2+ messages in thread
From: Mike Kravetz @ 2021-08-30 21:50 UTC (permalink / raw)
  To: linux-mm, linux-kernel, cgroups
  Cc: Mina Almasry, David Rientjes, Greg Thelen, Sandipan Das,
	Shakeel Butt, Shuah Khan, Andrew Morton, Mike Kravetz,
	Guillaume Morin, stable

Guillaume Morin reported hitting the following WARNING followed
by GPF or NULL pointer deference either in cgroups_destroy or in
the kill_css path.:

percpu ref (css_release) <= 0 (-1) after switching to atomic
WARNING: CPU: 23 PID: 130 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x127/0x130
CPU: 23 PID: 130 Comm: ksoftirqd/23 Kdump: loaded Tainted: G           O      5.10.60 #1
RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x127/0x130
Call Trace:
 rcu_core+0x30f/0x530
 rcu_core_si+0xe/0x10
 __do_softirq+0x103/0x2a2
 ? sort_range+0x30/0x30
 run_ksoftirqd+0x2b/0x40
 smpboot_thread_fn+0x11a/0x170
 kthread+0x10a/0x140
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x22/0x30

Upon further examination, it was discovered that the css structure
was associated with hugetlb reservations.

For private hugetlb mappings the vma points to a reserve map that
contains a pointer to the css.  At mmap time, reservations are set up
and a reference to the css is taken.  This reference is dropped in the
vma close operation; hugetlb_vm_op_close.  However, if a vma is split
no additional reference to the css is taken yet hugetlb_vm_op_close will
be called twice for the split vma resulting in an underflow.

Fix by taking another reference in hugetlb_vm_op_open.  Note that the
reference is only taken for the owner of the reserve map.  In the more
common fork case, the pointer to the reserve map is cleared for
non-owning vmas.

Fixes: e9fe92ae0cd2 ("hugetlb_cgroup: add reservation accounting for
private mappings")
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Suggested-by: Guillaume Morin <guillaume@morinfr.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
---
 include/linux/hugetlb_cgroup.h | 12 ++++++++++++
 mm/hugetlb.c                   |  4 +++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 0b8d1fdda3a1..c137396129db 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -121,6 +121,13 @@ static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg)
 	css_put(&h_cg->css);
 }
 
+static inline void resv_map_dup_hugetlb_cgroup_uncharge_info(
+						struct resv_map *resv_map)
+{
+	if (resv_map->css)
+		css_get(resv_map->css);
+}
+
 extern int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
 					struct hugetlb_cgroup **ptr);
 extern int hugetlb_cgroup_charge_cgroup_rsvd(int idx, unsigned long nr_pages,
@@ -199,6 +206,11 @@ static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg)
 {
 }
 
+static inline void resv_map_dup_hugetlb_cgroup_uncharge_info(
+						struct resv_map *resv_map)
+{
+}
+
 static inline int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
 					       struct hugetlb_cgroup **ptr)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8ea35ba6699f..6c583ef079e3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4033,8 +4033,10 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 	 * after this open call completes.  It is therefore safe to take a
 	 * new reference here without additional locking.
 	 */
-	if (resv && is_vma_resv_set(vma, HPAGE_RESV_OWNER))
+	if (resv && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+		resv_map_dup_hugetlb_cgroup_uncharge_info(resv);
 		kref_get(&resv->refs);
+	}
 }
 
 static void hugetlb_vm_op_close(struct vm_area_struct *vma)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split
  2021-08-30 21:50 [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split Mike Kravetz
@ 2021-08-31 14:01 ` Guillaume Morin
  0 siblings, 0 replies; 2+ messages in thread
From: Guillaume Morin @ 2021-08-31 14:01 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, cgroups, Mina Almasry, David Rientjes,
	Greg Thelen, Sandipan Das, Shakeel Butt, Shuah Khan,
	Andrew Morton, Guillaume Morin, stable

On 30 Aug 14:50, Mike Kravetz wrote:
> Guillaume Morin reported hitting the following WARNING followed
> by GPF or NULL pointer deference either in cgroups_destroy or in
> the kill_css path.:
> 
> percpu ref (css_release) <= 0 (-1) after switching to atomic
> WARNING: CPU: 23 PID: 130 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x127/0x130
> CPU: 23 PID: 130 Comm: ksoftirqd/23 Kdump: loaded Tainted: G           O      5.10.60 #1
> RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x127/0x130
> Call Trace:
>  rcu_core+0x30f/0x530
>  rcu_core_si+0xe/0x10
>  __do_softirq+0x103/0x2a2
>  ? sort_range+0x30/0x30
>  run_ksoftirqd+0x2b/0x40
>  smpboot_thread_fn+0x11a/0x170
>  kthread+0x10a/0x140
>  ? kthread_create_worker_on_cpu+0x70/0x70
>  ret_from_fork+0x22/0x30
> 
> Upon further examination, it was discovered that the css structure
> was associated with hugetlb reservations.
> 
> For private hugetlb mappings the vma points to a reserve map that
> contains a pointer to the css.  At mmap time, reservations are set up
> and a reference to the css is taken.  This reference is dropped in the
> vma close operation; hugetlb_vm_op_close.  However, if a vma is split
> no additional reference to the css is taken yet hugetlb_vm_op_close will
> be called twice for the split vma resulting in an underflow.
> 
> Fix by taking another reference in hugetlb_vm_op_open.  Note that the
> reference is only taken for the owner of the reserve map.  In the more
> common fork case, the pointer to the reserve map is cleared for
> non-owning vmas.
> 
> Fixes: e9fe92ae0cd2 ("hugetlb_cgroup: add reservation accounting for
> private mappings")
> Reported-by: Guillaume Morin <guillaume@morinfr.org>
> Suggested-by: Guillaume Morin <guillaume@morinfr.org>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: <stable@vger.kernel.org>

I verified that the patch does fix the underflow. I appreciate the help!

Feel free to add:
Tested-by: Guillaume Morin <guillaume@morinfr.org>

-- 
Guillaume Morin <guillaume@morinfr.org>

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-08-31 14:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-30 21:50 [PATCH] hugetlb: fix hugetlb cgroup refcounting during vma split Mike Kravetz
2021-08-31 14:01 ` Guillaume Morin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).