linux-mm.kvack.org archive mirror
* incoming
@ 2020-03-29  2:14 Andrew Morton
  2020-03-29  2:17 ` [patch 1/5] mm/swapfile.c: move inode_lock out of claim_swapfile Andrew Morton
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Andrew Morton @ 2020-03-29  2:14 UTC
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

5 fixes, based on 83fd69c93340177dcd66fd26ce6441fb581c1dbf:


    Naohiro Aota <naohiro.aota@wdc.com>:
      mm/swapfile.c: move inode_lock out of claim_swapfile

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory.c: indicate all memory blocks as removable

    Mina Almasry <almasrymina@google.com>:
      hugetlb_cgroup: fix illegal access to memory

    Roman Gushchin <guro@fb.com>:
      mm: fork: fix kernel_stack memcg stats for various stack implementations

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/sparse: fix kernel crash with pfn_section_valid check

 drivers/base/memory.c      |   23 +++--------------------
 include/linux/memcontrol.h |   12 ++++++++++++
 kernel/fork.c              |    4 ++--
 mm/hugetlb_cgroup.c        |    3 +--
 mm/memcontrol.c            |   38 ++++++++++++++++++++++++++++++++++++++
 mm/sparse.c                |    6 ++++++
 mm/swapfile.c              |   41 ++++++++++++++++++++---------------------
 7 files changed, 82 insertions(+), 45 deletions(-)




* [patch 1/5] mm/swapfile.c: move inode_lock out of claim_swapfile
  2020-03-29  2:14 incoming Andrew Morton
@ 2020-03-29  2:17 ` Andrew Morton
  2020-03-29  2:17 ` [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable Andrew Morton
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2020-03-29  2:17 UTC
  To: akpm, darrick.wong, hch, linux-mm, mm-commits, naohiro.aota,
	qais.yousef, stable, torvalds

From: Naohiro Aota <naohiro.aota@wdc.com>
Subject: mm/swapfile.c: move inode_lock out of claim_swapfile

claim_swapfile() currently keeps the inode locked when it succeeds, or
when the file is already a swapfile (returning -EBUSY).  In the other
error cases, it does not lock the inode.

This inconsistency between the lock state and the return value is quite
confusing and actually causes a bad unlock balance, as shown below, in the
"bad_swap" section of __do_sys_swapon().

This commit fixes the issue by moving the inode_lock() call and the
IS_SWAPFILE check out of claim_swapfile().  The inode is unlocked in the
"bad_swap_unlock_inode" section, so it is guaranteed to be unlocked by the
time "bad_swap" is reached.  Error-handling code after the locking
therefore now jumps to "bad_swap_unlock_inode" instead of "bad_swap".
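
The label ordering described above can be modelled in userspace.  The
following is only a minimal sketch, not the kernel code: every fake_* name
is hypothetical, the "lock" is a plain depth counter, and the success path
is simplified to unlock directly.  It shows that errors raised before the
lock is taken jump to "bad_swap", while errors raised afterwards go through
"bad_swap_unlock_inode" first.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode: a lock depth counter and a flag. */
struct fake_inode {
	int lock_depth;		/* 1 while "locked", 0 otherwise */
	bool is_swapfile;
};

static void fake_inode_lock(struct fake_inode *inode)
{
	assert(inode->lock_depth == 0);
	inode->lock_depth++;
}

static void fake_inode_unlock(struct fake_inode *inode)
{
	/* Unlocking without a matching lock is the imbalance being fixed. */
	assert(inode->lock_depth == 1);
	inode->lock_depth--;
}

/* Models the patched claim step: it never touches the lock. */
static int fake_claim(struct fake_inode *inode, bool claim_fails)
{
	(void)inode;
	return claim_fails ? -EINVAL : 0;
}

static int fake_swapon(struct fake_inode *inode, bool claim_fails,
		       bool header_bad)
{
	int error;

	error = fake_claim(inode, claim_fails);
	if (error)
		goto bad_swap;			/* lock not taken yet */

	fake_inode_lock(inode);
	if (inode->is_swapfile) {
		error = -EBUSY;
		goto bad_swap_unlock_inode;
	}
	if (header_bad) {
		error = -EINVAL;
		goto bad_swap_unlock_inode;
	}

	inode->is_swapfile = true;
	fake_inode_unlock(inode);
	return 0;

bad_swap_unlock_inode:
	fake_inode_unlock(inode);
bad_swap:
	/* Common cleanup; it must not assume that the lock is held. */
	return error;
}
```

Because the unlock label falls through into the common label, every exit
leaves the lock depth at zero, whichever step failed.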

    =====================================
    WARNING: bad unlock balance detected!
    5.5.0-rc7+ #176 Not tainted
    -------------------------------------
    swapon/4294 is trying to release lock (&sb->s_type->i_mutex_key) at:
    [<ffffffff8173a6eb>] __do_sys_swapon+0x94b/0x3550
    but there are no more locks to release!

    other info that might help us debug this:
    no locks held by swapon/4294.

    stack backtrace:
    CPU: 5 PID: 4294 Comm: swapon Not tainted 5.5.0-rc7-BTRFS-ZNS+ #176
    Hardware name: ASUS All Series/H87-PRO, BIOS 2102 07/29/2014
    Call Trace:
     dump_stack+0xa1/0xea
     ? __do_sys_swapon+0x94b/0x3550
     print_unlock_imbalance_bug.cold+0x114/0x123
     ? __do_sys_swapon+0x94b/0x3550
     lock_release+0x562/0xed0
     ? kvfree+0x31/0x40
     ? lock_downgrade+0x770/0x770
     ? kvfree+0x31/0x40
     ? rcu_read_lock_sched_held+0xa1/0xd0
     ? rcu_read_lock_bh_held+0xb0/0xb0
     up_write+0x2d/0x490
     ? kfree+0x293/0x2f0
     __do_sys_swapon+0x94b/0x3550
     ? putname+0xb0/0xf0
     ? kmem_cache_free+0x2e7/0x370
     ? do_sys_open+0x184/0x3e0
     ? generic_max_swapfile_size+0x40/0x40
     ? do_syscall_64+0x27/0x4b0
     ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
     ? lockdep_hardirqs_on+0x38c/0x590
     __x64_sys_swapon+0x54/0x80
     do_syscall_64+0xa4/0x4b0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7f15da0a0dc7

Link: http://lkml.kernel.org/r/20200206090132.154869-1-naohiro.aota@wdc.com
Fixes: 1638045c3677 ("mm: set S_SWAPFILE on blockdev swap devices")
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Qais Yousef <qais.yousef@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swapfile.c |   41 ++++++++++++++++++++---------------------
 1 file changed, 20 insertions(+), 21 deletions(-)

--- a/mm/swapfile.c~mm-swap-move-inode_lock-out-of-claim_swapfile
+++ a/mm/swapfile.c
@@ -2899,10 +2899,6 @@ static int claim_swapfile(struct swap_in
 		p->bdev = inode->i_sb->s_bdev;
 	}
 
-	inode_lock(inode);
-	if (IS_SWAPFILE(inode))
-		return -EBUSY;
-
 	return 0;
 }
 
@@ -3157,36 +3153,41 @@ SYSCALL_DEFINE2(swapon, const char __use
 	mapping = swap_file->f_mapping;
 	inode = mapping->host;
 
-	/* will take i_rwsem; */
 	error = claim_swapfile(p, inode);
 	if (unlikely(error))
 		goto bad_swap;
 
+	inode_lock(inode);
+	if (IS_SWAPFILE(inode)) {
+		error = -EBUSY;
+		goto bad_swap_unlock_inode;
+	}
+
 	/*
 	 * Read the swap header.
 	 */
 	if (!mapping->a_ops->readpage) {
 		error = -EINVAL;
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 	}
 	page = read_mapping_page(mapping, 0, swap_file);
 	if (IS_ERR(page)) {
 		error = PTR_ERR(page);
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 	}
 	swap_header = kmap(page);
 
 	maxpages = read_swap_header(p, swap_header, inode);
 	if (unlikely(!maxpages)) {
 		error = -EINVAL;
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 	}
 
 	/* OK, set up the swap map and apply the bad block list */
 	swap_map = vzalloc(maxpages);
 	if (!swap_map) {
 		error = -ENOMEM;
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 	}
 
 	if (bdi_cap_stable_pages_required(inode_to_bdi(inode)))
@@ -3211,7 +3212,7 @@ SYSCALL_DEFINE2(swapon, const char __use
 					GFP_KERNEL);
 		if (!cluster_info) {
 			error = -ENOMEM;
-			goto bad_swap;
+			goto bad_swap_unlock_inode;
 		}
 
 		for (ci = 0; ci < nr_cluster; ci++)
@@ -3220,7 +3221,7 @@ SYSCALL_DEFINE2(swapon, const char __use
 		p->percpu_cluster = alloc_percpu(struct percpu_cluster);
 		if (!p->percpu_cluster) {
 			error = -ENOMEM;
-			goto bad_swap;
+			goto bad_swap_unlock_inode;
 		}
 		for_each_possible_cpu(cpu) {
 			struct percpu_cluster *cluster;
@@ -3234,13 +3235,13 @@ SYSCALL_DEFINE2(swapon, const char __use
 
 	error = swap_cgroup_swapon(p->type, maxpages);
 	if (error)
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 
 	nr_extents = setup_swap_map_and_extents(p, swap_header, swap_map,
 		cluster_info, maxpages, &span);
 	if (unlikely(nr_extents < 0)) {
 		error = nr_extents;
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 	}
 	/* frontswap enabled? set up bit-per-page map for frontswap */
 	if (IS_ENABLED(CONFIG_FRONTSWAP))
@@ -3280,7 +3281,7 @@ SYSCALL_DEFINE2(swapon, const char __use
 
 	error = init_swap_address_space(p->type, maxpages);
 	if (error)
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 
 	/*
 	 * Flush any pending IO and dirty mappings before we start using this
@@ -3290,7 +3291,7 @@ SYSCALL_DEFINE2(swapon, const char __use
 	error = inode_drain_writes(inode);
 	if (error) {
 		inode->i_flags &= ~S_SWAPFILE;
-		goto bad_swap;
+		goto bad_swap_unlock_inode;
 	}
 
 	mutex_lock(&swapon_mutex);
@@ -3315,6 +3316,8 @@ SYSCALL_DEFINE2(swapon, const char __use
 
 	error = 0;
 	goto out;
+bad_swap_unlock_inode:
+	inode_unlock(inode);
 bad_swap:
 	free_percpu(p->percpu_cluster);
 	p->percpu_cluster = NULL;
@@ -3322,6 +3325,7 @@ bad_swap:
 		set_blocksize(p->bdev, p->old_block_size);
 		blkdev_put(p->bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
 	}
+	inode = NULL;
 	destroy_swap_extents(p);
 	swap_cgroup_swapoff(p->type);
 	spin_lock(&swap_lock);
@@ -3333,13 +3337,8 @@ bad_swap:
 	kvfree(frontswap_map);
 	if (inced_nr_rotate_swap)
 		atomic_dec(&nr_rotate_swap);
-	if (swap_file) {
-		if (inode) {
-			inode_unlock(inode);
-			inode = NULL;
-		}
+	if (swap_file)
 		filp_close(swap_file, NULL);
-	}
 out:
 	if (page && !IS_ERR(page)) {
 		kunmap(page);
_



* [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29  2:14 incoming Andrew Morton
  2020-03-29  2:17 ` [patch 1/5] mm/swapfile.c: move inode_lock out of claim_swapfile Andrew Morton
@ 2020-03-29  2:17 ` Andrew Morton
  2020-03-29 16:46   ` Linus Torvalds
  2020-03-29  2:17 ` [patch 3/5] hugetlb_cgroup: fix illegal access to memory Andrew Morton
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2020-03-29  2:17 UTC
  To: akpm, dan.j.williams, david, gregkh, heiko.carstens, kzak,
	linux-mm, mhocko, mhocko, mm-commits, ndfont, pbadari, rafael,
	rcj, stable, steve.scargall, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: drivers/base/memory.c: indicate all memory blocks as removable

We see multiple issues with the implementation/interface to compute
whether a memory block can be offlined (exposed via
/sys/devices/system/memory/memoryX/removable) and would like to simplify
it (remove the implementation).

1. It runs basically lockless. While this might be good for performance,
   we see possible races with memory offlining that will require at least
   some sort of locking to fix.

2. Nowadays, more false positives are possible. No arch-specific checks
   are performed to validate that memory offlining will not be denied
   right away (and such a check would require locking). For example, arm64
   won't allow offlining any memory block that was added during boot,
   which implies a very high error rate. Other archs have other
   constraints.

3. The interface is inherently racy. E.g., if a memory block is
   detected to be removable (and was not a false positive at that time),
   there is still no guarantee that offlining will actually succeed. So
   any caller already has to deal with false positives.

4. It is unclear what performance benefit this interface actually
   provides. The introducing commit 5c755e9fd813 ("memory-hotplug: add
   sysfs removable attribute for hotplug memory remove") mentioned
	"A user-level agent must be able to identify which sections of
	 memory are likely to be removable before attempting the
	 potentially expensive operation."
   However, no actual performance comparison was included.

Known users:
- lsmem: Will group memory blocks based on the "removable" property. [1]
- chmem: Indirect user. It has a RANGE mode where one can specify
	 removable ranges identified via lsmem to be offlined. However, it
	 also has a "SIZE" mode, which allows a sysadmin to skip the manual
	 "identify removable blocks" step. [2]
- powerpc-utils: Uses the "removable" attribute to skip some memory
		 blocks right away when trying to find some to
		 offline+remove. However, with ballooning enabled, it
		 already skips this information completely (because it
		 once resulted in many false negatives). Therefore, the
		 implementation can deal with false positives properly
		 already. [3]

According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer
driven from userspace via the drmgr command (powerpc-utils). Nowadays
it's managed in the kernel - including onlining/offlining of memory
blocks - triggered by drmgr writing to /sys/kernel/dlpar. So the
affected legacy userspace handling is only active on old kernels. Only ve=
ry
old versions of drmgr on a new kernel (unlikely) might execute slower -
totally acceptable.

With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
break any user space tool. We implement a very bad heuristic now.  Withou=
t
CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
"not removable" as before.

Original discussion can be found in [4] ("[PATCH RFC v1] mm:
is_mem_section_removable() overhaul").

Other users of is_mem_section_removable() will be removed next, so that
we can remove is_mem_section_removable() completely.

[1] http://man7.org/linux/man-pages/man1/lsmem.1.html
[2] http://man7.org/linux/man-pages/man8/chmem.8.html
[3] https://github.com/ibm-power-utilities/powerpc-utils
[4] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com

Also, this patch probably fixes a crash reported by Steve. 
http://lkml.kernel.org/r/CAPcyv4jpdaNvJ67SkjyUJLBnBnXXQv686BiVW042g03FUmWLXw@mail.gmail.com

Link: http://lkml.kernel.org/r/20200128093542.6908-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Nathan Fontenot <ndfont@gmail.com>
Reported-by: "Scargall, Steve" <steve.scargall@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Karel Zak <kzak@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/base/memory.c |   23 +++--------------------
 1 file changed, 3 insertions(+), 20 deletions(-)

--- a/drivers/base/memory.c~drivers-base-memoryc-indicate-all-memory-blocks-as-removable
+++ a/drivers/base/memory.c
@@ -97,30 +97,13 @@ static ssize_t phys_index_show(struct de
 }
 
 /*
- * Show whether the memory block is likely to be offlineable (or is already
- * offline). Once offline, the memory block could be removed. The return
- * value does, however, not indicate that there is a way to remove the
- * memory block.
+ * Legacy interface that we cannot remove. Always indicate "removable"
+ * with CONFIG_MEMORY_HOTREMOVE - bad heuristic.
  */
 static ssize_t removable_show(struct device *dev, struct device_attribute *attr,
 			      char *buf)
 {
-	struct memory_block *mem = to_memory_block(dev);
-	unsigned long pfn;
-	int ret = 1, i;
-
-	if (mem->state != MEM_ONLINE)
-		goto out;
-
-	for (i = 0; i < sections_per_block; i++) {
-		if (!present_section_nr(mem->start_section_nr + i))
-			continue;
-		pfn = section_nr_to_pfn(mem->start_section_nr + i);
-		ret &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
-	}
-
-out:
-	return sprintf(buf, "%d\n", ret);
+	return sprintf(buf, "%d\n", (int)IS_ENABLED(CONFIG_MEMORY_HOTREMOVE));
 }
 
 /*
_



* [patch 3/5] hugetlb_cgroup: fix illegal access to memory
  2020-03-29  2:14 incoming Andrew Morton
  2020-03-29  2:17 ` [patch 1/5] mm/swapfile.c: move inode_lock out of claim_swapfile Andrew Morton
  2020-03-29  2:17 ` [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable Andrew Morton
@ 2020-03-29  2:17 ` Andrew Morton
  2020-03-29  2:17 ` [patch 4/5] mm: fork: fix kernel_stack memcg stats for various stack implementations Andrew Morton
  2020-03-29  2:17 ` [patch 5/5] mm/sparse: fix kernel crash with pfn_section_valid check Andrew Morton
  4 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2020-03-29  2:17 UTC
  To: akpm, almasrymina, gscrivan, linux-mm, mike.kravetz, mm-commits,
	rientjes, tj, torvalds

From: Mina Almasry <almasrymina@google.com>
Subject: hugetlb_cgroup: fix illegal access to memory

This appears to be a mistake in commit faced7e0806cf ("mm: hugetlb
controller for cgroups v2").  Essentially that commit does a
hugetlb_cgroup_from_counter assuming that page_counter_try_charge has
initialized counter, but if page_counter_try_charge has failed then it
seems it does not initialize counter, so
hugetlb_cgroup_from_counter(counter) ends up pointing to random memory,
causing kasan to complain.

Solution, simply use h_cg, instead of
hugetlb_cgroup_from_counter(counter), since that is a reference to the
hugetlb_cgroup anyway.  After this change kasan ceases to complain.
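
The idea of the fix can be sketched in userspace as follows.  This is an
illustration under stated assumptions, not the kernel implementation: all
fake_* names are hypothetical, and the charge helper deliberately never
writes its out-parameter, which models the commit message's suspicion
rather than establishing how page_counter_try_charge() really behaves.
The caller records the event against the group handle it already holds
instead of deriving a group from the out-parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct fake_counter {
	long usage;
	long limit;
};

struct fake_group {
	struct fake_counter counter;
	int max_events;
};

/*
 * Hypothetical charge helper.  In this sketch *fail is intentionally left
 * untouched on every path, so the caller must never read it.
 */
static bool fake_try_charge(struct fake_counter *c, long nr,
			    struct fake_counter **fail)
{
	(void)fail;
	assert(nr >= 0);
	if (c->usage + nr > c->limit)
		return false;
	c->usage += nr;
	return true;
}

static int fake_charge_group(struct fake_group *g, long nr)
{
	struct fake_counter *fail;	/* address is passed, value never read */

	if (!fake_try_charge(&g->counter, nr, &fail)) {
		/* Use the handle we already own, not anything from 'fail'. */
		g->max_events++;
		return -ENOMEM;
	}
	return 0;
}
```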

Link: http://lkml.kernel.org/r/20200313223920.124230-1-almasrymina@google.com
Fixes: faced7e0806cf ("mm: hugetlb controller for cgroups v2")
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reported-by: syzbot+cac0c4e204952cf449b1@syzkaller.appspotmail.com
Acked-by: Giuseppe Scrivano <gscrivan@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb_cgroup.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/hugetlb_cgroup.c~hugetlb_cgroup-fix-illegal-access-to-memory
+++ a/mm/hugetlb_cgroup.c
@@ -240,8 +240,7 @@ again:
 	if (!page_counter_try_charge(&h_cg->hugepage[idx], nr_pages,
 				     &counter)) {
 		ret = -ENOMEM;
-		hugetlb_event(hugetlb_cgroup_from_counter(counter, idx), idx,
-			      HUGETLB_MAX);
+		hugetlb_event(h_cg, idx, HUGETLB_MAX);
 	}
 	css_put(&h_cg->css);
 done:
_



* [patch 4/5] mm: fork: fix kernel_stack memcg stats for various stack implementations
  2020-03-29  2:14 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2020-03-29  2:17 ` [patch 3/5] hugetlb_cgroup: fix illegal access to memory Andrew Morton
@ 2020-03-29  2:17 ` Andrew Morton
  2020-03-29 16:41   ` Linus Torvalds
  2020-03-29  2:17 ` [patch 5/5] mm/sparse: fix kernel crash with pfn_section_valid check Andrew Morton
  4 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2020-03-29  2:17 UTC
  To: akpm, bharata, guro, hannes, linux-mm, mhocko, mm-commits,
	shakeelb, stable, torvalds

From: Roman Gushchin <guro@fb.com>
Subject: mm: fork: fix kernel_stack memcg stats for various stack implementations

Depending on CONFIG_VMAP_STACK and the THREAD_SIZE / PAGE_SIZE ratio the
space for task stacks can be allocated using __vmalloc_node_range(),
alloc_pages_node() and kmem_cache_alloc_node().  In the first and the
second cases page->mem_cgroup pointer is set, but in the third it's not:
memcg membership of a slab page should be determined using the
memcg_from_slab_page() function, which looks at
page->slab_cache->memcg_params.memcg.  In this case, using
mod_memcg_page_state() (as in account_kernel_stack()) is incorrect:
page->mem_cgroup pointer is NULL even for pages charged to a non-root
memory cgroup.

It can lead to kernel_stack per-memcg counters permanently showing 0 on
some architectures (depending on the configuration).

To fix this, introduce a mod_memcg_obj_state() helper, which takes a
pointer to a kernel object as its first argument, uses
mem_cgroup_from_obj() to get an RCU-protected memcg pointer and calls
mod_memcg_state().  This handles all possible configurations
(CONFIG_VMAP_STACK and various THREAD_SIZE/PAGE_SIZE values) without
spilling any memcg/kmem specifics into fork.c.
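
The lookup rule behind the helper can be modelled outside the kernel.  The
sketch below uses hypothetical fake_* types and leaves out RCU and the
virt_to_head_page() step entirely; it only illustrates that a slab-backed
object resolves its memcg through its cache, while any other page uses its
own mem_cgroup pointer, so the same accounting call works for both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_memcg {
	long kernel_stack_kb;
};

struct fake_cache {
	struct fake_memcg *memcg;
};

struct fake_page {
	bool is_slab;
	struct fake_memcg *mem_cgroup;	/* left NULL for slab pages */
	struct fake_cache *slab_cache;	/* only meaningful for slab pages */
};

/* Hypothetical analogue of the object-to-memcg lookup. */
static struct fake_memcg *fake_memcg_from_page(const struct fake_page *page)
{
	assert(page != NULL);
	if (page->is_slab)
		return page->slab_cache ? page->slab_cache->memcg : NULL;
	return page->mem_cgroup;
}

/* Hypothetical analogue of the stat helper: a no-op without a memcg. */
static void fake_mod_obj_state(const struct fake_page *page, long val)
{
	struct fake_memcg *memcg = fake_memcg_from_page(page);

	if (memcg)
		memcg->kernel_stack_kb += val;
}
```

Reading mem_cgroup directly on the slab page in this model would yield
NULL and the update would be dropped, which is the zero-counter symptom
the commit message describes.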

Note: This is a special version of the patch created for stable
backports. It contains code from the following two patches:
  - mm: memcg/slab: introduce mem_cgroup_from_obj()
  - mm: fork: fix kernel_stack memcg stats for various stack implementations

[guro@fb.com: introduce mem_cgroup_from_obj()]
  Link: http://lkml.kernel.org/r/20200324004221.GA36662@carbon.dhcp.thefacebook.com
Link: http://lkml.kernel.org/r/20200303233550.251375-1-guro@fb.com
Fixes: 4d96ba353075 ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |   12 +++++++++++
 kernel/fork.c              |    4 +--
 mm/memcontrol.c            |   38 +++++++++++++++++++++++++++++++++++
 3 files changed, 52 insertions(+), 2 deletions(-)

--- a/include/linux/memcontrol.h~mm-fork-fix-kernel_stack-memcg-stats-for-various-stack-implementations
+++ a/include/linux/memcontrol.h
@@ -695,6 +695,7 @@ static inline unsigned long lruvec_page_
 void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 			int val);
 void __mod_lruvec_slab_state(void *p, enum node_stat_item idx, int val);
+void mod_memcg_obj_state(void *p, int idx, int val);
 
 static inline void mod_lruvec_state(struct lruvec *lruvec,
 				    enum node_stat_item idx, int val)
@@ -1123,6 +1124,10 @@ static inline void __mod_lruvec_slab_sta
 	__mod_node_page_state(page_pgdat(page), idx, val);
 }
 
+static inline void mod_memcg_obj_state(void *p, int idx, int val)
+{
+}
+
 static inline
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 					    gfp_t gfp_mask,
@@ -1427,6 +1432,8 @@ static inline int memcg_cache_id(struct
 	return memcg ? memcg->kmemcg_id : -1;
 }
 
+struct mem_cgroup *mem_cgroup_from_obj(void *p);
+
 #else
 
 static inline int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
@@ -1468,6 +1475,11 @@ static inline void memcg_put_cache_ids(v
 {
 }
 
+static inline struct mem_cgroup *mem_cgroup_from_obj(void *p)
+{
+       return NULL;
+}
+
 #endif /* CONFIG_MEMCG_KMEM */
 
 #endif /* _LINUX_MEMCONTROL_H */
--- a/kernel/fork.c~mm-fork-fix-kernel_stack-memcg-stats-for-various-stack-implementations
+++ a/kernel/fork.c
@@ -397,8 +397,8 @@ static void account_kernel_stack(struct
 		mod_zone_page_state(page_zone(first_page), NR_KERNEL_STACK_KB,
 				    THREAD_SIZE / 1024 * account);
 
-		mod_memcg_page_state(first_page, MEMCG_KERNEL_STACK_KB,
-				     account * (THREAD_SIZE / 1024));
+		mod_memcg_obj_state(stack, MEMCG_KERNEL_STACK_KB,
+				    account * (THREAD_SIZE / 1024));
 	}
 }
 
--- a/mm/memcontrol.c~mm-fork-fix-kernel_stack-memcg-stats-for-various-stack-implementations
+++ a/mm/memcontrol.c
@@ -777,6 +777,17 @@ void __mod_lruvec_slab_state(void *p, en
 	rcu_read_unlock();
 }
 
+void mod_memcg_obj_state(void *p, int idx, int val)
+{
+	struct mem_cgroup *memcg;
+
+	rcu_read_lock();
+	memcg = mem_cgroup_from_obj(p);
+	if (memcg)
+		mod_memcg_state(memcg, idx, val);
+	rcu_read_unlock();
+}
+
 /**
  * __count_memcg_events - account VM events in a cgroup
  * @memcg: the memory cgroup
@@ -2661,6 +2672,33 @@ static void commit_charge(struct page *p
 }
 
 #ifdef CONFIG_MEMCG_KMEM
+/*
+ * Returns a pointer to the memory cgroup to which the kernel object is charged.
+ *
+ * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
+ * cgroup_mutex, etc.
+ */
+struct mem_cgroup *mem_cgroup_from_obj(void *p)
+{
+	struct page *page;
+
+	if (mem_cgroup_disabled())
+		return NULL;
+
+	page = virt_to_head_page(p);
+
+	/*
+	 * Slab pages don't have page->mem_cgroup set because corresponding
+	 * kmem caches can be reparented during the lifetime. That's why
+	 * memcg_from_slab_page() should be used instead.
+	 */
+	if (PageSlab(page))
+		return memcg_from_slab_page(page);
+
+	/* All other pages use page->mem_cgroup */
+	return page->mem_cgroup;
+}
+
 static int memcg_alloc_cache_id(void)
 {
 	int id, size;
_



* [patch 5/5] mm/sparse: fix kernel crash with pfn_section_valid check
  2020-03-29  2:14 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2020-03-29  2:17 ` [patch 4/5] mm: fork: fix kernel_stack memcg stats for various stack implementations Andrew Morton
@ 2020-03-29  2:17 ` Andrew Morton
  4 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2020-03-29  2:17 UTC
  To: akpm, aneesh.kumar, bhe, dan.j.williams, david, linux-mm, mhocko,
	mm-commits, mpe, osalvador, pankaj.gupta.linux, richard.weiyang,
	rppt, sachinp, stable, torvalds

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: mm/sparse: fix kernel crash with pfn_section_valid check

Fix the below crash

BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc000000000c3447c
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
...
NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0
LR [c000000000088354] vmemmap_free+0x144/0x320
Call Trace:
 section_deactivate+0x220/0x240
 __remove_pages+0x118/0x170
 arch_remove_memory+0x3c/0x150
 memunmap_pages+0x1cc/0x2f0
 devm_action_release+0x30/0x50
 release_nodes+0x2f8/0x3e0
 device_release_driver_internal+0x168/0x270
 unbind_store+0x130/0x170
 drv_attr_store+0x44/0x60
 sysfs_kf_write+0x68/0x80
 kernfs_fop_write+0x100/0x290
 __vfs_write+0x3c/0x70
 vfs_write+0xcc/0x240
 ksys_write+0x7c/0x140
 system_call+0x5c/0x68

The crash is due to a NULL dereference in pfn_section_valid() at

	test_bit(idx, ms->usage->subsection_map);

because ms->usage is NULL.

With commit d41e2f3bd546 ("mm/hotplug: fix hot remove failure in
SPARSEMEM|!VMEMMAP case") section_mem_map is set to NULL after
depopulate_section_memmap().  This was done so that pfn_to_page() can work
correctly with a kernel config that disables SPARSEMEM_VMEMMAP.  With that
config pfn_to_page() does

	__section_mem_map_addr(__sec) + __pfn;

where

static inline struct page *__section_mem_map_addr(struct mem_section *section)
{
	unsigned long map = section->section_mem_map;
	map &= SECTION_MAP_MASK;
	return (struct page *)map;
}

Now with SPARSEMEM_VMEMMAP enabled, mem_section->usage->subsection_map is
used to check the pfn validity (pfn_valid()).  Since section_deactivate()
releases mem_section->usage when a section is fully deactivated, a
pfn_valid() check after a subsection deactivation causes a kernel crash.

static inline int pfn_valid(unsigned long pfn)
{
...
	return early_section(ms) || pfn_section_valid(ms, pfn);
}

where

static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
{
	int idx = subsection_map_index(pfn);

	return test_bit(idx, ms->usage->subsection_map);
}

Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is
freed.  For architectures like ppc64 where large pages are used for the
vmemmap mapping (16MB), a specific vmemmap mapping can cover multiple
sections.  Hence before a vmemmap mapping page can be freed, the kernel
needs to make sure there are no valid sections within that mapping.
Clearing the section valid bit before depopulate_section_memmap() enables
this.
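
The ordering can be illustrated with a small userspace model.  It is only
a sketch: the fake_* names and the flag value are hypothetical, and the
subsection map is reduced to a single word.  It shows that once the "has
mem map" bit is cleared together with the usage pointer, the validity
check returns early and never dereferences the usage pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_SECTION_HAS_MEM_MAP 0x2UL	/* hypothetical flag bit */

struct fake_usage {
	unsigned long subsection_map;	/* one bit per subsection */
};

struct fake_section {
	unsigned long section_mem_map;
	struct fake_usage *usage;
};

static bool fake_valid_section(const struct fake_section *ms)
{
	return ms && (ms->section_mem_map & FAKE_SECTION_HAS_MEM_MAP);
}

/* idx must be smaller than the number of bits in unsigned long. */
static bool fake_pfn_valid(const struct fake_section *ms, unsigned int idx)
{
	if (!fake_valid_section(ms))
		return false;		/* never reaches ms->usage */
	assert(ms->usage != NULL);
	return (ms->usage->subsection_map >> idx) & 1UL;
}

static void fake_section_deactivate(struct fake_section *ms)
{
	ms->usage = NULL;	/* stands in for freeing the usage data */
	/* Mark the section invalid so later lookups bail out early. */
	ms->section_mem_map &= ~FAKE_SECTION_HAS_MEM_MAP;
}
```

Without the flag clear in fake_section_deactivate(), a later lookup would
pass the validity check and trip over the NULL usage pointer.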

[aneesh.kumar@linux.ibm.com: add comment]
  Link: http://lkml.kernel.org/r/20200326133235.343616-1-aneesh.kumar@linux.ibm.com
Link: http://lkml.kernel.org/r/20200325031914.107660-1-aneesh.kumar@linux.ibm.com
Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/sparse.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/mm/sparse.c~mm-sparse-fix-kernel-crash-with-pfn_section_valid-check
+++ a/mm/sparse.c
@@ -781,6 +781,12 @@ static void section_deactivate(unsigned
 			ms->usage = NULL;
 		}
 		memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
+		/*
+		 * Mark the section invalid so that valid_section()
+		 * return false. This prevents code from dereferencing
+		 * ms->usage array.
+		 */
+		ms->section_mem_map &= ~SECTION_HAS_MEM_MAP;
 	}
 
 	if (section_is_early && memmap)
_



* Re: [patch 4/5] mm: fork: fix kernel_stack memcg stats for various stack implementations
  2020-03-29  2:17 ` [patch 4/5] mm: fork: fix kernel_stack memcg stats for various stack implementations Andrew Morton
@ 2020-03-29 16:41   ` Linus Torvalds
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2020-03-29 16:41 UTC
  To: Andrew Morton
  Cc: bharata, Roman Gushchin, Johannes Weiner, Linux-MM, Michal Hocko,
	mm-commits, Shakeel Butt, stable

On Sat, Mar 28, 2020 at 7:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Note: This is a special version of the patch created for stable
> backports. It contains code from the following two patches:
>   - mm: memcg/slab: introduce mem_cgroup_from_obj()
>   - mm: fork: fix kernel_stack memcg stats for various stack implementations

Whaa?

Nonsensical commit comment removed.

                Linus



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29  2:17 ` [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable Andrew Morton
@ 2020-03-29 16:46   ` Linus Torvalds
  2020-03-29 18:03     ` David Hildenbrand
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2020-03-29 16:46 UTC
  To: Andrew Morton
  Cc: Dan Williams, David Hildenbrand, Greg Kroah-Hartman,
	Heiko Carstens, Karel Zak, Linux-MM, Michal Hocko, Michal Hocko,
	mm-commits, ndfont, pbadari, Rafael Wysocki, rcj, stable,
	steve.scargall

Please, David H - whatever you do with email is WRONG.

Fix your completely broken email client. Stop doing this.

On Sat, Mar 28, 2020 at 7:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> From: David Hildenbrand <david@redhat.com>
>  [...]
> According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer
> driven from userspace via the drmgr command (powerpc-utils). Nowadays
> it's managed in the kernel - including onlining/offlining of memory
> blocks - triggered by drmgr writing to /sys/kernel/dlpar. So the
> affected legacy userspace handling is only active on old kernels. Only ve=
> ry
> old versions of drmgr on a new kernel (unlikely) might execute slower -
> totally acceptable.
>
> With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
> break any user space tool. We implement a very bad heuristic now.  Withou=
> t
> CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
> "not removable" as before.

Notice the bogus MIME line continuation left-overs?

  [...] Only ve=
  ry

and

  [...] Withou=
  t

is just completely wrong.

You either have a completely broken email client that doesn't handle
MIME at all - get rid of it - or you're then dealing with raw mbox
data in a completely broken manner without handling MIME wrapping.
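
For reference, the "=" at the end of a line is a quoted-printable soft
line break, and a decoder is expected to drop it together with the line
ending.  The following is a minimal sketch of such a decoder, not the code
of any particular mail tool; the names qp_decode and hex_val are
hypothetical.  It also decodes "=XX" hex escapes and copies everything
else through unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Decode a NUL-terminated quoted-printable string into out, which must
 * have room for strlen(in) + 1 bytes.  Returns the decoded length.
 */
static size_t qp_decode(const char *in, char *out)
{
	size_t n = 0;

	assert(in != NULL && out != NULL);
	while (*in) {
		if (in[0] == '=' && in[1] == '\n') {
			in += 2;	/* soft line break (LF) */
		} else if (in[0] == '=' && in[1] == '\r' && in[2] == '\n') {
			in += 3;	/* soft line break (CRLF) */
		} else if (in[0] == '=' && hex_val(in[1]) >= 0 &&
			   hex_val(in[2]) >= 0) {
			out[n++] = (char)(hex_val(in[1]) * 16 + hex_val(in[2]));
			in += 3;	/* "=XX" escaped byte */
		} else {
			out[n++] = *in++;	/* literal byte */
		}
	}
	out[n] = '\0';
	return n;
}
```

Fed the damaged text "Only ve=" followed by a newline and "ry", this
yields "Only very", which is what the commit message was meant to say.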

I can't figure out _what_ you're doing wrong, but the pattern is clear
by now: it's not Andrew (although Andrew should check explanations
better!), since it _only_ happens with patches from David Hildenbrand.

Fix your workflow. Because it's broken.

             Linus



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 16:46   ` Linus Torvalds
@ 2020-03-29 18:03     ` David Hildenbrand
  2020-03-29 18:28       ` Linus Torvalds
  2020-03-29 19:43       ` Konstantin Ryabitsev
  0 siblings, 2 replies; 20+ messages in thread
From: David Hildenbrand @ 2020-03-29 18:03 UTC
  To: Linus Torvalds, Andrew Morton
  Cc: Dan Williams, Greg Kroah-Hartman, Heiko Carstens, Karel Zak,
	Linux-MM, Michal Hocko, Michal Hocko, mm-commits, ndfont,
	pbadari, Rafael Wysocki, rcj, stable, steve.scargall

On 29.03.20 18:46, Linus Torvalds wrote:
> Please, David H - whatever you do with email is WRONG.
> 
> Fix your completely broken email client. Stop doing this.

I'm really sorry this happened again - but I am afraid it's
not my email client that's broken :(

I've been using bare "git format-patch" + "git send-email" to
send out patches for years now, so I don't see what's wrong about that.

Now, I took a look at what arrived in my mail box (via cc) without
doing round trips through the RH mailing infrastructure and
what arrived via the mailing list:


What I received via CC (without anybody messing with my mail content):

Message-Id: <20200128093542.6908-1-david@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
[...]

And none of that "=" MIME crap.


What I received via the mailing list (e.g., linux-mm@kvack.org)

Message-Id: <20200128093542.6908-1-david@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>
[...]
X-Mimecast-Spam-Score: 1
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4
[...]

And a lot of this MIME crap.

I have no idea if such a conversion is expected to be done.

> 
> On Sat, Mar 28, 2020 at 7:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> From: David Hildenbrand <david@redhat.com>
>>  [...]
>> According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer
>> driven from userspace via the drmgr command (powerpc-utils). Nowadays
>> it's managed in the kernel - including onlining/offlining of memory
>> blocks - triggered by drmgr writing to /sys/kernel/dlpar. So the
>> affected legacy userspace handling is only active on old kernels. Only ve=
>> ry
>> old versions of drmgr on a new kernel (unlikely) might execute slower -
>> totally acceptable.
>>
>> With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
>> break any user space tool. We implement a very bad heuristic now.  Withou=
>> t
>> CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
>> "not removable" as before.
> 
> Notice the bogus MIME line continuation left-overs?
> 
>   [...] Only ve=
>   ry
> 
> and
> 
>   [...] Withou=
>   t
> 
> is just completely wrong.

Yes, absolutely broken.

> You either have a completely broken email client that doesn't handle
> MIME at all - get rid of it - or you're then dealing with raw mbox
> data in a completely broken manner without handling MIME wrapping.

Again, just using git send-email :/

> 
> I can't figure out _what_ you're doing wrong, but the pattern is clear
> by now: it's not Andrew (although Andrew should check explanations
> better!), since it _only_ happens with patches from David Hildenbrand.
> 
> Fix your workflow. Because it's broken.

Let's have a look at some stuff I sent out during the last weeks:

https://lkml.kernel.org/r/20200319131221.14044-1-david@redhat.com

Already the cover letter looks horrible:

https://lore.kernel.org/linux-hyperv/20200319131221.14044-1-david@redhat.com/raw

And *again* does not match at all what I received directly via cc in my
mail box (== what I expect was sent via "git send-email").

Even the patch content has been partially converted

https://lore.kernel.org/linux-hyperv/20200319131221.14044-2-david@redhat.com/raw


Unless I am missing something important, the issue is not in mail client
setup, but there is something in the mailing infrastructure horribly
messing with my mails. Red Hat has recently switched to Mimecast and
there have been plenty of issues, maybe this is one of these.

I guess the only thing I can do is sending mails via a different mail
server / different email address?

I'll double check if any other patches still queued in -next are
similarly broken. I guess so.

-- 
Thanks,

David / dhildenb




* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 18:03     ` David Hildenbrand
@ 2020-03-29 18:28       ` Linus Torvalds
  2020-03-29 19:11         ` David Hildenbrand
  2020-03-29 19:43       ` Konstantin Ryabitsev
  1 sibling, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2020-03-29 18:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, Dan Williams, Greg Kroah-Hartman, Heiko Carstens,
	Karel Zak, Linux-MM, Michal Hocko, Michal Hocko, mm-commits,
	ndfont, pbadari, Rafael Wysocki, rcj, stable, steve.scargall

On Sun, Mar 29, 2020 at 11:04 AM David Hildenbrand <david@redhat.com> wrote:
>
>
> What I received via the mailing list (e.g., linux-mm@kvack.org)
>
> Message-Id: <20200128093542.6908-1-david@redhat.com>
> MIME-Version: 1.0
> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
> Sender: owner-linux-mm@kvack.org
> Precedence: bulk
> X-Loop: owner-majordomo@kvack.org
> List-ID: <linux-mm.kvack.org>
> [...]
> X-Mimecast-Spam-Score: 1
> Content-Type: text/plain; charset=US-ASCII
> Content-Transfer-Encoding: quoted-printable
> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4
> [...]
>
> And a lot of this MIME crap.

Well, that may still be a perfectly fine email.

Yes, it has the MIME crap, but it also has that

  Content-Transfer-Encoding: quoted-printable

which should tell all users how to _handle_ that MIME crap.

It's sad that people in this day and age still don't just handle

  Content-Transfer-Encoding: 8bit

and just send it on untouched, but SMTP certainly encourages that bad
behavior of "convert to 7-bit MIME crap", because in theory there
could be SMTP servers out there that can't handle anything 8-bit or
with longer lines.

Those SMTP servers should just be scrapped and people told not to use
them, but sadly that's not the approach email people have taken.
They've taken the approach that old garbage SMTP servers should be
allowed to exist and destroy email for the rest of us.

> I have no idea if such a conversion is expected to be done.

It is (sadly) expected to be done by a lot of mail software.

But the problem is that some part of your email handling code then
doesn't _undo_ the MIME conversion, and leaves the MIME turds alone,
while then that "Content-Transfer-Encoding: quoted-printable" got
lost.

Do you at any point end up using a raw mbox and cut-and-pasting stuff?
Reading email in a broken mail-reader that doesn't undo MIME? Because
that's the usual way that these kinds of turds get copied.. Using raw
emails without honoring or taking that "Content-Transfer-Encoding"
into account.
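
A minimal sketch of what "undoing the MIME conversion" means, assuming Python's standard `email` package stands in for a well-behaved mail reader. The message below is a cut-down illustration built from the commit text quoted earlier in this thread, not the actual mail, and the variable names are hypothetical. It contrasts the raw payload (what you get from cut-and-pasting out of an mbox) with the payload decoded according to the Content-Transfer-Encoding header.

```python
from email import message_from_string
from email.policy import default

# Cut-down illustrative message: headers plus two body lines as they look
# on the wire after a relay re-encoded them as quoted-printable.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=US-ASCII\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "affected legacy userspace handling is only active on old kernels. Only ve=\n"
    "ry\n"
    "old versions of drmgr on a new kernel (unlikely) might execute slower -\n"
)

msg = message_from_string(raw, policy=default)

# Raw payload: the soft line break ("=" at end of line) is still in the text.
raw_body = msg.get_payload()

# Decoded payload: the header tells the parser to undo quoted-printable.
decoded_body = msg.get_content()

print(repr(raw_body))
print(repr(decoded_body))
```

If the raw body is copied somewhere that no longer carries the header (a commit message, for instance), nothing downstream knows that the trailing "=" was an encoding artifact, which would match the symptom seen here.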

                   Linus



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 18:28       ` Linus Torvalds
@ 2020-03-29 19:11         ` David Hildenbrand
  2020-03-29 19:15           ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2020-03-29 19:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Dan Williams, Greg Kroah-Hartman, Heiko Carstens,
	Karel Zak, Linux-MM, Michal Hocko, Michal Hocko, mm-commits,
	ndfont, pbadari, Rafael Wysocki, rcj, stable, steve.scargall

> Well, that may still be a perfectly fine email.
> 
> Yes, it has the MIME crap, but it also has that
> 
>   Content-Transfer-Encoding: quoted-printable
> 
> which should tell all users how to _handle_ that MIME crap.
> 
> It's sad that people in this day and age still don't just handle
> 
>   Content-Transfer-Encoding: 8bit
> 
> and just send it on untouched, but SMTP certainly encourages that bad
> behavior of "convert to 7-bit MIME crap", because in theory there
> could be SMTP servers out there that can't handle anything 8-bit or
> with longer lines.
> 
> Those SMTP servers should just be scrapped and people told not to use
> them, but sadly that's not the approach email people have taken.
> They've taken the approach that old garbage SMTP servers should be
> allowed to exist and destroy email for the rest of us.

Yeah, would save us trouble :)

> 
>> I have no idea if such a conversion is expected to be done.
> 
> It is (sadly) expected to be done by a lot of mail software.
> 
> But the problem is that some part of your email handling code then
> doesn't _undo_ the MIME conversion, and leaves the MIME turds alone,
> while then that "Content-Transfer-Encoding: quoted-printable" got
> lost.
> 
> Do you at any point end up using a raw mbox and cut-and-pasting stuff?

Just Thunderbird for reading, and vim for editing. Really nothing
special. In this specific case, I don't think I copied anything back and
forth. Just a simple git commit and editing the message in vim.

The mail was sent around the same time the other two (?) broken mails
showed up (end of January/beginning of February) and ended up in your
mail box.


I checked my other patches that are in -next. All (especially the stuff
I sent recently) seem to be fine except one remaining patch, sent end of
February IIRC:

https://lkml.kernel.org/r/20200228095819.10750-2-david@redhat.com

It also has this issue with long lines in one instance. And for that
patch, I still have the original commit lying around here. Did a fresh
format-patch+send-email to another mail address (via RH mailing
infrastructure). Again, converted to

Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

But the MIME crap (for the newline) is gone. So the issue seems to be fixed.

> Reading email in a broken mail-reader that doesn't undo MIME? Because
> that's the usual way that these kinds of turds get copied.. Using raw
> emails without honoring or taking that "Content-Transfer-Encoding"
> into account.

Again, sorry for the trouble, I suspect bad mailing infrastructure that
has been fixed. Will pay attention if this starts happening again, and
then switch to another mail server/mail address, because something
within RH mailing infrastructure is making our life more difficult than
it should be.

-- 
Thanks,

David / dhildenb




* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 19:11         ` David Hildenbrand
@ 2020-03-29 19:15           ` Linus Torvalds
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2020-03-29 19:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, Dan Williams, Greg Kroah-Hartman, Heiko Carstens,
	Karel Zak, Linux-MM, Michal Hocko, Michal Hocko, mm-commits,
	ndfont, pbadari, Rafael Wysocki, rcj, stable, steve.scargall

On Sun, Mar 29, 2020 at 12:11 PM David Hildenbrand <david@redhat.com> wrote:
>
> Again, sorry for the trouble, I suspect bad mailing infrastructure that
> has been fixed. Will pay attention if this starts happening again, and
> then switch to another mail server/mail address, because something
> within RH mailing infrastructure is making our life more difficult than
> it should be.

Ok, thanks. Fingers crossed that it's all behind us,

               Linus



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 18:03     ` David Hildenbrand
  2020-03-29 18:28       ` Linus Torvalds
@ 2020-03-29 19:43       ` Konstantin Ryabitsev
  2020-03-29 19:55         ` Linus Torvalds
  2020-03-30 14:54         ` David Hildenbrand
  1 sibling, 2 replies; 20+ messages in thread
From: Konstantin Ryabitsev @ 2020-03-29 19:43 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linus Torvalds, Andrew Morton, Dan Williams, Greg Kroah-Hartman,
	Heiko Carstens, Karel Zak, Linux-MM, Michal Hocko, Michal Hocko,
	mm-commits, ndfont, pbadari, Rafael Wysocki, rcj, stable,
	steve.scargall

On Sun, Mar 29, 2020 at 08:03:52PM +0200, David Hildenbrand wrote:
> > Please, David H - whatever you do with email is WRONG.
> > 
> > Fix your completely broken email client. Stop doing this.

To butt in uninvited into this conversation, it doesn't look like 
there's anything really wrong with what David sends, except for the 
quoted-printable formatting, which is probably converted automatically 
by one of Red Hat's relay MTAs.

> What I received via the mailing list (e.g., linux-mm@kvack.org)
> 
> Message-Id: <20200128093542.6908-1-david@redhat.com>
> MIME-Version: 1.0
> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
> Sender: owner-linux-mm@kvack.org
> Precedence: bulk
> X-Loop: owner-majordomo@kvack.org
> List-ID: <linux-mm.kvack.org>
> [...]
> X-Mimecast-Spam-Score: 1
> Content-Type: text/plain; charset=US-ASCII
> Content-Transfer-Encoding: quoted-printable
> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4
> [...]
> 
> And a lot of this MIME crap.
> 
> I have no idea if such a conversion is expected to be done.

In theory, it doesn't really matter, as mail clients are supposed to be 
properly undoing all this 7-bit legacy madness. When we run that thread 
through "b4 am" to get things back into 8bit, everything looks just 
fine. You can try it yourself:

b4 am 20200128093542.6908-1-david@redhat.com

> Unless I am missing something important, the issue is not in mail client
> setup, but there is something in the mailing infrastructure horribly
> messing with my mails. Red Hat has recently switched to Mimecast and
> there have been plenty of issues, maybe this is one of these.
> 
> I guess the only thing I can do is sending mails via a different mail
> server / different email address?

It would appear that the workflow Andrew uses to queue up patches from 
you isn't expecting quoted-printable formatting, which is why when Linus 
gets them, they are mangled.

We would either need to switch Andrew to a set of tools that handle 7bit 
legacy formats better, or figure out how you can send things via MTAs 
that won't convert from 8bit to quoted-printable. Maybe you can convince 
Red Hat to set up their relays to always preserve 8bit?

-K



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 19:43       ` Konstantin Ryabitsev
@ 2020-03-29 19:55         ` Linus Torvalds
  2020-03-29 19:56           ` Linus Torvalds
  2020-03-30 14:46           ` Konstantin Ryabitsev
  2020-03-30 14:54         ` David Hildenbrand
  1 sibling, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2020-03-29 19:55 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: David Hildenbrand, Andrew Morton, Dan Williams,
	Greg Kroah-Hartman, Heiko Carstens, Karel Zak, Linux-MM,
	Michal Hocko, Michal Hocko, mm-commits, ndfont, pbadari,
	Rafael Wysocki, rcj, stable, steve.scargall

On Sun, Mar 29, 2020 at 12:43 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> It would appear that the workflow Andrew uses to queue up patches from
> you isn't expecting quoted-printable formatting, which is why when Linus
> gets them, they are mangled.

I don't think that's the case.

Why? Because I see _proper_ handling of MIME and quoted-printable from
Andrew all the time.

For example, anything from Jérôme Glisse always ends up having been
quoted-printable, simply because of how Jérôme's emails look, and
because he has 8-bit characters in his name.

There are other examples of the same thing - a lot of the emails I get
from Andrew do end up having quoted-printable encoding.

It's only David's patches that then end up having lost the encoding
marker, but have QP sequences in the commit message.

The odd thing is that the *patches* are fine, even if they have equals
signs etc that would have been QP-encoded too. So it's literally just
the commit message that tends to be corrupt.
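
As a rough illustration of the point about equals signs, here is a small sketch using Python's standard `quopri` module as a stand-in for whatever encoder the relay uses. The diff line is a made-up example. A literal "=" in the body is escaped on the wire, and a decoder that honors the encoding restores it.

```python
import quopri

# Hypothetical diff line containing a literal equals sign.
diff_line = "+    ret = 0;\n"

# Quoted-printable escapes "=" so that it cannot be confused with the
# escape character itself.
encoded = quopri.encodestring(diff_line.encode("ascii"))

# Decoding restores the original bytes.
decoded = quopri.decodestring(encoded).decode("ascii")

print(encoded)
print(decoded, end="")
```

A patch body that went through the wire encoded and still applies cleanly must therefore have been decoded somewhere along the way.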

Which is why I was suspecting people cut-and-pasting the raw emails
for examplanations or something similar.

                Linus



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 19:55         ` Linus Torvalds
@ 2020-03-29 19:56           ` Linus Torvalds
  2020-03-30 14:46           ` Konstantin Ryabitsev
  1 sibling, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2020-03-29 19:56 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: David Hildenbrand, Andrew Morton, Dan Williams,
	Greg Kroah-Hartman, Heiko Carstens, Karel Zak, Linux-MM,
	Michal Hocko, Michal Hocko, mm-commits, ndfont, pbadari,
	Rafael Wysocki, rcj, stable, steve.scargall

On Sun, Mar 29, 2020 at 12:55 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Which is why I was suspecting people cut-and-pasting the raw emails
> for examplanations or something similar.

"examplantions"? Really?

I had some kind of mini-stroke. Example/explanations ;)

             Linus



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 19:55         ` Linus Torvalds
  2020-03-29 19:56           ` Linus Torvalds
@ 2020-03-30 14:46           ` Konstantin Ryabitsev
  1 sibling, 0 replies; 20+ messages in thread
From: Konstantin Ryabitsev @ 2020-03-30 14:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Hildenbrand, Andrew Morton, Dan Williams,
	Greg Kroah-Hartman, Heiko Carstens, Karel Zak, Linux-MM,
	Michal Hocko, Michal Hocko, mm-commits, ndfont, pbadari,
	Rafael Wysocki, rcj, stable, steve.scargall

On Sun, Mar 29, 2020 at 12:55:09PM -0700, Linus Torvalds wrote:
> On Sun, Mar 29, 2020 at 12:43 PM Konstantin Ryabitsev
> <konstantin@linuxfoundation.org> wrote:
> >
> > It would appear that the workflow Andrew uses to queue up patches from
> > you isn't expecting quoted-printable formatting, which is why when Linus
> > gets them, they are mangled.
> 
> I don't think that's the case.
> 
> Why? Because I see _proper_ handling of MIME and quoted-printable from
> Andrew all the time.

Hmm... You are correct. I see that Naohiro Aota's original patch was 
also QP-encoded. I'm just as confused as everyone, then. :) As far as I 
can tell, there is no meaningful difference between David's emails and 
Naohiro's:

https://lore.kernel.org/linux-mm/20200206090132.154869-1-naohiro.aota@wdc.com/raw
https://lore.kernel.org/linux-mm/20200128093542.6908-1-david@redhat.com/raw

David's original patch is well-formed and the only notable difference 
between the two is that there's a line of ==== in Naohiro's email that 
makes it immediately obvious that the message needs to be decoded before 
it can be used.

-K



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-29 19:43       ` Konstantin Ryabitsev
  2020-03-29 19:55         ` Linus Torvalds
@ 2020-03-30 14:54         ` David Hildenbrand
  2020-04-01 13:35           ` David Hildenbrand
  1 sibling, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2020-03-30 14:54 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Linus Torvalds, Andrew Morton, Dan Williams, Greg Kroah-Hartman,
	Heiko Carstens, Karel Zak, Linux-MM, Michal Hocko, Michal Hocko,
	mm-commits, ndfont, pbadari, Rafael Wysocki, rcj, stable,
	steve.scargall

> It would appear that the workflow Andrew uses to queue up patches from 
> you isn't expecting quoted-printable formatting, which is why when Linus 
> gets them, they are mangled.
> 
> We would either need to switch Andrew to a set of tools that handle 7bit 
> legacy formats better, or figure out how you can send things via MTAs 
> that won't convert from 8bit to quoted-printable. Maybe you can convince 
> Red Hat to set up their relays to always preserve 8bit?

I'll give it a try, but I think it's rather unlikely ... :)


-- 
Thanks,

David / dhildenb




* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-03-30 14:54         ` David Hildenbrand
@ 2020-04-01 13:35           ` David Hildenbrand
  2020-04-01 15:33             ` Konstantin Ryabitsev
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2020-04-01 13:35 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Linus Torvalds, Andrew Morton, Dan Williams, Greg Kroah-Hartman,
	Heiko Carstens, Karel Zak, Linux-MM, Michal Hocko, Michal Hocko,
	mm-commits, ndfont, pbadari, Rafael Wysocki, rcj, stable,
	steve.scargall

On 30.03.20 16:54, David Hildenbrand wrote:
>> It would appear that the workflow Andrew uses to queue up patches from 
>> you isn't expecting quoted-printable formatting, which is why when Linus 
>> gets them, they are mangled.
>>
>> We would either need to switch Andrew to a set of tools that handle 7bit 
>> legacy formats better, or figure out how you can send things via MTAs 
>> that won't convert from 8bit to quoted-printable. Maybe you can convince 
>> Red Hat to set up their relays to always preserve 8bit?
> 
> I'll give it a try, but I think it's rather unlikely ... :)

So, people are looking into. Literally any mail that goes via Mimecast
servers (at least sent by me!) is converted *for whatever reason* to
quoted-printable.

E.g., patches I punched out today via "git send-email" even have the
line continuations thingy again (they disappeared for a while, maybe
there are different MTAs involved and it's like playing the lottery)

https://lore.kernel.org/linux-mm/20200401104156.11564-2-david@redhat.com/raw

From what I can tell the mail itself is fine once converted, it's just
nasty that 8-bit is converted *for whatever reason*.

-- 
Thanks,

David / dhildenb




* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-04-01 13:35           ` David Hildenbrand
@ 2020-04-01 15:33             ` Konstantin Ryabitsev
  2020-04-01 15:40               ` David Hildenbrand
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantin Ryabitsev @ 2020-04-01 15:33 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linus Torvalds, Andrew Morton, Dan Williams, Greg Kroah-Hartman,
	Heiko Carstens, Karel Zak, Linux-MM, Michal Hocko, Michal Hocko,
	mm-commits, ndfont, pbadari, Rafael Wysocki, rcj, stable,
	steve.scargall

On Wed, Apr 01, 2020 at 03:35:01PM +0200, David Hildenbrand wrote:
> >> We would either need to switch Andrew to a set of tools that handle 
> >> 7bit legacy formats better, or figure out how you can send things 
> >> via MTAs that won't convert from 8bit to quoted-printable. Maybe 
> >> you can convince Red Hat to set up their relays to always preserve 
> >> 8bit?
> > 
> > I'll give it a try, but I think it's rather unlikely ... :)
> 
> So, people are looking into. Literally any mail that goes via Mimecast
> servers (at least sent by me!) is converted *for whatever reason* to
> quoted-printable.

I mean, it's not *wrong* to do that -- older mail standards required 
that all MTA-to-MTA communication should be done in 7bit. But we're 
literally talking previous-century legacy protocols here. Forcefully 
converting all mail to 7bit is about the most 90s thing you can do these 
days, short of being really into mullets and Arsenio Hall.

> E.g., patches I punched out today via "git send-email" even have the
> line continuations thingy again (they disappeared for a while, maybe
> there are different MTAs involved and it's like playing the lottery)

Those show up when your lines are longer than 76 characters. Because, 
you know, otherwise the message would be too wide to fit through the 
ethernet cable.

https://en.wikipedia.org/wiki/Quoted-printable#Quoted-Printable_encoding
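
A small sketch of the 76-character rule, assuming Python's standard `quopri` module behaves like the relay's encoder in this respect. The long line is a made-up example. Lines over the limit are split with a soft line break, an "=" as the last character of the encoded line, and a decoder joins them back together.

```python
import quopri

# Hypothetical commit-message line, well over 76 characters.
long_line = (
    "With CONFIG_MEMORY_HOTREMOVE, always indicating removable should not "
    "break any user space tool, so this single line is deliberately too long.\n"
)

encoded = quopri.encodestring(long_line.encode("ascii")).decode("ascii")
encoded_lines = encoded.splitlines()

# Every encoded line fits the limit; the broken ones end in "=".
for line in encoded_lines:
    print(len(line), repr(line))

# Decoding removes the soft line breaks and restores the single line.
restored = quopri.decodestring(encoded.encode("ascii")).decode("ascii")
```

A short line goes through unchanged, which would fit the observation that the continuation markers only appear in some mails.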

Best regards,
-K



* Re: [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable
  2020-04-01 15:33             ` Konstantin Ryabitsev
@ 2020-04-01 15:40               ` David Hildenbrand
  0 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2020-04-01 15:40 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Linus Torvalds, Andrew Morton, Dan Williams, Greg Kroah-Hartman,
	Heiko Carstens, Karel Zak, Linux-MM, Michal Hocko, Michal Hocko,
	mm-commits, ndfont, pbadari, Rafael Wysocki, rcj, stable,
	steve.scargall

On 01.04.20 17:33, Konstantin Ryabitsev wrote:
> On Wed, Apr 01, 2020 at 03:35:01PM +0200, David Hildenbrand wrote:
>>>> We would either need to switch Andrew to a set of tools that handle 
>>>> 7bit legacy formats better, or figure out how you can send things 
>>>> via MTAs that won't convert from 8bit to quoted-printable. Maybe 
>>>> you can convince Red Hat to set up their relays to always preserve 
>>>> 8bit?
>>>
>>> I'll give it a try, but I think it's rather unlikely ... :)
>>
>> So, people are looking into. Literally any mail that goes via Mimecast
>> servers (at least sent by me!) is converted *for whatever reason* to
>> quoted-printable.
> 
> I mean, it's not *wrong* to do that -- older mail standards required 
> that all MTA-to-MTA communication should be done in 7bit. But we're 
> literally talking previous-century legacy protocols here. Forcefully 
> converting all mail to 7bit is about the most 90s thing you can do these 
> days, short of being really into mullets and Arsenio Hall.

The last sentence really made my day, thanks :D

> 
>> E.g., patches I punched out today via "git send-email" even have the
>> line continuations thingy again (they disappeared for a while, maybe
>> there are different MTAs involved and it's like playing the lottery)
> 
> Those show up when your lines are longer than 76 characters. Because, 
> you know, otherwise the message would be too wide to fit through the 
> ethernet cable.

Yeah, however, the mail servers I'm using are not doing this
consistently. Maybe some of them are more advanced than others :)

Let's see if IT can teach these mail servers about the 21st century ...

-- 
Thanks,

David / dhildenb




Thread overview: 20+ messages
2020-03-29  2:14 incoming Andrew Morton
2020-03-29  2:17 ` [patch 1/5] mm/swapfile.c: move inode_lock out of claim_swapfile Andrew Morton
2020-03-29  2:17 ` [patch 2/5] drivers/base/memory.c: indicate all memory blocks as removable Andrew Morton
2020-03-29 16:46   ` Linus Torvalds
2020-03-29 18:03     ` David Hildenbrand
2020-03-29 18:28       ` Linus Torvalds
2020-03-29 19:11         ` David Hildenbrand
2020-03-29 19:15           ` Linus Torvalds
2020-03-29 19:43       ` Konstantin Ryabitsev
2020-03-29 19:55         ` Linus Torvalds
2020-03-29 19:56           ` Linus Torvalds
2020-03-30 14:46           ` Konstantin Ryabitsev
2020-03-30 14:54         ` David Hildenbrand
2020-04-01 13:35           ` David Hildenbrand
2020-04-01 15:33             ` Konstantin Ryabitsev
2020-04-01 15:40               ` David Hildenbrand
2020-03-29  2:17 ` [patch 3/5] hugetlb_cgroup: fix illegal access to memory Andrew Morton
2020-03-29  2:17 ` [patch 4/5] mm: fork: fix kernel_stack memcg stats for various stack implementations Andrew Morton
2020-03-29 16:41   ` Linus Torvalds
2020-03-29  2:17 ` [patch 5/5] mm/sparse: fix kernel crash with pfn_section_valid check Andrew Morton
