linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: David Hildenbrand <david@redhat.com>, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	David Hildenbrand <david@redhat.com>,
	Michal Hocko <mhocko@kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Rashmica Gupta <rashmica.g@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>, Michal Hocko <mhocko@suse.com>,
	Oscar Salvador <osalvador@suse.de>,
	Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: Re: [PATCH v1 4/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations
Date: Thu, 05 Nov 2020 13:40:42 +1100	[thread overview]
Message-ID: <87o8kcttjp.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <20201029162718.29910-5-david@redhat.com>

David Hildenbrand <david@redhat.com> writes:
> Let's use alloc_contig_pages() for allocating memory and remove the
> linear mapping manually via arch_remove_linear_mapping(). Mark all pages
> PG_offline, such that they will definitely not get touched - e.g.,
> when hibernating. When freeing memory, try to revert what we did.
>
> The original idea was discussed in:
>  https://lkml.kernel.org/r/48340e96-7e6b-736f-9e23-d3111b915b6e@redhat.com
>
> This is similar to CONFIG_DEBUG_PAGEALLOC handling on other
> architectures, whereby only single pages are unmapped from the linear
> mapping. Let's mimic what memory hot(un)plug would do with the linear
> mapping.
>
> We now need MEMORY_HOTPLUG and CONTIG_ALLOC as dependencies.
>
> Simple test under QEMU TCG (10GB RAM, single NUMA node):
>
> sh-5.0# mount -t debugfs none /sys/kernel/debug/
> sh-5.0# cat /sys/devices/system/memory/block_size_bytes
> 40000000
> sh-5.0# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
> [   71.052836][  T356] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
> sh-5.0# echo 0x80000000 > /sys/kernel/debug/powerpc/memtrace/enable
> [   75.424302][  T356] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
> [   75.430549][  T356] memtrace: Freed trace memory back on node 0
> [   75.604520][  T356] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
> sh-5.0# echo 0x100000000 > /sys/kernel/debug/powerpc/memtrace/enable
> [   80.418835][  T356] radix-mmu: Mapped 0x0000000080000000-0x0000000100000000 with 64.0 KiB pages
> [   80.430493][  T356] memtrace: Freed trace memory back on node 0
> [   80.433882][  T356] memtrace: Failed to allocate trace memory on node 0
> sh-5.0# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
> [   91.920158][  T356] memtrace: Allocated trace memory on node 0 at 0x0000000080000000

I gave this a quick spin on a real machine, seems to work OK.

I don't have the actual memtrace tools setup to do an actual trace, will
try and get someone to test that also.

One observation is that previously the memory was zeroed when enabling
the memtrace, whereas now it's not.

eg, before:

  # hexdump -C /sys/kernel/debug/powerpc/memtrace/00000000/trace 
  00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  10000000

whereas after:

  # hexdump -C /sys/kernel/debug/powerpc/memtrace/00000000/trace
  00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  00000080  e0 fd 43 00 00 00 00 00  e0 fd 43 00 00 00 00 00  |..C.......C.....|
  00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  00000830  98 bf 39 00 00 00 00 00  98 bf 39 00 00 00 00 00  |..9.......9.....|
  00000840  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  000008a0  b0 c8 47 00 00 00 00 00  b0 c8 47 00 00 00 00 00  |..G.......G.....|
  000008b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  ...
  0fffff70  78 53 49 7d 00 00 29 2e  88 00 92 41 01 00 49 39  |xSI}..)....A..I9|
  0fffff80  b4 07 4a 7d 28 f8 00 7d  00 48 08 7c 0c 00 c2 40  |..J}(..}.H.|...@|
  0fffff90  2d f9 40 7d f0 ff c2 40  b4 07 0a 7d 00 48 8a 7f  |-.@}...@...}.H..|
  0fffffa0  70 fe 9e 41 cc ff ff 4b  00 00 00 60 00 00 00 60  |p..A...K...`...`|
  0fffffb0  01 00 00 48 00 00 00 60  00 00 a3 2f 0c fd 9e 40  |...H...`.../...@|
  0fffffc0  00 00 a2 3c 00 00 a5 e8  00 00 62 3c 00 00 63 e8  |...<......b<..c.|
  0fffffd0  01 00 20 39 83 02 80 38  00 00 3c 99 01 00 00 48  |.. 9...8..<....H|
  0fffffe0  00 00 00 60 e4 fc ff 4b  00 00 80 38 78 fb e3 7f  |...`...K...8x...|
  0ffffff0  01 00 00 48 00 00 00 60  2c fe ff 4b 00 00 00 60  |...H...`,..K...`|
  10000000


That's a nice way for root to read kernel memory, so we should probably
add a __GFP_ZERO or memset in there somewhere.

cheers

  parent reply	other threads:[~2020-11-05  2:40 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-29 16:27 [PATCH v1 0/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations David Hildenbrand
2020-10-29 16:27 ` [PATCH v1 1/4] powerpc/mm: factor out creating/removing linear mapping David Hildenbrand
2020-10-29 16:27 ` [PATCH v1 2/4] powerpc/mm: print warning in arch_remove_linear_mapping() David Hildenbrand
2020-11-04  9:42   ` osalvador
2020-11-11 12:10     ` David Hildenbrand
2020-10-29 16:27 ` [PATCH v1 3/4] powerpc/mm: remove linear mapping if __add_pages() fails in arch_add_memory() David Hildenbrand
2020-11-04  9:50   ` osalvador
2020-11-04 12:06     ` Mike Rapoport
2020-11-04 12:11       ` Oscar Salvador
2020-11-11 12:07         ` David Hildenbrand
2020-11-04 12:11   ` Oscar Salvador
2020-10-29 16:27 ` [PATCH v1 4/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations David Hildenbrand
2020-11-03  9:23   ` Michal Hocko
2020-11-03  9:29     ` David Hildenbrand
2020-11-05  2:40   ` Michael Ellerman [this message]
2020-11-05  8:29     ` David Hildenbrand
2020-11-05 10:47       ` Michael Ellerman
2020-11-25 11:57 ` [PATCH v1 0/4] " Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o8kcttjp.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=paulus@samba.org \
    --cc=rashmica.g@gmail.com \
    --cc=richard.weiyang@linux.alibaba.com \
    --cc=rppt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).