linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Add results of early memtest to /proc/meminfo
@ 2023-03-17 19:30 Tomáš Mudruňka
  2023-03-17 23:56 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Tomáš Mudruňka @ 2023-03-17 19:30 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton; +Cc: linux-kernel, linux-fsdevel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]

Currently the memtest results were only presented in dmesg.
This adds /proc/meminfo entry which can be easily used by scripts.

Signed-off-by: Tomas Mudrunka <tomas.mudrunka@gmail.com>
---
 fs/proc/meminfo.c        | 13 +++++++++++++
 include/linux/memblock.h |  2 ++
 mm/memtest.c             |  5 +++++
 3 files changed, 20 insertions(+)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 440960110..844bb7e17 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -6,6 +6,7 @@
 #include <linux/hugetlb.h>
 #include <linux/mman.h>
 #include <linux/mmzone.h>
+#include <linux/memblock.h>
 #include <linux/proc_fs.h>
 #include <linux/percpu.h>
 #include <linux/seq_file.h>
@@ -131,6 +132,18 @@ static int meminfo_proc_show(struct seq_file *m, void
*v)
  show_val_kb(m, "VmallocChunk:   ", 0ul);
  show_val_kb(m, "Percpu:         ", pcpu_nr_pages());

+#ifdef CONFIG_MEMTEST
+ /* Only show 0 Bad memory when test was actually run.
+ * Make sure bad regions smaller than 1kB are not reported as 0.
+ * That way when 0 is reported we can be sure there actually was
successful test */
+ if (early_memtest_done)
+ seq_printf(m, "EarlyMemtestBad:   %5lu kB\n",
+ (unsigned long) (
+ ((early_memtest_bad_size>0) && (early_memtest_bad_size>>10 <= 0))
+ ? 1
+ : early_memtest_bad_size>>10));
+#endif
+
 #ifdef CONFIG_MEMORY_FAILURE
  seq_printf(m, "HardwareCorrupted: %5lu kB\n",
    atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10));
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 50ad19662..b206b2d9d 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -597,6 +597,8 @@ extern int hashdist; /* Distribute hashes across NUMA
nodes? */
 #endif

 #ifdef CONFIG_MEMTEST
+extern phys_addr_t early_memtest_bad_size; /* Size of faulty ram found by
memtest */
+extern int early_memtest_done; /* How many memtest passes were done? */
 extern void early_memtest(phys_addr_t start, phys_addr_t end);
 #else
 static inline void early_memtest(phys_addr_t start, phys_addr_t end)
diff --git a/mm/memtest.c b/mm/memtest.c
index f53ace709..f8e9edebf 100644
--- a/mm/memtest.c
+++ b/mm/memtest.c
@@ -4,6 +4,9 @@
 #include <linux/init.h>
 #include <linux/memblock.h>

+int early_memtest_done = 0;
+phys_addr_t early_memtest_bad_size = 0;
+
 static u64 patterns[] __initdata = {
  /* The first entry has to be 0 to leave memtest with zeroed memory */
  0,
@@ -30,6 +33,7 @@ static void __init reserve_bad_mem(u64 pattern,
phys_addr_t start_bad, phys_addr
  pr_info("  %016llx bad mem addr %pa - %pa reserved\n",
  cpu_to_be64(pattern), &start_bad, &end_bad);
  memblock_reserve(start_bad, end_bad - start_bad);
+ early_memtest_bad_size += (end_bad - start_bad);
 }

 static void __init memtest(u64 pattern, phys_addr_t start_phys,
phys_addr_t size)
@@ -78,6 +82,7 @@ static void __init do_one_pass(u64 pattern, phys_addr_t
start, phys_addr_t end)
  memtest(pattern, this_start, this_end - this_start);
  }
  }
+ early_memtest_done++;
 }

 /* default is disabled */
-- 
2.40.0

[-- Attachment #2: Type: text/html, Size: 3602 bytes --]

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] Add results of early memtest to /proc/meminfo
  2023-03-17 19:30 [PATCH] Add results of early memtest to /proc/meminfo Tomáš Mudruňka
@ 2023-03-17 23:56 ` Andrew Morton
  2023-03-21 10:34   ` [PATCH v2] " Tomas Mudrunka
  2023-03-21 10:58   ` Re: [PATCH] " Tomas Mudrunka
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2023-03-17 23:56 UTC (permalink / raw)
  To: Tomáš Mudruňka
  Cc: Mike Rapoport, linux-kernel, linux-fsdevel, linux-mm

On Fri, 17 Mar 2023 20:30:01 +0100 Tomáš Mudruňka <tomas.mudrunka@gmail.com> wrote:

> Currently the memtest results were only presented in dmesg.
> This adds /proc/meminfo entry which can be easily used by scripts.
> 

/proc/meminfo is documented in Documentation/filesystems/proc.rst,
please.

meminfo is rather top-level and important.  Is this data sufficiently
important to justify a place there?

Please describe the value.  The use-case(s).  Why would people want
this?

> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -6,6 +6,7 @@
>  #include <linux/hugetlb.h>
>  #include <linux/mman.h>
>  #include <linux/mmzone.h>
> +#include <linux/memblock.h>
>  #include <linux/proc_fs.h>
>  #include <linux/percpu.h>
>  #include <linux/seq_file.h>
> @@ -131,6 +132,18 @@ static int meminfo_proc_show(struct seq_file *m, void
> *v)
>   show_val_kb(m, "VmallocChunk:   ", 0ul);
>   show_val_kb(m, "Percpu:         ", pcpu_nr_pages());
> 
> +#ifdef CONFIG_MEMTEST
> + /* Only show 0 Bad memory when test was actually run.
> + * Make sure bad regions smaller than 1kB are not reported as 0.
> + * That way when 0 is reported we can be sure there actually was
> successful test */

Comment layout is unconventional.

> + if (early_memtest_done)
> + seq_printf(m, "EarlyMemtestBad:   %5lu kB\n",
> + (unsigned long) (
> + ((early_memtest_bad_size>0) && (early_memtest_bad_size>>10 <= 0))
> + ? 1
> + : early_memtest_bad_size>>10));

Coding style is unconventional (white spaces).

I expect this code would look much cleaner if some temporaries were used.

	if (early_memtest_done) {
		unsigned long size = 1;
		long sz =  early_memtest_bad_size  >> 10;

		if (early_memtest_bad_size > 0 && sz <= 0)
			size = sz;
		seq_printf(m, "EarlyMemtestBad:   %5lu kB\n", size)
	}

(or something like that, I didn't try hard)

I don't understand this logic anyway.  Why not just print the value of
early_memtest_bad_size>>10 and be done with it.


> +extern int early_memtest_done; /* How many memtest passes were done? */

The name implies a bool, but the comment says otherwise.

> start, phys_addr_t end)
>   memtest(pattern, this_start, this_end - this_start);
>   }
>   }
> + early_memtest_done++;

It's a counter, but it's used as a boolean.  Why not make it bool, and do

	early_memtest_done = true;

here?

Also, your email client is replacing tabs with spaces.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] Add results of early memtest to /proc/meminfo
  2023-03-17 23:56 ` Andrew Morton
@ 2023-03-21 10:34   ` Tomas Mudrunka
  2023-03-21 20:01     ` Andrew Morton
  2023-03-21 10:58   ` Re: [PATCH] " Tomas Mudrunka
  1 sibling, 1 reply; 5+ messages in thread
From: Tomas Mudrunka @ 2023-03-21 10:34 UTC (permalink / raw)
  To: akpm
  Cc: linux-fsdevel, linux-kernel, linux-mm, rppt, linux-doc, corbet,
	tomas.mudrunka

Currently the memtest results were only presented in dmesg.
This adds /proc/meminfo entry which can be easily used by scripts.

Signed-off-by: Tomas Mudrunka <tomas.mudrunka@gmail.com>
---
 Documentation/filesystems/proc.rst |  8 ++++++++
 fs/proc/meminfo.c                  | 13 +++++++++++++
 include/linux/memblock.h           |  2 ++
 mm/memtest.c                       |  6 ++++++
 4 files changed, 29 insertions(+)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 9d5fd9424..8740362f3 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -996,6 +996,7 @@ Example output. You may not have all of these fields.
     VmallocUsed:       40444 kB
     VmallocChunk:          0 kB
     Percpu:            29312 kB
+    EarlyMemtestBad:       0 kB
     HardwareCorrupted:     0 kB
     AnonHugePages:   4149248 kB
     ShmemHugePages:        0 kB
@@ -1146,6 +1147,13 @@ VmallocChunk
 Percpu
               Memory allocated to the percpu allocator used to back percpu
               allocations. This stat excludes the cost of metadata.
+EarlyMemtestBad
+              The amount of RAM/memory in kB, that was identified as corrupted
+              by early memtest. If memtest was not run, this field will not
+              be displayed at all. Size is never rounded down to 0 kB.
+              That means if 0 kB is reported, you can safely assume
+              there was at least one pass of memtest and none of the passes
+              found a single faulty byte of RAM.
 HardwareCorrupted
               The amount of RAM/memory in KB, the kernel identifies as
               corrupted.
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 440960110..b43d0bd42 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -6,6 +6,7 @@
 #include <linux/hugetlb.h>
 #include <linux/mman.h>
 #include <linux/mmzone.h>
+#include <linux/memblock.h>
 #include <linux/proc_fs.h>
 #include <linux/percpu.h>
 #include <linux/seq_file.h>
@@ -131,6 +132,18 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "VmallocChunk:   ", 0ul);
 	show_val_kb(m, "Percpu:         ", pcpu_nr_pages());
 
+#ifdef CONFIG_MEMTEST
+	if (early_memtest_done) {
+		unsigned long early_memtest_bad_size_kb;
+
+		early_memtest_bad_size_kb = early_memtest_bad_size>>10;
+		if (early_memtest_bad_size && !early_memtest_bad_size_kb)
+			early_memtest_bad_size_kb = 1;
+		/* When 0 is reported, it means there actually was a successful test */
+		seq_printf(m, "EarlyMemtestBad:   %5lu kB\n", early_memtest_bad_size_kb);
+	}
+#endif
+
 #ifdef CONFIG_MEMORY_FAILURE
 	seq_printf(m, "HardwareCorrupted: %5lu kB\n",
 		   atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10));
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 50ad19662..f82ee3fac 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -597,6 +597,8 @@ extern int hashdist;		/* Distribute hashes across NUMA nodes? */
 #endif
 
 #ifdef CONFIG_MEMTEST
+extern phys_addr_t early_memtest_bad_size;	/* Size of faulty ram found by memtest */
+extern bool early_memtest_done;			/* Was early memtest done? */
 extern void early_memtest(phys_addr_t start, phys_addr_t end);
 #else
 static inline void early_memtest(phys_addr_t start, phys_addr_t end)
diff --git a/mm/memtest.c b/mm/memtest.c
index f53ace709..57149dfee 100644
--- a/mm/memtest.c
+++ b/mm/memtest.c
@@ -4,6 +4,9 @@
 #include <linux/init.h>
 #include <linux/memblock.h>
 
+bool early_memtest_done;
+phys_addr_t early_memtest_bad_size;
+
 static u64 patterns[] __initdata = {
 	/* The first entry has to be 0 to leave memtest with zeroed memory */
 	0,
@@ -30,6 +33,7 @@ static void __init reserve_bad_mem(u64 pattern, phys_addr_t start_bad, phys_addr
 	pr_info("  %016llx bad mem addr %pa - %pa reserved\n",
 		cpu_to_be64(pattern), &start_bad, &end_bad);
 	memblock_reserve(start_bad, end_bad - start_bad);
+	early_memtest_bad_size += (end_bad - start_bad);
 }
 
 static void __init memtest(u64 pattern, phys_addr_t start_phys, phys_addr_t size)
@@ -61,6 +65,8 @@ static void __init memtest(u64 pattern, phys_addr_t start_phys, phys_addr_t size
 	}
 	if (start_bad)
 		reserve_bad_mem(pattern, start_bad, last_bad + incr);
+
+	early_memtest_done = true;
 }
 
 static void __init do_one_pass(u64 pattern, phys_addr_t start, phys_addr_t end)
-- 
2.40.0



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Re: [PATCH] Add results of early memtest to /proc/meminfo
  2023-03-17 23:56 ` Andrew Morton
  2023-03-21 10:34   ` [PATCH v2] " Tomas Mudrunka
@ 2023-03-21 10:58   ` Tomas Mudrunka
  1 sibling, 0 replies; 5+ messages in thread
From: Tomas Mudrunka @ 2023-03-21 10:58 UTC (permalink / raw)
  To: akpm
  Cc: linux-fsdevel, linux-kernel, linux-mm, rppt, linux-doc, corbet,
	tomas.mudrunka

> meminfo is rather top-level and important.  Is this data sufficiently
> important to justify a place there?

There is already "HardwareCorrupted" which is similar, but for
errors reported by ECC, so it makes sense to have both in single place.

> Please describe the value.  The use-case(s).  Why would people want
> this?

When running large fleet of devices without ECC RAM it's currently not
easy to do bulk monitoring for memory corruption. You have to parse dmesg,
but that's ring buffer so the error might disappear after some time.
In general i do not consider dmesg to be great API to query RAM status.

In several companies i've seen such errors remain undetected and cause
issues for way too long. So i think it makes sense to provide monitoring
API, so that we can safely detect and act upon them.

> Comment layout is unconventional.

Fixed in PATCH v2

> Coding style is unconventional (white spaces).
> I expect this code would look much cleaner if some temporaries were used.

Fixed in PATCH v2

> I don't understand this logic anyway.  Why not just print the value of
> early_memtest_bad_size>>10 and be done with it.

I think 0 should be reported only when there was no error found at all.
If memtest detect 256 corrupt bytes it would report 0 kB without such logic.
Rounding down to 0 like that is not a good idea in my opinion,
because it will hide the fact something is wrong with the RAM in such case.
Therefore i've added logic that prevents rounding down to 0.

> The name implies a bool, but the comment says otherwise.

Fixed in PATCH v2

> It's a counter, but it's used as a boolean.  Why not make it bool, and do

Fixed in PATCH v2

> Also, your email client is replacing tabs with spaces.

Fixed in PATCH v2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] Add results of early memtest to /proc/meminfo
  2023-03-21 10:34   ` [PATCH v2] " Tomas Mudrunka
@ 2023-03-21 20:01     ` Andrew Morton
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2023-03-21 20:01 UTC (permalink / raw)
  To: Tomas Mudrunka
  Cc: linux-fsdevel, linux-kernel, linux-mm, rppt, linux-doc, corbet

On Tue, 21 Mar 2023 11:34:30 +0100 Tomas Mudrunka <tomas.mudrunka@gmail.com> wrote:

> Currently the memtest results were only presented in dmesg.
> This adds /proc/meminfo entry which can be easily used by scripts.

Looks good to me, thanks.  But the changelog still doesn't explain why
we should make this change.  I grabbed that from your other email and
used the below as the changelog:


: Currently the memtest results were only presented in dmesg.
: 
: When running a large fleet of devices without ECC RAM it's currently not
: easy to do bulk monitoring for memory corruption.  You have to parse
: dmesg, but that's a ring buffer so the error might disappear after some
: time.  In general I do not consider dmesg to be a great API to query RAM
: status.
: 
: In several companies I've seen such errors remain undetected and cause
: issues for way too long.  So I think it makes sense to provide a monitoring
: API, so that we can safely detect and act upon them.
: 
: This adds /proc/meminfo entry which can be easily used by scripts.



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-21 20:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-17 19:30 [PATCH] Add results of early memtest to /proc/meminfo Tomáš Mudruňka
2023-03-17 23:56 ` Andrew Morton
2023-03-21 10:34   ` [PATCH v2] " Tomas Mudrunka
2023-03-21 20:01     ` Andrew Morton
2023-03-21 10:58   ` Re: [PATCH] " Tomas Mudrunka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).