* [PATCH -next v4 0/7] arm64: add machine check safe support
@ 2022-04-20  3:04 ` Tong Tiangen
  0 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

With the increase in memory capacity and density, the probability of
memory errors also increases. The growing size and density of server RAM
in data centers and the cloud has led to a higher rate of uncorrectable
memory errors.

Currently, the kernel has a mechanism to recover from hardware memory
errors. This patchset provides a new recovery mechanism.

For arm64, hardware memory errors are handled by do_sea(), which
distinguishes two cases:
 1. The memory error is consumed in user mode: the solution is to kill
    the user process and isolate the error page.
 2. The memory error is consumed in kernel mode: the solution is to panic.

For case 2, an undifferentiated panic may not be the optimal choice, and
it can be handled better. In some scenarios the panic can be avoided;
uaccess is one example: if a uaccess operation fails due to a memory
error, only the user process is affected, so killing the user process and
isolating the user page with the hardware memory error is a better choice.
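
As a reference, the check added by patch 3 of this series makes that
decision as follows (condensed here from the diff later in this thread;
a sketch, not the full hunk):

    /* Condensed from patch 3, arch/arm64/mm/fault.c (sketch) */
    static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
                                    struct pt_regs *regs, int sig, int code)
    {
            if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
                    return false;   /* feature not built in: keep panicking */
            if (user_mode(regs) || !current->mm)
                    return false;   /* only kernel faults on behalf of a user process */
            if (apei_claim_sea(regs) < 0 || !fixup_exception_mc(regs))
                    return false;   /* error not claimed or no MC-safe fixup: panic */
            set_thread_esr(0, esr);
            arm64_force_sig_fault(sig, code, addr,
                    "Uncorrected hardware memory error in kernel-access\n");
            return true;            /* recovered: the user process is killed instead */
    }

do_sea() then only falls back to arm64_notify_die() when this helper
returns false.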

This patchset can be divided into three parts:
 1. Patch 0/1/4  - make some minor fixes to the associated code.
 2. Patch 3      - arm64: add support for the machine check safe framework.
 3. Patch 5/6/7  - arm64: add uaccess and cow to machine check safe.

Since V4:
 1. According to Robin's suggestion, directly modify user_ldst and
    user_ldp in asm-uaccess.h and modify mte.S.
 2. Add a new macro USER_MC in asm-uaccess.h, used in copy_from_user.S
    and copy_to_user.S.
 3. According to Robin's suggestion, use macros in copy_page_mc.S to
    simplify the code.
 4. According to Kefeng's suggestion, modify the powerpc code in patch 1.
 5. According to Kefeng's suggestion, modify mm/extable.c and apply some
    code optimizations.

Since V3:
 1. According to Mark's suggestion, all uaccess can now be recovered from
    memory errors.
 2. The pagecache-reading scenario is also supported as part of uaccess
    (copy_to_user()), and the code duplication problem is solved.
    Thanks to Robin for the suggestion.
 3. According to Mark's suggestion, update the commit message of patch 2/5.
 4. According to Borislav's suggestion, update the commit message of patch 1/5.

Since V2:
 1. Consistent with PPC/x86, use CONFIG_ARCH_HAS_COPY_MC instead of
    ARM64_UCE_KERNEL_RECOVERY.
 2. Add two new scenarios: cow and pagecache reading.
 3. Fix two small bugs (the first two patches).

V1 is here:
https://lore.kernel.org/lkml/20220323033705.3966643-1-tongtiangen@huawei.com/

Robin Murphy (1):
  arm64: mte: Clean up user tag accessors

Tong Tiangen (6):
  x86, powerpc: fix function define in copy_mc_to_user
  arm64: fix types in copy_highpage()
  arm64: add support for machine check error safe
  arm64: add copy_{to, from}_user to machine check safe
  arm64: add {get, put}_user to machine check safe
  arm64: add cow to machine check safe

 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/asm-extable.h | 33 +++++++++++
 arch/arm64/include/asm/asm-uaccess.h | 15 +++--
 arch/arm64/include/asm/extable.h     |  1 +
 arch/arm64/include/asm/page.h        | 10 ++++
 arch/arm64/include/asm/uaccess.h     |  4 +-
 arch/arm64/lib/Makefile              |  2 +
 arch/arm64/lib/copy_from_user.S      | 18 +++---
 arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
 arch/arm64/lib/copy_to_user.S        | 18 +++---
 arch/arm64/lib/mte.S                 |  4 +-
 arch/arm64/mm/copypage.c             | 36 ++++++++++--
 arch/arm64/mm/extable.c              | 33 +++++++++++
 arch/arm64/mm/fault.c                | 27 ++++++++-
 arch/powerpc/include/asm/uaccess.h   |  1 +
 arch/x86/include/asm/uaccess.h       |  1 +
 include/linux/highmem.h              |  8 +++
 include/linux/uaccess.h              |  9 +++
 mm/memory.c                          |  2 +-
 19 files changed, 278 insertions(+), 31 deletions(-)
 create mode 100644 arch/arm64/lib/copy_page_mc.S

-- 
2.25.1


* [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

x86/powerpc have their own implementations of copy_mc_to_user(), but they
do not use a #define to declare them.

This may cause problems. For example, if another architecture selects
CONFIG_ARCH_HAS_COPY_MC but copy_mc_to_user() also needs to be usable
outside the architecture code, the following fallback is added to
include/linux/uaccess.h:

    #ifndef copy_mc_to_user
    static inline unsigned long __must_check
    copy_mc_to_user(void *dst, const void *src, size_t cnt)
    {
	    ...
    }
    #endif

Then this generic definition will conflict with the x86/powerpc
implementations and cause compilation errors.
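
For illustration, this is how the `#define name name` convention avoids
the conflict: the architecture's #define hides the generic fallback (a
condensed sketch combining this patch with the uaccess.h hunk from a later
patch in this series, not literal kernel code):

    /* arch header (x86/powerpc), after this patch */
    unsigned long __must_check
    copy_mc_to_user(void *to, const void *from, unsigned len);
    #define copy_mc_to_user copy_mc_to_user  /* signals that an override exists */

    /* include/linux/uaccess.h (added later in this series) */
    #ifndef copy_mc_to_user                  /* skipped when the arch defines it */
    static inline unsigned long __must_check
    copy_mc_to_user(void *dst, const void *src, size_t cnt)
    {
            check_object_size(src, cnt, true);
            return raw_copy_to_user(dst, src, cnt);
    }
    #endif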

Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/powerpc/include/asm/uaccess.h | 1 +
 arch/x86/include/asm/uaccess.h     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 9b82b38ff867..58dbe8e2e318 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -358,6 +358,7 @@ copy_mc_to_user(void __user *to, const void *from, unsigned long n)
 
 	return n;
 }
+#define copy_mc_to_user copy_mc_to_user
 #endif
 
 extern long __copy_from_user_flushcache(void *dst, const void __user *src,
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index f78e2b3501a1..e18c5f098025 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -415,6 +415,7 @@ copy_mc_to_kernel(void *to, const void *from, unsigned len);
 
 unsigned long __must_check
 copy_mc_to_user(void *to, const void *from, unsigned len);
+#define copy_mc_to_user copy_mc_to_user
 #endif
 
 /*
-- 
2.25.1


* [PATCH -next v4 2/7] arm64: fix types in copy_highpage()
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

In copy_highpage() the `kto` and `kfrom` local variables are pointers to
struct page, but these are used to hold arbitrary pointers to kernel
memory. Each call to page_address() returns a void pointer to memory
associated with the relevant page, and copy_page() expects void pointers
to this memory.

This inconsistency was introduced in commit 2563776b41c3 ("arm64: mte:
Tags-aware copy_{user_,}highpage() implementations") and while this
doesn't appear to be harmful in practice it is clearly wrong.

Correct this by making `kto` and `kfrom` void pointers.
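
For reference, page_address() hands back a plain void pointer; a sketch of
the declaration (quoted from memory of include/linux/mm.h, not from this
series):

    void *page_address(const struct page *page); /* kernel virtual address of the page */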

Fixes: 2563776b41c3 ("arm64: mte: Tags-aware copy_{user_,}highpage() implementations")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/mm/copypage.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index b5447e53cd73..0dea80bf6de4 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -16,8 +16,8 @@
 
 void copy_highpage(struct page *to, struct page *from)
 {
-	struct page *kto = page_address(to);
-	struct page *kfrom = page_address(from);
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
 
 	copy_page(kto, kfrom);
 
-- 
2.25.1


* [PATCH -next v4 3/7] arm64: add support for machine check error safe
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

During the handling of arm64 kernel hardware memory errors (do_sea()), if
the error is consumed in the kernel, the current behaviour is to panic.
However, this is not optimal.

Take uaccess for example: if the uaccess operation fails due to a memory
error, only the user process is affected, so killing the user process and
isolating the user page with the hardware memory error is a better choice.

This patch only enables the machine check safe framework: it adds an
exception fixup before the kernel panic in do_sea(), limited to hardware
memory errors consumed in kernel mode but triggered by user-mode
processes. If the fixup succeeds, the panic can be avoided.

Consistent with PPC/x86, this is implemented via CONFIG_ARCH_HAS_COPY_MC.

Also add copy_mc_to_user() to include/linux/uaccess.h; this helper is
used when CONFIG_ARCH_HAS_COPY_MC is enabled.
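
As an illustration of the intended calling convention (a hypothetical
caller, not part of this patch; like copy_to_user(), the helper returns
the number of bytes not copied):

    /* Hypothetical caller, for illustration only */
    unsigned long left = copy_mc_to_user(ubuf, kbuf, len);

    if (left)
            return -EFAULT; /* 'left' bytes were not copied */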

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/Kconfig               |  1 +
 arch/arm64/include/asm/extable.h |  1 +
 arch/arm64/mm/extable.c          | 17 +++++++++++++++++
 arch/arm64/mm/fault.c            | 27 ++++++++++++++++++++++++++-
 include/linux/uaccess.h          |  9 +++++++++
 5 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d9325dd95eba..012e38309955 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -19,6 +19,7 @@ config ARM64
 	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
 	select ARCH_HAS_CACHE_LINE_SIZE
+	select ARCH_HAS_COPY_MC if ACPI_APEI_GHES
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEBUG_VM_PGTABLE
diff --git a/arch/arm64/include/asm/extable.h b/arch/arm64/include/asm/extable.h
index 72b0e71cc3de..f80ebd0addfd 100644
--- a/arch/arm64/include/asm/extable.h
+++ b/arch/arm64/include/asm/extable.h
@@ -46,4 +46,5 @@ bool ex_handler_bpf(const struct exception_table_entry *ex,
 #endif /* !CONFIG_BPF_JIT */
 
 bool fixup_exception(struct pt_regs *regs);
+bool fixup_exception_mc(struct pt_regs *regs);
 #endif
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 489455309695..4f0083a550d4 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -9,6 +9,7 @@
 
 #include <asm/asm-extable.h>
 #include <asm/ptrace.h>
+#include <asm/esr.h>
 
 static inline unsigned long
 get_ex_fixup(const struct exception_table_entry *ex)
@@ -84,3 +85,19 @@ bool fixup_exception(struct pt_regs *regs)
 
 	BUG();
 }
+
+bool fixup_exception_mc(struct pt_regs *regs)
+{
+	const struct exception_table_entry *ex;
+
+	ex = search_exception_tables(instruction_pointer(regs));
+	if (!ex)
+		return false;
+
+	/*
+	 * This is not complete, More Machine check safe extable type can
+	 * be processed here.
+	 */
+
+	return false;
+}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 77341b160aca..a9e6fb1999d1 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -695,6 +695,29 @@ static int do_bad(unsigned long far, unsigned int esr, struct pt_regs *regs)
 	return 1; /* "fault" */
 }
 
+static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
+				     struct pt_regs *regs, int sig, int code)
+{
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
+		return false;
+
+	if (user_mode(regs) || !current->mm)
+		return false;
+
+	if (apei_claim_sea(regs) < 0)
+		return false;
+
+	if (!fixup_exception_mc(regs))
+		return false;
+
+	set_thread_esr(0, esr);
+
+	arm64_force_sig_fault(sig, code, addr,
+		"Uncorrected hardware memory error in kernel-access\n");
+
+	return true;
+}
+
 static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs)
 {
 	const struct fault_info *inf;
@@ -720,7 +743,9 @@ static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs)
 		 */
 		siaddr  = untagged_addr(far);
 	}
-	arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
+
+	if (!arm64_do_kernel_sea(siaddr, esr, regs, inf->sig, inf->code))
+		arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
 
 	return 0;
 }
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 546179418ffa..884661b29c17 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -174,6 +174,15 @@ copy_mc_to_kernel(void *dst, const void *src, size_t cnt)
 }
 #endif
 
+#ifndef copy_mc_to_user
+static inline unsigned long __must_check
+copy_mc_to_user(void *dst, const void *src, size_t cnt)
+{
+	check_object_size(src, cnt, true);
+	return raw_copy_to_user(dst, src, cnt);
+}
+#endif
+
 static __always_inline void pagefault_disabled_inc(void)
 {
 	current->pagefault_disabled++;
-- 
2.25.1


* [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

Add machine check safe support to copy_{to, from}_user().

If the copy fails due to a hardware memory error, only the relevant
process is affected, so killing the user process and isolating the user
page with the hardware memory error is a more reasonable choice than a
kernel panic.

Add a new extable type, EX_TYPE_UACCESS_MC, which can be used for uaccess
that can be recovered from hardware memory errors.

The x16 register is used to pass the fixup type to the copy_xxx_user()
fixup code that uses extable type EX_TYPE_UACCESS_MC.
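
In C-like terms, the fixup label added to the copy routines below behaves
roughly as follows (an illustrative-only model, not kernel code; the names
mirror the .req aliases in the assembly):

    /* Illustrative model of the 9997 fixup path (not kernel code) */
    #define FIXUP_TYPE_NORMAL       0
    #define FIXUP_TYPE_MC           1

    static unsigned long fixup_model(unsigned long fixup_type, unsigned long dst,
                                     unsigned long dstin, unsigned long end)
    {
            if (fixup_type == FIXUP_TYPE_MC)        /* x16, set by ex_handler_uaccess_type() */
                    return end - dst;               /* memory error: never retry the access */
            if (dst == dstin) {
                    /* normal fault before any progress: the real code tries
                     * to copy one more byte here before giving up */
            }
            return end - dst;                       /* bytes not copied */
    }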

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
 arch/arm64/include/asm/asm-uaccess.h | 15 ++++++++++-----
 arch/arm64/lib/copy_from_user.S      | 18 +++++++++++-------
 arch/arm64/lib/copy_to_user.S        | 18 +++++++++++-------
 arch/arm64/mm/extable.c              | 18 ++++++++++++++----
 5 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index c39f2437e08e..75b2c00e9523 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -2,12 +2,18 @@
 #ifndef __ASM_ASM_EXTABLE_H
 #define __ASM_ASM_EXTABLE_H
 
+#define FIXUP_TYPE_NORMAL		0
+#define FIXUP_TYPE_MC			1
+
 #define EX_TYPE_NONE			0
 #define EX_TYPE_FIXUP			1
 #define EX_TYPE_BPF			2
 #define EX_TYPE_UACCESS_ERR_ZERO	3
 #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD	4
 
+/* _MC indicates that can fixup from machine check errors */
+#define EX_TYPE_UACCESS_MC		5
+
 #ifdef __ASSEMBLY__
 
 #define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
@@ -27,6 +33,14 @@
 	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0)
 	.endm
 
+/*
+ * Create an exception table entry for `insn`, which will branch to `fixup`
+ * when an unhandled fault(include sea fault) is taken.
+ */
+	.macro          _asm_extable_uaccess_mc, insn, fixup
+	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
+	.endm
+
 /*
  * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
  * do nothing.
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
index 0557af834e03..6c23c138e1fc 100644
--- a/arch/arm64/include/asm/asm-uaccess.h
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -63,6 +63,11 @@ alternative_else_nop_endif
 9999:	x;					\
 	_asm_extable	9999b, l
 
+
+#define USER_MC(l, x...)			\
+9999:	x;					\
+	_asm_extable_uaccess_mc	9999b, l
+
 /*
  * Generate the assembly for LDTR/STTR with exception table entries.
  * This is complicated as there is no post-increment or pair versions of the
@@ -73,8 +78,8 @@ alternative_else_nop_endif
 8889:		ldtr	\reg2, [\addr, #8];
 		add	\addr, \addr, \post_inc;
 
-		_asm_extable	8888b,\l;
-		_asm_extable	8889b,\l;
+		_asm_extable_uaccess_mc	8888b, \l;
+		_asm_extable_uaccess_mc	8889b, \l;
 	.endm
 
 	.macro user_stp l, reg1, reg2, addr, post_inc
@@ -82,14 +87,14 @@ alternative_else_nop_endif
 8889:		sttr	\reg2, [\addr, #8];
 		add	\addr, \addr, \post_inc;
 
-		_asm_extable	8888b,\l;
-		_asm_extable	8889b,\l;
+		_asm_extable_uaccess_mc	8888b,\l;
+		_asm_extable_uaccess_mc	8889b,\l;
 	.endm
 
 	.macro user_ldst l, inst, reg, addr, post_inc
 8888:		\inst		\reg, [\addr];
 		add		\addr, \addr, \post_inc;
 
-		_asm_extable	8888b,\l;
+		_asm_extable_uaccess_mc	8888b, \l;
 	.endm
 #endif
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 34e317907524..480cc5ac0a8d 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -25,7 +25,7 @@
 	.endm
 
 	.macro strb1 reg, ptr, val
-	strb \reg, [\ptr], \val
+	USER_MC(9998f, strb \reg, [\ptr], \val)
 	.endm
 
 	.macro ldrh1 reg, ptr, val
@@ -33,7 +33,7 @@
 	.endm
 
 	.macro strh1 reg, ptr, val
-	strh \reg, [\ptr], \val
+	USER_MC(9998f, strh \reg, [\ptr], \val)
 	.endm
 
 	.macro ldr1 reg, ptr, val
@@ -41,7 +41,7 @@
 	.endm
 
 	.macro str1 reg, ptr, val
-	str \reg, [\ptr], \val
+	USER_MC(9998f, str \reg, [\ptr], \val)
 	.endm
 
 	.macro ldp1 reg1, reg2, ptr, val
@@ -49,11 +49,12 @@
 	.endm
 
 	.macro stp1 reg1, reg2, ptr, val
-	stp \reg1, \reg2, [\ptr], \val
+	USER_MC(9998f, stp \reg1, \reg2, [\ptr], \val)
 	.endm
 
-end	.req	x5
-srcin	.req	x15
+end		.req	x5
+srcin		.req	x15
+fixup_type	.req	x16
 SYM_FUNC_START(__arch_copy_from_user)
 	add	end, x0, x2
 	mov	srcin, x1
@@ -62,7 +63,10 @@ SYM_FUNC_START(__arch_copy_from_user)
 	ret
 
 	// Exception fixups
-9997:	cmp	dst, dstin
+	// x16: fixup type written by ex_handler_uaccess_mc
+9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
+	b.eq	9998f
+	cmp	dst, dstin
 	b.ne	9998f
 	// Before being absolutely sure we couldn't copy anything, try harder
 USER(9998f, ldtrb tmp1w, [srcin])
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 802231772608..021a7d27b3a4 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -20,7 +20,7 @@
  *	x0 - bytes not copied
  */
 	.macro ldrb1 reg, ptr, val
-	ldrb  \reg, [\ptr], \val
+	USER_MC(9998f, ldrb  \reg, [\ptr], \val)
 	.endm
 
 	.macro strb1 reg, ptr, val
@@ -28,7 +28,7 @@
 	.endm
 
 	.macro ldrh1 reg, ptr, val
-	ldrh  \reg, [\ptr], \val
+	USER_MC(9998f, ldrh  \reg, [\ptr], \val)
 	.endm
 
 	.macro strh1 reg, ptr, val
@@ -36,7 +36,7 @@
 	.endm
 
 	.macro ldr1 reg, ptr, val
-	ldr \reg, [\ptr], \val
+	USER_MC(9998f, ldr \reg, [\ptr], \val)
 	.endm
 
 	.macro str1 reg, ptr, val
@@ -44,15 +44,16 @@
 	.endm
 
 	.macro ldp1 reg1, reg2, ptr, val
-	ldp \reg1, \reg2, [\ptr], \val
+	USER_MC(9998f, ldp \reg1, \reg2, [\ptr], \val)
 	.endm
 
 	.macro stp1 reg1, reg2, ptr, val
 	user_stp 9997f, \reg1, \reg2, \ptr, \val
 	.endm
 
-end	.req	x5
-srcin	.req	x15
+end		.req	x5
+srcin		.req	x15
+fixup_type	.req	x16
 SYM_FUNC_START(__arch_copy_to_user)
 	add	end, x0, x2
 	mov	srcin, x1
@@ -61,7 +62,10 @@ SYM_FUNC_START(__arch_copy_to_user)
 	ret
 
 	// Exception fixups
-9997:	cmp	dst, dstin
+	// x16: fixup type written by ex_handler_uaccess_mc
+9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
+	b.eq	9998f
+	cmp	dst, dstin
 	b.ne	9998f
 	// Before being absolutely sure we couldn't copy anything, try harder
 	ldrb	tmp1w, [srcin]
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 4f0083a550d4..525876c3ebf4 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -24,6 +24,14 @@ static bool ex_handler_fixup(const struct exception_table_entry *ex,
 	return true;
 }
 
+static bool ex_handler_uaccess_type(const struct exception_table_entry *ex,
+			     struct pt_regs *regs,
+			     unsigned long fixup_type)
+{
+	regs->regs[16] = fixup_type;
+	return ex_handler_fixup(ex, regs);
+}
+
 static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
 					struct pt_regs *regs)
 {
@@ -75,6 +83,8 @@ bool fixup_exception(struct pt_regs *regs)
 	switch (ex->type) {
 	case EX_TYPE_FIXUP:
 		return ex_handler_fixup(ex, regs);
+	case EX_TYPE_UACCESS_MC:
+		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_NORMAL);
 	case EX_TYPE_BPF:
 		return ex_handler_bpf(ex, regs);
 	case EX_TYPE_UACCESS_ERR_ZERO:
@@ -94,10 +104,10 @@ bool fixup_exception_mc(struct pt_regs *regs)
 	if (!ex)
 		return false;
 
-	/*
-	 * This is not complete, More Machine check safe extable type can
-	 * be processed here.
-	 */
+	switch (ex->type) {
+	case EX_TYPE_UACCESS_MC:
+		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
+	}
 
 	return false;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
@ 2022-04-20  3:04   ` Tong Tiangen
  0 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

Add copy_{to, from}_user() to machine check safe.

If copy fail due to hardware memory error, only the relevant processes are
affected, so killing the user process and isolate the user page with
hardware memory errors is a more reasonable choice than kernel panic.

Add new extable type EX_TYPE_UACCESS_MC which can be used for uaccess that
can be recovered from hardware memory errors.

The x16 register is used to save the fixup type in copy_xxx_user which
used extable type EX_TYPE_UACCESS_MC.

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
 arch/arm64/include/asm/asm-uaccess.h | 15 ++++++++++-----
 arch/arm64/lib/copy_from_user.S      | 18 +++++++++++-------
 arch/arm64/lib/copy_to_user.S        | 18 +++++++++++-------
 arch/arm64/mm/extable.c              | 18 ++++++++++++++----
 5 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index c39f2437e08e..75b2c00e9523 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -2,12 +2,18 @@
 #ifndef __ASM_ASM_EXTABLE_H
 #define __ASM_ASM_EXTABLE_H
 
+#define FIXUP_TYPE_NORMAL		0
+#define FIXUP_TYPE_MC			1
+
 #define EX_TYPE_NONE			0
 #define EX_TYPE_FIXUP			1
 #define EX_TYPE_BPF			2
 #define EX_TYPE_UACCESS_ERR_ZERO	3
 #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD	4
 
+/* _MC indicates that can fixup from machine check errors */
+#define EX_TYPE_UACCESS_MC		5
+
 #ifdef __ASSEMBLY__
 
 #define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
@@ -27,6 +33,14 @@
 	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0)
 	.endm
 
+/*
+ * Create an exception table entry for `insn`, which will branch to `fixup`
+ * when an unhandled fault(include sea fault) is taken.
+ */
+	.macro          _asm_extable_uaccess_mc, insn, fixup
+	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
+	.endm
+
 /*
  * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
  * do nothing.
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
index 0557af834e03..6c23c138e1fc 100644
--- a/arch/arm64/include/asm/asm-uaccess.h
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -63,6 +63,11 @@ alternative_else_nop_endif
 9999:	x;					\
 	_asm_extable	9999b, l
 
+
+#define USER_MC(l, x...)			\
+9999:	x;					\
+	_asm_extable_uaccess_mc	9999b, l
+
 /*
  * Generate the assembly for LDTR/STTR with exception table entries.
  * This is complicated as there is no post-increment or pair versions of the
@@ -73,8 +78,8 @@ alternative_else_nop_endif
 8889:		ldtr	\reg2, [\addr, #8];
 		add	\addr, \addr, \post_inc;
 
-		_asm_extable	8888b,\l;
-		_asm_extable	8889b,\l;
+		_asm_extable_uaccess_mc	8888b, \l;
+		_asm_extable_uaccess_mc	8889b, \l;
 	.endm
 
 	.macro user_stp l, reg1, reg2, addr, post_inc
@@ -82,14 +87,14 @@ alternative_else_nop_endif
 8889:		sttr	\reg2, [\addr, #8];
 		add	\addr, \addr, \post_inc;
 
-		_asm_extable	8888b,\l;
-		_asm_extable	8889b,\l;
+		_asm_extable_uaccess_mc	8888b,\l;
+		_asm_extable_uaccess_mc	8889b,\l;
 	.endm
 
 	.macro user_ldst l, inst, reg, addr, post_inc
 8888:		\inst		\reg, [\addr];
 		add		\addr, \addr, \post_inc;
 
-		_asm_extable	8888b,\l;
+		_asm_extable_uaccess_mc	8888b, \l;
 	.endm
 #endif
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 34e317907524..480cc5ac0a8d 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -25,7 +25,7 @@
 	.endm
 
 	.macro strb1 reg, ptr, val
-	strb \reg, [\ptr], \val
+	USER_MC(9998f, strb \reg, [\ptr], \val)
 	.endm
 
 	.macro ldrh1 reg, ptr, val
@@ -33,7 +33,7 @@
 	.endm
 
 	.macro strh1 reg, ptr, val
-	strh \reg, [\ptr], \val
+	USER_MC(9998f, strh \reg, [\ptr], \val)
 	.endm
 
 	.macro ldr1 reg, ptr, val
@@ -41,7 +41,7 @@
 	.endm
 
 	.macro str1 reg, ptr, val
-	str \reg, [\ptr], \val
+	USER_MC(9998f, str \reg, [\ptr], \val)
 	.endm
 
 	.macro ldp1 reg1, reg2, ptr, val
@@ -49,11 +49,12 @@
 	.endm
 
 	.macro stp1 reg1, reg2, ptr, val
-	stp \reg1, \reg2, [\ptr], \val
+	USER_MC(9998f, stp \reg1, \reg2, [\ptr], \val)
 	.endm
 
-end	.req	x5
-srcin	.req	x15
+end		.req	x5
+srcin		.req	x15
+fixup_type	.req	x16
 SYM_FUNC_START(__arch_copy_from_user)
 	add	end, x0, x2
 	mov	srcin, x1
@@ -62,7 +63,10 @@ SYM_FUNC_START(__arch_copy_from_user)
 	ret
 
 	// Exception fixups
-9997:	cmp	dst, dstin
+	// x16: fixup type written by ex_handler_uaccess_mc
+9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
+	b.eq	9998f
+	cmp	dst, dstin
 	b.ne	9998f
 	// Before being absolutely sure we couldn't copy anything, try harder
 USER(9998f, ldtrb tmp1w, [srcin])
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 802231772608..021a7d27b3a4 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -20,7 +20,7 @@
  *	x0 - bytes not copied
  */
 	.macro ldrb1 reg, ptr, val
-	ldrb  \reg, [\ptr], \val
+	USER_MC(9998f, ldrb  \reg, [\ptr], \val)
 	.endm
 
 	.macro strb1 reg, ptr, val
@@ -28,7 +28,7 @@
 	.endm
 
 	.macro ldrh1 reg, ptr, val
-	ldrh  \reg, [\ptr], \val
+	USER_MC(9998f, ldrh  \reg, [\ptr], \val)
 	.endm
 
 	.macro strh1 reg, ptr, val
@@ -36,7 +36,7 @@
 	.endm
 
 	.macro ldr1 reg, ptr, val
-	ldr \reg, [\ptr], \val
+	USER_MC(9998f, ldr \reg, [\ptr], \val)
 	.endm
 
 	.macro str1 reg, ptr, val
@@ -44,15 +44,16 @@
 	.endm
 
 	.macro ldp1 reg1, reg2, ptr, val
-	ldp \reg1, \reg2, [\ptr], \val
+	USER_MC(9998f, ldp \reg1, \reg2, [\ptr], \val)
 	.endm
 
 	.macro stp1 reg1, reg2, ptr, val
 	user_stp 9997f, \reg1, \reg2, \ptr, \val
 	.endm
 
-end	.req	x5
-srcin	.req	x15
+end		.req	x5
+srcin		.req	x15
+fixup_type	.req	x16
 SYM_FUNC_START(__arch_copy_to_user)
 	add	end, x0, x2
 	mov	srcin, x1
@@ -61,7 +62,10 @@ SYM_FUNC_START(__arch_copy_to_user)
 	ret
 
 	// Exception fixups
-9997:	cmp	dst, dstin
+	// x16: fixup type written by ex_handler_uaccess_type()
+9997:	cmp	fixup_type, #FIXUP_TYPE_MC
+	b.eq	9998f
+	cmp	dst, dstin
 	b.ne	9998f
 	// Before being absolutely sure we couldn't copy anything, try harder
 	ldrb	tmp1w, [srcin]
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 4f0083a550d4..525876c3ebf4 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -24,6 +24,14 @@ static bool ex_handler_fixup(const struct exception_table_entry *ex,
 	return true;
 }
 
+static bool ex_handler_uaccess_type(const struct exception_table_entry *ex,
+			     struct pt_regs *regs,
+			     unsigned long fixup_type)
+{
+	regs->regs[16] = fixup_type;
+	return ex_handler_fixup(ex, regs);
+}
+
 static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
 					struct pt_regs *regs)
 {
@@ -75,6 +83,8 @@ bool fixup_exception(struct pt_regs *regs)
 	switch (ex->type) {
 	case EX_TYPE_FIXUP:
 		return ex_handler_fixup(ex, regs);
+	case EX_TYPE_UACCESS_MC:
+		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_NORMAL);
 	case EX_TYPE_BPF:
 		return ex_handler_bpf(ex, regs);
 	case EX_TYPE_UACCESS_ERR_ZERO:
@@ -94,10 +104,10 @@ bool fixup_exception_mc(struct pt_regs *regs)
 	if (!ex)
 		return false;
 
-	/*
-	 * This is not complete, More Machine check safe extable type can
-	 * be processed here.
-	 */
+	switch (ex->type) {
+	case EX_TYPE_UACCESS_MC:
+		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
+	}
 
 	return false;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH -next v4 5/7] arm64: mte: Clean up user tag accessors
  2022-04-20  3:04 ` Tong Tiangen
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

From: Robin Murphy <robin.murphy@arm.com>

Invoking user_ldst to explicitly add a post-increment of 0 is silly.
Just use a normal USER() annotation and save the redundant instruction.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/lib/mte.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
index 8590af3c98c0..eeb9e45bcce8 100644
--- a/arch/arm64/lib/mte.S
+++ b/arch/arm64/lib/mte.S
@@ -93,7 +93,7 @@ SYM_FUNC_START(mte_copy_tags_from_user)
 	mov	x3, x1
 	cbz	x2, 2f
 1:
-	user_ldst 2f, ldtrb, w4, x1, 0
+USER(2f, ldtrb	w4, [x1])
 	lsl	x4, x4, #MTE_TAG_SHIFT
 	stg	x4, [x0], #MTE_GRANULE_SIZE
 	add	x1, x1, #1
@@ -120,7 +120,7 @@ SYM_FUNC_START(mte_copy_tags_to_user)
 1:
 	ldg	x4, [x1]
 	ubfx	x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE
-	user_ldst 2f, sttrb, w4, x0, 0
+USER(2f, sttrb	w4, [x0])
 	add	x0, x0, #1
 	add	x1, x1, #MTE_GRANULE_SIZE
 	subs	x2, x2, #1
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH -next v4 6/7] arm64: add {get, put}_user to machine check safe
  2022-04-20  3:04 ` Tong Tiangen
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

Add machine check safe support to {get, put}_user().

If a get/put fails due to a hardware memory error, only the relevant
process is affected, so killing the user process and isolating the user
page with hardware memory errors is a more reasonable choice than a
kernel panic.

Add a new extable type, EX_TYPE_UACCESS_MC_ERR_ZERO, which can be used for
uaccess that can be recovered from hardware memory errors. The difference
from EX_TYPE_UACCESS_MC is that this type additionally encodes two target
registers: one that receives the error code and one that is zeroed.
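
As an illustration, here is a minimal C sketch of what the fixup path does
with the err/zero encoding. It mirrors the existing
ex_handler_uaccess_err_zero() helper in arch/arm64/mm/extable.c; the field
masks live in asm-extable.h and are simplified away here:

	static bool uaccess_err_zero_sketch(const struct exception_table_entry *ex,
					    struct pt_regs *regs)
	{
		int reg_err  = FIELD_GET(EX_DATA_REG_ERR, ex->data);
		int reg_zero = FIELD_GET(EX_DATA_REG_ZERO, ex->data);

		/* report the failure through the err register ... */
		pt_regs_write_reg(regs, reg_err, -EFAULT);
		/* ... and zero the destination so no stale data is leaked */
		pt_regs_write_reg(regs, reg_zero, 0);

		/* resume at the fixup label recorded in the extable entry */
		regs->pc = get_ex_fixup(ex);
		return true;
	}

With this, a faulting __get_user() hands back -EFAULT and a zeroed value to
its caller regardless of whether the fault was an ordinary translation fault
or a hardware memory error; only the dispatch point (fixup_exception() vs
fixup_exception_mc()) differs.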

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
 arch/arm64/include/asm/uaccess.h     |  4 ++--
 arch/arm64/mm/extable.c              |  4 ++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index 75b2c00e9523..80410899a9ad 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -13,6 +13,7 @@
 
 /* _MC indicates that can fixup from machine check errors */
 #define EX_TYPE_UACCESS_MC		5
+#define EX_TYPE_UACCESS_MC_ERR_ZERO	6
 
 #ifdef __ASSEMBLY__
 
@@ -78,6 +79,15 @@
 #define EX_DATA_REG(reg, gpr)						\
 	"((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")"
 
+#define _ASM_EXTABLE_UACCESS_MC_ERR_ZERO(insn, fixup, err, zero)		\
+	__DEFINE_ASM_GPR_NUMS							\
+	__ASM_EXTABLE_RAW(#insn, #fixup,					\
+			  __stringify(EX_TYPE_UACCESS_MC_ERR_ZERO),		\
+			  "("							\
+			    EX_DATA_REG(ERR, err) " | "				\
+			    EX_DATA_REG(ZERO, zero)				\
+			  ")")
+
 #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero)		\
 	__DEFINE_ASM_GPR_NUMS						\
 	__ASM_EXTABLE_RAW(#insn, #fixup, 				\
@@ -90,6 +100,10 @@
 #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err)			\
 	_ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr)
 
+
+#define _ASM_EXTABLE_UACCESS_MC_ERR(insn, fixup, err)			\
+	_ASM_EXTABLE_UACCESS_MC_ERR_ZERO(insn, fixup, err, wzr)
+
 #define EX_DATA_REG_DATA_SHIFT	0
 #define EX_DATA_REG_DATA	GENMASK(4, 0)
 #define EX_DATA_REG_ADDR_SHIFT	5
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index e8dce0cc5eaa..e41b47df48b0 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -236,7 +236,7 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
 	asm volatile(							\
 	"1:	" load "	" reg "1, [%2]\n"			\
 	"2:\n"								\
-	_ASM_EXTABLE_UACCESS_ERR_ZERO(1b, 2b, %w0, %w1)			\
+	_ASM_EXTABLE_UACCESS_MC_ERR_ZERO(1b, 2b, %w0, %w1)		\
 	: "+r" (err), "=&r" (x)						\
 	: "r" (addr))
 
@@ -325,7 +325,7 @@ do {									\
 	asm volatile(							\
 	"1:	" store "	" reg "1, [%2]\n"			\
 	"2:\n"								\
-	_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)				\
+	_ASM_EXTABLE_UACCESS_MC_ERR(1b, 2b, %w0)			\
 	: "+r" (err)							\
 	: "r" (x), "r" (addr))
 
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 525876c3ebf4..1023ccdb2f89 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -88,6 +88,7 @@ bool fixup_exception(struct pt_regs *regs)
 	case EX_TYPE_BPF:
 		return ex_handler_bpf(ex, regs);
 	case EX_TYPE_UACCESS_ERR_ZERO:
+	case EX_TYPE_UACCESS_MC_ERR_ZERO:
 		return ex_handler_uaccess_err_zero(ex, regs);
 	case EX_TYPE_LOAD_UNALIGNED_ZEROPAD:
 		return ex_handler_load_unaligned_zeropad(ex, regs);
@@ -107,6 +108,9 @@ bool fixup_exception_mc(struct pt_regs *regs)
 	switch (ex->type) {
 	case EX_TYPE_UACCESS_MC:
 		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
+	case EX_TYPE_UACCESS_MC_ERR_ZERO:
+		return ex_handler_uaccess_err_zero(ex, regs);
+
 	}
 
 	return false;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH -next v4 7/7] arm64: add cow to machine check safe
  2022-04-20  3:04 ` Tong Tiangen
@ 2022-04-20  3:04   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

During cow (copy-on-write) processing, the data of the user process is
copied. When a hardware memory error is encountered during the copy, only
the relevant process is affected, so killing the user process and isolating
the user page with hardware memory errors is a more reasonable choice than
a kernel panic.

Add a new helper, copy_page_mc(), which provides a machine check safe page
copy implementation. At present it is only used for cow, but it can be
extended to more scenarios in the future: as long as the consequences of a
page copy failure are not fatal (e.g. only a user process is affected),
this helper can be used.

The copy_page_mc() in copy_page_mc.S largely borrows from copy_page() in
copy_page.S; the main difference is that copy_page_mc() adds an extable
entry to every load/store instruction to support machine check safe. This
is done largely to keep the patch simple; if needed, those optimizations
can be folded in later.

Add a new extable type, EX_TYPE_COPY_PAGE_MC, which is used in copy_page_mc().

This type is only processed in fixup_exception_mc(). The reason is that
copy_page_mc() is identical to copy_page() except that machine check safe
is considered, and copy_page() does not need any exception fixup.
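
A minimal sketch of how the generic fallback is wired. The #define below is
the one this patch adds to include/linux/highmem.h; cow_copy_sketch() is a
hypothetical caller, shown only to illustrate the call path that
cow_user_page() takes after this patch:

	/*
	 * Architectures that do not provide __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
	 * fall back to the ordinary helper, so behaviour is unchanged for them.
	 */
	#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
	#define copy_user_highpage_mc copy_user_highpage
	#endif

	static inline void cow_copy_sketch(struct page *dst, struct page *src,
					   unsigned long addr,
					   struct vm_area_struct *vma)
	{
		/* on arm64 this ends up in copy_page_mc() */
		copy_user_highpage_mc(dst, src, addr, vma);
	}

On arm64 every load/store in copy_page_mc() carries an EX_TYPE_COPY_PAGE_MC
extable entry, so a hardware memory error during the copy is fixed up by
fixup_exception_mc() instead of panicking the kernel.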

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/include/asm/asm-extable.h |  5 ++
 arch/arm64/include/asm/page.h        | 10 ++++
 arch/arm64/lib/Makefile              |  2 +
 arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
 arch/arm64/mm/copypage.c             | 36 ++++++++++--
 arch/arm64/mm/extable.c              |  2 +
 include/linux/highmem.h              |  8 +++
 mm/memory.c                          |  2 +-
 8 files changed, 144 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/lib/copy_page_mc.S

diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index 80410899a9ad..74c056ddae15 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -14,6 +14,7 @@
 /* _MC indicates that can fixup from machine check errors */
 #define EX_TYPE_UACCESS_MC		5
 #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
+#define EX_TYPE_COPY_PAGE_MC		7
 
 #ifdef __ASSEMBLY__
 
@@ -42,6 +43,10 @@
 	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
 	.endm
 
+	.macro          _asm_extable_copy_page_mc, insn, fixup
+	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
+	.endm
+
 /*
  * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
  * do nothing.
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 993a27ea6f54..832571a7dddb 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
 void copy_highpage(struct page *to, struct page *from);
 #define __HAVE_ARCH_COPY_HIGHPAGE
 
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+extern void copy_page_mc(void *to, const void *from);
+void copy_highpage_mc(struct page *to, struct page *from);
+#define __HAVE_ARCH_COPY_HIGHPAGE_MC
+
+void copy_user_highpage_mc(struct page *to, struct page *from,
+		unsigned long vaddr, struct vm_area_struct *vma);
+#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
+#endif
+
 struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
 						unsigned long vaddr);
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 29490be2546b..0d9f292ef68a 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -15,6 +15,8 @@ endif
 
 lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
 
+lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
+
 obj-$(CONFIG_CRC32) += crc32.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
new file mode 100644
index 000000000000..655161363dcf
--- /dev/null
+++ b/arch/arm64/lib/copy_page_mc.S
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/assembler.h>
+#include <asm/page.h>
+#include <asm/cpufeature.h>
+#include <asm/alternative.h>
+#include <asm/asm-extable.h>
+
+#define CPY_MC(l, x...)		\
+9999:   x;			\
+	_asm_extable_copy_page_mc    9999b, l
+
+/*
+ * Copy a page from src to dest (both are page aligned), machine check safe
+ *
+ * Parameters:
+ *	x0 - dest
+ *	x1 - src
+ */
+SYM_FUNC_START(__pi_copy_page_mc)
+alternative_if ARM64_HAS_NO_HW_PREFETCH
+	// Prefetch three cache lines ahead.
+	prfm	pldl1strm, [x1, #128]
+	prfm	pldl1strm, [x1, #256]
+	prfm	pldl1strm, [x1, #384]
+alternative_else_nop_endif
+
+CPY_MC(9998f, ldp	x2, x3, [x1])
+CPY_MC(9998f, ldp	x4, x5, [x1, #16])
+CPY_MC(9998f, ldp	x6, x7, [x1, #32])
+CPY_MC(9998f, ldp	x8, x9, [x1, #48])
+CPY_MC(9998f, ldp	x10, x11, [x1, #64])
+CPY_MC(9998f, ldp	x12, x13, [x1, #80])
+CPY_MC(9998f, ldp	x14, x15, [x1, #96])
+CPY_MC(9998f, ldp	x16, x17, [x1, #112])
+
+	add	x0, x0, #256
+	add	x1, x1, #128
+1:
+	tst	x0, #(PAGE_SIZE - 1)
+
+alternative_if ARM64_HAS_NO_HW_PREFETCH
+	prfm	pldl1strm, [x1, #384]
+alternative_else_nop_endif
+
+CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
+CPY_MC(9998f, ldp	x2, x3, [x1])
+CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
+CPY_MC(9998f, ldp	x4, x5, [x1, #16])
+CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
+CPY_MC(9998f, ldp	x6, x7, [x1, #32])
+CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
+CPY_MC(9998f, ldp	x8, x9, [x1, #48])
+CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
+CPY_MC(9998f, ldp	x10, x11, [x1, #64])
+CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
+CPY_MC(9998f, ldp	x12, x13, [x1, #80])
+CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
+CPY_MC(9998f, ldp	x14, x15, [x1, #96])
+CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
+CPY_MC(9998f, ldp	x16, x17, [x1, #112])
+
+	add	x0, x0, #128
+	add	x1, x1, #128
+
+	b.ne	1b
+
+CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
+CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
+CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
+CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
+CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
+CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
+CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
+CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
+
+9998:	ret
+
+SYM_FUNC_END(__pi_copy_page_mc)
+SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
+EXPORT_SYMBOL(copy_page_mc)
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 0dea80bf6de4..0f28edfcb234 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -14,13 +14,8 @@
 #include <asm/cpufeature.h>
 #include <asm/mte.h>
 
-void copy_highpage(struct page *to, struct page *from)
+static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
 {
-	void *kto = page_address(to);
-	void *kfrom = page_address(from);
-
-	copy_page(kto, kfrom);
-
 	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
 		set_bit(PG_mte_tagged, &to->flags);
 		page_kasan_tag_reset(to);
@@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
 		mte_copy_page_tags(kto, kfrom);
 	}
 }
+
+void copy_highpage(struct page *to, struct page *from)
+{
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
+
+	copy_page(kto, kfrom);
+	do_mte(to, from, kto, kfrom);
+}
 EXPORT_SYMBOL(copy_highpage);
 
 void copy_user_highpage(struct page *to, struct page *from,
@@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
 	flush_dcache_page(to);
 }
 EXPORT_SYMBOL_GPL(copy_user_highpage);
+
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+void copy_highpage_mc(struct page *to, struct page *from)
+{
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
+
+	copy_page_mc(kto, kfrom);
+	do_mte(to, from, kto, kfrom);
+}
+EXPORT_SYMBOL(copy_highpage_mc);
+
+void copy_user_highpage_mc(struct page *to, struct page *from,
+			unsigned long vaddr, struct vm_area_struct *vma)
+{
+	copy_highpage_mc(to, from);
+	flush_dcache_page(to);
+}
+EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
+#endif
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 1023ccdb2f89..4c882d36dd64 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
 		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
 	case EX_TYPE_UACCESS_MC_ERR_ZERO:
 		return ex_handler_uaccess_err_zero(ex, regs);
+	case EX_TYPE_COPY_PAGE_MC:
+		return ex_handler_fixup(ex, regs);
 
 	}
 
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 39bb9b47fa9c..a9dbf331b038 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
+#define copy_user_highpage_mc copy_user_highpage
+#endif
+
 #ifndef __HAVE_ARCH_COPY_HIGHPAGE
 
 static inline void copy_highpage(struct page *to, struct page *from)
@@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
 
 #endif
 
+#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
+#define copy_highpage_mc copy_highpage
+#endif
+
 static inline void memcpy_page(struct page *dst_page, size_t dst_off,
 			       struct page *src_page, size_t src_off,
 			       size_t len)
diff --git a/mm/memory.c b/mm/memory.c
index 76e3af9639d9..d5f62234152d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;
 
 	if (likely(src)) {
-		copy_user_highpage(dst, src, addr, vma);
+		copy_user_highpage_mc(dst, src, addr, vma);
 		return true;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH -next v4 7/7] arm64: add cow to machine check safe
@ 2022-04-20  3:04   ` Tong Tiangen
  0 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

In the cow(copy on write) processing, the data of the user process is
copied, when hardware memory error is encountered during copy, only the
relevant processes are affected, so killing the user process and isolate
the user page with hardware memory errors is a more reasonable choice than
kernel panic.

Add new helper copy_page_mc() which provide a page copy implementation with
machine check safe. At present, only used in cow. In future, we can expand
more scenes. As long as the consequences of page copy failure are not
fatal(eg: only affect user process), we can use this helper.

The copy_page_mc() in copy_page_mc.S is largely borrows from copy_page()
in copy_page.S and the main difference is copy_page_mc() add extable entry
to every load/store insn to support machine check safe. largely to keep the
patch simple. If needed those optimizations can be folded in.

Add new extable type EX_TYPE_COPY_PAGE_MC which used in copy_page_mc().

This type only be processed in fixup_exception_mc(), The reason is that
copy_page_mc() is consistent with copy_page() except machine check safe is
considered, and copy_page() do not need to consider exception fixup.

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/include/asm/asm-extable.h |  5 ++
 arch/arm64/include/asm/page.h        | 10 ++++
 arch/arm64/lib/Makefile              |  2 +
 arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
 arch/arm64/mm/copypage.c             | 36 ++++++++++--
 arch/arm64/mm/extable.c              |  2 +
 include/linux/highmem.h              |  8 +++
 mm/memory.c                          |  2 +-
 8 files changed, 144 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/lib/copy_page_mc.S

diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index 80410899a9ad..74c056ddae15 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -14,6 +14,7 @@
 /* _MC indicates that can fixup from machine check errors */
 #define EX_TYPE_UACCESS_MC		5
 #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
+#define EX_TYPE_COPY_PAGE_MC		7
 
 #ifdef __ASSEMBLY__
 
@@ -42,6 +43,10 @@
 	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
 	.endm
 
+	.macro          _asm_extable_copy_page_mc, insn, fixup
+	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
+	.endm
+
 /*
  * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
  * do nothing.
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 993a27ea6f54..832571a7dddb 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
 void copy_highpage(struct page *to, struct page *from);
 #define __HAVE_ARCH_COPY_HIGHPAGE
 
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+extern void copy_page_mc(void *to, const void *from);
+void copy_highpage_mc(struct page *to, struct page *from);
+#define __HAVE_ARCH_COPY_HIGHPAGE_MC
+
+void copy_user_highpage_mc(struct page *to, struct page *from,
+		unsigned long vaddr, struct vm_area_struct *vma);
+#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
+#endif
+
 struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
 						unsigned long vaddr);
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 29490be2546b..0d9f292ef68a 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -15,6 +15,8 @@ endif
 
 lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
 
+lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
+
 obj-$(CONFIG_CRC32) += crc32.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
new file mode 100644
index 000000000000..655161363dcf
--- /dev/null
+++ b/arch/arm64/lib/copy_page_mc.S
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/assembler.h>
+#include <asm/page.h>
+#include <asm/cpufeature.h>
+#include <asm/alternative.h>
+#include <asm/asm-extable.h>
+
+#define CPY_MC(l, x...)		\
+9999:   x;			\
+	_asm_extable_copy_page_mc    9999b, l
+
+/*
+ * Copy a page from src to dest (both are page aligned) with machine check
+ *
+ * Parameters:
+ *	x0 - dest
+ *	x1 - src
+ */
+SYM_FUNC_START(__pi_copy_page_mc)
+alternative_if ARM64_HAS_NO_HW_PREFETCH
+	// Prefetch three cache lines ahead.
+	prfm	pldl1strm, [x1, #128]
+	prfm	pldl1strm, [x1, #256]
+	prfm	pldl1strm, [x1, #384]
+alternative_else_nop_endif
+
+CPY_MC(9998f, ldp	x2, x3, [x1])
+CPY_MC(9998f, ldp	x4, x5, [x1, #16])
+CPY_MC(9998f, ldp	x6, x7, [x1, #32])
+CPY_MC(9998f, ldp	x8, x9, [x1, #48])
+CPY_MC(9998f, ldp	x10, x11, [x1, #64])
+CPY_MC(9998f, ldp	x12, x13, [x1, #80])
+CPY_MC(9998f, ldp	x14, x15, [x1, #96])
+CPY_MC(9998f, ldp	x16, x17, [x1, #112])
+
+	add	x0, x0, #256
+	add	x1, x1, #128
+1:
+	tst	x0, #(PAGE_SIZE - 1)
+
+alternative_if ARM64_HAS_NO_HW_PREFETCH
+	prfm	pldl1strm, [x1, #384]
+alternative_else_nop_endif
+
+CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
+CPY_MC(9998f, ldp	x2, x3, [x1])
+CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
+CPY_MC(9998f, ldp	x4, x5, [x1, #16])
+CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
+CPY_MC(9998f, ldp	x6, x7, [x1, #32])
+CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
+CPY_MC(9998f, ldp	x8, x9, [x1, #48])
+CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
+CPY_MC(9998f, ldp	x10, x11, [x1, #64])
+CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
+CPY_MC(9998f, ldp	x12, x13, [x1, #80])
+CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
+CPY_MC(9998f, ldp	x14, x15, [x1, #96])
+CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
+CPY_MC(9998f, ldp	x16, x17, [x1, #112])
+
+	add	x0, x0, #128
+	add	x1, x1, #128
+
+	b.ne	1b
+
+CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
+CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
+CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
+CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
+CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
+CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
+CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
+CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
+
+9998:	ret
+
+SYM_FUNC_END(__pi_copy_page_mc)
+SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
+EXPORT_SYMBOL(copy_page_mc)
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 0dea80bf6de4..0f28edfcb234 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -14,13 +14,8 @@
 #include <asm/cpufeature.h>
 #include <asm/mte.h>
 
-void copy_highpage(struct page *to, struct page *from)
+static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
 {
-	void *kto = page_address(to);
-	void *kfrom = page_address(from);
-
-	copy_page(kto, kfrom);
-
 	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
 		set_bit(PG_mte_tagged, &to->flags);
 		page_kasan_tag_reset(to);
@@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
 		mte_copy_page_tags(kto, kfrom);
 	}
 }
+
+void copy_highpage(struct page *to, struct page *from)
+{
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
+
+	copy_page(kto, kfrom);
+	do_mte(to, from, kto, kfrom);
+}
 EXPORT_SYMBOL(copy_highpage);
 
 void copy_user_highpage(struct page *to, struct page *from,
@@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
 	flush_dcache_page(to);
 }
 EXPORT_SYMBOL_GPL(copy_user_highpage);
+
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+void copy_highpage_mc(struct page *to, struct page *from)
+{
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
+
+	copy_page_mc(kto, kfrom);
+	do_mte(to, from, kto, kfrom);
+}
+EXPORT_SYMBOL(copy_highpage_mc);
+
+void copy_user_highpage_mc(struct page *to, struct page *from,
+			unsigned long vaddr, struct vm_area_struct *vma)
+{
+	copy_highpage_mc(to, from);
+	flush_dcache_page(to);
+}
+EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
+#endif
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 1023ccdb2f89..4c882d36dd64 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
 		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
 	case EX_TYPE_UACCESS_MC_ERR_ZERO:
 		return ex_handler_uaccess_err_zero(ex, regs);
+	case EX_TYPE_COPY_PAGE_MC:
+		return ex_handler_fixup(ex, regs);
 
 	}
 
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 39bb9b47fa9c..a9dbf331b038 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
+#define copy_user_highpage_mc copy_user_highpage
+#endif
+
 #ifndef __HAVE_ARCH_COPY_HIGHPAGE
 
 static inline void copy_highpage(struct page *to, struct page *from)
@@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
 
 #endif
 
+#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
+#define cop_highpage_mc copy_highpage
+#endif
+
 static inline void memcpy_page(struct page *dst_page, size_t dst_off,
 			       struct page *src_page, size_t src_off,
 			       size_t len)
diff --git a/mm/memory.c b/mm/memory.c
index 76e3af9639d9..d5f62234152d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;
 
 	if (likely(src)) {
-		copy_user_highpage(dst, src, addr, vma);
+		copy_user_highpage_mc(dst, src, addr, vma);
 		return true;
 	}
 
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH -next v4 7/7] arm64: add cow to machine check safe
@ 2022-04-20  3:04   ` Tong Tiangen
  0 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-20  3:04 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: Kefeng Wang, Xie XiuQi, linux-kernel, linux-mm, Tong Tiangen,
	Guohanjun, linuxppc-dev, linux-arm-kernel

In the cow(copy on write) processing, the data of the user process is
copied, when hardware memory error is encountered during copy, only the
relevant processes are affected, so killing the user process and isolate
the user page with hardware memory errors is a more reasonable choice than
kernel panic.

Add new helper copy_page_mc() which provide a page copy implementation with
machine check safe. At present, only used in cow. In future, we can expand
more scenes. As long as the consequences of page copy failure are not
fatal(eg: only affect user process), we can use this helper.

The copy_page_mc() in copy_page_mc.S is largely borrows from copy_page()
in copy_page.S and the main difference is copy_page_mc() add extable entry
to every load/store insn to support machine check safe. largely to keep the
patch simple. If needed those optimizations can be folded in.

Add new extable type EX_TYPE_COPY_PAGE_MC which used in copy_page_mc().

This type only be processed in fixup_exception_mc(), The reason is that
copy_page_mc() is consistent with copy_page() except machine check safe is
considered, and copy_page() do not need to consider exception fixup.

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/include/asm/asm-extable.h |  5 ++
 arch/arm64/include/asm/page.h        | 10 ++++
 arch/arm64/lib/Makefile              |  2 +
 arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
 arch/arm64/mm/copypage.c             | 36 ++++++++++--
 arch/arm64/mm/extable.c              |  2 +
 include/linux/highmem.h              |  8 +++
 mm/memory.c                          |  2 +-
 8 files changed, 144 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/lib/copy_page_mc.S

diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index 80410899a9ad..74c056ddae15 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -14,6 +14,7 @@
 /* _MC indicates that can fixup from machine check errors */
 #define EX_TYPE_UACCESS_MC		5
 #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
+#define EX_TYPE_COPY_PAGE_MC		7
 
 #ifdef __ASSEMBLY__
 
@@ -42,6 +43,10 @@
 	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
 	.endm
 
+	.macro          _asm_extable_copy_page_mc, insn, fixup
+	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
+	.endm
+
 /*
  * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
  * do nothing.
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 993a27ea6f54..832571a7dddb 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
 void copy_highpage(struct page *to, struct page *from);
 #define __HAVE_ARCH_COPY_HIGHPAGE
 
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+extern void copy_page_mc(void *to, const void *from);
+void copy_highpage_mc(struct page *to, struct page *from);
+#define __HAVE_ARCH_COPY_HIGHPAGE_MC
+
+void copy_user_highpage_mc(struct page *to, struct page *from,
+		unsigned long vaddr, struct vm_area_struct *vma);
+#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
+#endif
+
 struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
 						unsigned long vaddr);
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 29490be2546b..0d9f292ef68a 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -15,6 +15,8 @@ endif
 
 lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
 
+lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
+
 obj-$(CONFIG_CRC32) += crc32.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
new file mode 100644
index 000000000000..655161363dcf
--- /dev/null
+++ b/arch/arm64/lib/copy_page_mc.S
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/assembler.h>
+#include <asm/page.h>
+#include <asm/cpufeature.h>
+#include <asm/alternative.h>
+#include <asm/asm-extable.h>
+
+#define CPY_MC(l, x...)		\
+9999:   x;			\
+	_asm_extable_copy_page_mc    9999b, l
+
+/*
+ * Copy a page from src to dest (both are page aligned) with machine check
+ *
+ * Parameters:
+ *	x0 - dest
+ *	x1 - src
+ */
+SYM_FUNC_START(__pi_copy_page_mc)
+alternative_if ARM64_HAS_NO_HW_PREFETCH
+	// Prefetch three cache lines ahead.
+	prfm	pldl1strm, [x1, #128]
+	prfm	pldl1strm, [x1, #256]
+	prfm	pldl1strm, [x1, #384]
+alternative_else_nop_endif
+
+CPY_MC(9998f, ldp	x2, x3, [x1])
+CPY_MC(9998f, ldp	x4, x5, [x1, #16])
+CPY_MC(9998f, ldp	x6, x7, [x1, #32])
+CPY_MC(9998f, ldp	x8, x9, [x1, #48])
+CPY_MC(9998f, ldp	x10, x11, [x1, #64])
+CPY_MC(9998f, ldp	x12, x13, [x1, #80])
+CPY_MC(9998f, ldp	x14, x15, [x1, #96])
+CPY_MC(9998f, ldp	x16, x17, [x1, #112])
+
+	add	x0, x0, #256
+	add	x1, x1, #128
+1:
+	tst	x0, #(PAGE_SIZE - 1)
+
+alternative_if ARM64_HAS_NO_HW_PREFETCH
+	prfm	pldl1strm, [x1, #384]
+alternative_else_nop_endif
+
+CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
+CPY_MC(9998f, ldp	x2, x3, [x1])
+CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
+CPY_MC(9998f, ldp	x4, x5, [x1, #16])
+CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
+CPY_MC(9998f, ldp	x6, x7, [x1, #32])
+CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
+CPY_MC(9998f, ldp	x8, x9, [x1, #48])
+CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
+CPY_MC(9998f, ldp	x10, x11, [x1, #64])
+CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
+CPY_MC(9998f, ldp	x12, x13, [x1, #80])
+CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
+CPY_MC(9998f, ldp	x14, x15, [x1, #96])
+CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
+CPY_MC(9998f, ldp	x16, x17, [x1, #112])
+
+	add	x0, x0, #128
+	add	x1, x1, #128
+
+	b.ne	1b
+
+CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
+CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
+CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
+CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
+CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
+CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
+CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
+CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
+
+9998:	ret
+
+SYM_FUNC_END(__pi_copy_page_mc)
+SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
+EXPORT_SYMBOL(copy_page_mc)
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 0dea80bf6de4..0f28edfcb234 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -14,13 +14,8 @@
 #include <asm/cpufeature.h>
 #include <asm/mte.h>
 
-void copy_highpage(struct page *to, struct page *from)
+static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
 {
-	void *kto = page_address(to);
-	void *kfrom = page_address(from);
-
-	copy_page(kto, kfrom);
-
 	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
 		set_bit(PG_mte_tagged, &to->flags);
 		page_kasan_tag_reset(to);
@@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
 		mte_copy_page_tags(kto, kfrom);
 	}
 }
+
+void copy_highpage(struct page *to, struct page *from)
+{
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
+
+	copy_page(kto, kfrom);
+	do_mte(to, from, kto, kfrom);
+}
 EXPORT_SYMBOL(copy_highpage);
 
 void copy_user_highpage(struct page *to, struct page *from,
@@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
 	flush_dcache_page(to);
 }
 EXPORT_SYMBOL_GPL(copy_user_highpage);
+
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+void copy_highpage_mc(struct page *to, struct page *from)
+{
+	void *kto = page_address(to);
+	void *kfrom = page_address(from);
+
+	copy_page_mc(kto, kfrom);
+	do_mte(to, from, kto, kfrom);
+}
+EXPORT_SYMBOL(copy_highpage_mc);
+
+void copy_user_highpage_mc(struct page *to, struct page *from,
+			unsigned long vaddr, struct vm_area_struct *vma)
+{
+	copy_highpage_mc(to, from);
+	flush_dcache_page(to);
+}
+EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
+#endif
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 1023ccdb2f89..4c882d36dd64 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
 		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
 	case EX_TYPE_UACCESS_MC_ERR_ZERO:
 		return ex_handler_uaccess_err_zero(ex, regs);
+	case EX_TYPE_COPY_PAGE_MC:
+		return ex_handler_fixup(ex, regs);
 
 	}
 
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 39bb9b47fa9c..a9dbf331b038 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
+#define copy_user_highpage_mc copy_user_highpage
+#endif
+
 #ifndef __HAVE_ARCH_COPY_HIGHPAGE
 
 static inline void copy_highpage(struct page *to, struct page *from)
@@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
 
 #endif
 
+#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
+#define copy_highpage_mc copy_highpage
+#endif
+
 static inline void memcpy_page(struct page *dst_page, size_t dst_off,
 			       struct page *src_page, size_t src_off,
 			       size_t len)
diff --git a/mm/memory.c b/mm/memory.c
index 76e3af9639d9..d5f62234152d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;
 
 	if (likely(src)) {
-		copy_user_highpage(dst, src, addr, vma);
+		copy_user_highpage_mc(dst, src, addr, vma);
 		return true;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user
  2022-04-20  3:04   ` Tong Tiangen
  (?)
@ 2022-04-22  9:45     ` Michael Ellerman
  -1 siblings, 0 replies; 96+ messages in thread
From: Michael Ellerman @ 2022-04-22  9:45 UTC (permalink / raw)
  To: Tong Tiangen, Mark Rutland, James Morse, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Robin Murphy,
	Dave Hansen, Catalin Marinas, Will Deacon, Alexander Viro,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun, Tong Tiangen

Tong Tiangen <tongtiangen@huawei.com> writes:
> x86/powerpc has it's implementation of copy_mc_to_user but not use #define
> to declare.
>
> This may cause problems, for example, if other architectures open
> CONFIG_ARCH_HAS_COPY_MC, but want to use copy_mc_to_user() outside the
> architecture, the code add to include/linux/uaddess.h is as follows:
>
>     #ifndef copy_mc_to_user
>     static inline unsigned long __must_check
>     copy_mc_to_user(void *dst, const void *src, size_t cnt)
>     {
> 	    ...
>     }
>     #endif
     
The above doesn't exist yet, you add it in patch 3, which is a little
confusing for a reader of this commit in isolation.

I think you could safely move that into this patch, and then this patch
would be ~= "Add generic fallback version of copy_mc_to_user()".

It's probably not worth doing a whole new version of the series just for
that, but if you need to do a new version for some other reason I think
it would be cleaner to introduce the fallback in this commit.
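
For illustration only, such a generic fallback could sit in
include/linux/uaccess.h roughly as below. This is a sketch, not the exact
hunk from patch 3; check_object_size() and raw_copy_to_user() are assumed
here as the plain, non-machine-check copy path:

    #ifndef copy_mc_to_user
    /*
     * Sketch of a generic fallback: with no machine-check-aware copy
     * available, fall back to an ordinary user copy.
     */
    static inline unsigned long __must_check
    copy_mc_to_user(void *dst, const void *src, size_t cnt)
    {
    	check_object_size(src, cnt, true);
    	return raw_copy_to_user(dst, src, cnt);
    }
    #endif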

> Then this definition will conflict with the implementation of x86/powerpc
> and cause compilation errors as follow:
>
> Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>  arch/powerpc/include/asm/uaccess.h | 1 +

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

cheers

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user
  2022-04-22  9:45     ` Michael Ellerman
  (?)
@ 2022-04-24  1:16       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-24  1:16 UTC (permalink / raw)
  To: Michael Ellerman, Mark Rutland, James Morse, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Robin Murphy,
	Dave Hansen, Catalin Marinas, Will Deacon, Alexander Viro,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/4/22 17:45, Michael Ellerman wrote:
> Tong Tiangen <tongtiangen@huawei.com> writes:
>> x86/powerpc has it's implementation of copy_mc_to_user but not use #define
>> to declare.
>>
>> This may cause problems, for example, if other architectures open
>> CONFIG_ARCH_HAS_COPY_MC, but want to use copy_mc_to_user() outside the
>> architecture, the code add to include/linux/uaddess.h is as follows:
>>
>>      #ifndef copy_mc_to_user
>>      static inline unsigned long __must_check
>>      copy_mc_to_user(void *dst, const void *src, size_t cnt)
>>      {
>> 	    ...
>>      }
>>      #endif
>       
> The above doesn't exist yet, you add it in patch 3, which is a little
> confusing for a reader of this commit in isolation.
> 
> I think you could safely move that into this patch, and then this patch
> would be ~= "Add generic fallback version of copy_mc_to_user()".
> 
> It's probably not worth doing a whole new version of the series just for
> that, but if you need to do a new version for some other reason I think
> it would be cleaner to introduce the fallback in this commit.
> 

Agreed, will do in next version.

Thanks,
Tong.

>> Then this definition will conflict with the implementation of x86/powerpc
>> and cause compilation errors as follow:
>>
>> Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/powerpc/include/asm/uaccess.h | 1 +
> 
> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
> 
> cheers
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 0/7]arm64: add machine check safe support
  2022-04-20  3:04 ` Tong Tiangen
  (?)
@ 2022-04-27  9:09   ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-04-27  9:09 UTC (permalink / raw)
  To: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Catalin Marinas, Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin
  Cc: linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

Hi Mark, James, Robin, kindly ping...

Thanks.

On 2022/4/20 11:04, Tong Tiangen wrote:
> With the increase of memory capacity and density, the probability of
> memory error increases. The increasing size and density of server RAM
> in the data center and cloud have shown increased uncorrectable memory
> errors.
> 
> Currently, the kernel has a mechanism to recover from hardware memory
> errors. This patchset provides an new recovery mechanism.
> 
> For arm64, the hardware memory error handling is do_sea() which divided
> into two cases:
>   1. The user state consumed the memory errors, the solution is kill the
>      user process and isolate the error page.
>   2. The kernel state consumed the memory errors, the solution is panic.
> 
> For case 2, Undifferentiated panic maybe not the optimal choice, it can be
> handled better, in some scenes, we can avoid panic, such as uaccess, if the
> uaccess fails due to memory error, only the user process will be affected,
> kill the user process and isolate the user page with hardware memory errors
> is a better choice.
> 
> This patchset can be divided into three parts:
>   1. Patch 0/1/4    - make some minor fixes to the associated code.
>   2. Patch 3      - arm64 add support for machine check safe framework.
>   3. Pathc 5/6/7  - arm64 add uaccess and cow to machine check safe.
> 
> Since V4:
>   1. According to Robin's suggestion, direct modify user_ldst and
>   user_ldp in asm-uaccess.h and modify mte.S.
>   2. Add new macro USER_MC in asm-uaccess.h, used in copy_from_user.S
>   and copy_to_user.S.
>   3. According to Robin's suggestion, using micro in copy_page_mc.S to
>   simplify code.
>   4. According to KeFeng's suggestion, modify powerpc code in patch1.
>   5. According to KeFeng's suggestion, modify mm/extable.c and some code
>   optimization.
> 
> Since V3:
>   1. According to Mark's suggestion, all uaccess can be recovered due to
>      memory error.
>   2. Scenario pagecache reading is also supported as part of uaccess
>      (copy_to_user()) and duplication code problem is also solved.
>      Thanks for Robin's suggestion.
>   3. According Mark's suggestion, update commit message of patch 2/5.
>   4. According Borisllav's suggestion, update commit message of patch 1/5.
> 
> Since V2:
>   1.Consistent with PPC/x86, Using CONFIG_ARCH_HAS_COPY_MC instead of
>     ARM64_UCE_KERNEL_RECOVERY.
>   2.Add two new scenes, cow and pagecache reading.
>   3.Fix two small bug(the first two patch).
> 
> V1 in here:
> https://lore.kernel.org/lkml/20220323033705.3966643-1-tongtiangen@huawei.com/
> 
> Robin Murphy (1):
>    arm64: mte: Clean up user tag accessors
> 
> Tong Tiangen (6):
>    x86, powerpc: fix function define in copy_mc_to_user
>    arm64: fix types in copy_highpage()
>    arm64: add support for machine check error safe
>    arm64: add copy_{to, from}_user to machine check safe
>    arm64: add {get, put}_user to machine check safe
>    arm64: add cow to machine check safe
> 
>   arch/arm64/Kconfig                   |  1 +
>   arch/arm64/include/asm/asm-extable.h | 33 +++++++++++
>   arch/arm64/include/asm/asm-uaccess.h | 15 +++--
>   arch/arm64/include/asm/extable.h     |  1 +
>   arch/arm64/include/asm/page.h        | 10 ++++
>   arch/arm64/include/asm/uaccess.h     |  4 +-
>   arch/arm64/lib/Makefile              |  2 +
>   arch/arm64/lib/copy_from_user.S      | 18 +++---
>   arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
>   arch/arm64/lib/copy_to_user.S        | 18 +++---
>   arch/arm64/lib/mte.S                 |  4 +-
>   arch/arm64/mm/copypage.c             | 36 ++++++++++--
>   arch/arm64/mm/extable.c              | 33 +++++++++++
>   arch/arm64/mm/fault.c                | 27 ++++++++-
>   arch/powerpc/include/asm/uaccess.h   |  1 +
>   arch/x86/include/asm/uaccess.h       |  1 +
>   include/linux/highmem.h              |  8 +++
>   include/linux/uaccess.h              |  9 +++
>   mm/memory.c                          |  2 +-
>   19 files changed, 278 insertions(+), 31 deletions(-)
>   create mode 100644 arch/arm64/lib/copy_page_mc.S
> 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user
  2022-04-20  3:04   ` Tong Tiangen
@ 2022-05-02 14:24     ` Christophe Leroy
  -1 siblings, 0 replies; 96+ messages in thread
From: Christophe Leroy @ 2022-05-02 14:24 UTC (permalink / raw)
  To: Tong Tiangen, Mark Rutland, James Morse, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Robin Murphy,
	Dave Hansen, Catalin Marinas, Will Deacon, Alexander Viro,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras, x86,
	H . Peter Anvin
  Cc: Kefeng Wang, Xie XiuQi, linux-kernel, linux-mm, Guohanjun,
	linuxppc-dev, linux-arm-kernel



On 20/04/2022 at 05:04, Tong Tiangen wrote:
> x86/powerpc has it's implementation of copy_mc_to_user but not use #define
> to declare.
> 
> This may cause problems, for example, if other architectures open
> CONFIG_ARCH_HAS_COPY_MC, but want to use copy_mc_to_user() outside the
> architecture, the code add to include/linux/uaddess.h is as follows:
> 
>      #ifndef copy_mc_to_user
>      static inline unsigned long __must_check
>      copy_mc_to_user(void *dst, const void *src, size_t cnt)
>      {
> 	    ...
>      }
>      #endif
> 
> Then this definition will conflict with the implementation of x86/powerpc
> and cause compilation errors as follow:
> 
> Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")

I don't understand, what does it fix really? What was the 
(existing/real) bug introduced by that patch that you are fixing?

If those defines had been expected and missing, we would have had a 
build failure. If you have one, can you describe it?

> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>   arch/powerpc/include/asm/uaccess.h | 1 +
>   arch/x86/include/asm/uaccess.h     | 1 +
>   2 files changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 9b82b38ff867..58dbe8e2e318 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -358,6 +358,7 @@ copy_mc_to_user(void __user *to, const void *from, unsigned long n)
>   
>   	return n;
>   }
> +#define copy_mc_to_user copy_mc_to_user
>   #endif
>   
>   extern long __copy_from_user_flushcache(void *dst, const void __user *src,
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index f78e2b3501a1..e18c5f098025 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -415,6 +415,7 @@ copy_mc_to_kernel(void *to, const void *from, unsigned len);
>   
>   unsigned long __must_check
>   copy_mc_to_user(void *to, const void *from, unsigned len);
> +#define copy_mc_to_user copy_mc_to_user
>   #endif
>   
>   /*

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user
  2022-05-02 14:24     ` Christophe Leroy
@ 2022-05-03  1:06       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-03  1:06 UTC (permalink / raw)
  To: Christophe Leroy, Mark Rutland, James Morse, Andrew Morton,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Robin Murphy,
	Dave Hansen, Catalin Marinas, Will Deacon, Alexander Viro,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras, x86,
	H . Peter Anvin
  Cc: Kefeng Wang, Xie XiuQi, linux-kernel, linux-mm, Guohanjun,
	linuxppc-dev, linux-arm-kernel



On 2022/5/2 22:24, Christophe Leroy wrote:
> 
> 
>> On 20/04/2022 at 05:04, Tong Tiangen wrote:
>> x86/powerpc has it's implementation of copy_mc_to_user but not use #define
>> to declare.
>>
>> This may cause problems, for example, if other architectures open
>> CONFIG_ARCH_HAS_COPY_MC, but want to use copy_mc_to_user() outside the
>> architecture, the code add to include/linux/uaddess.h is as follows:
>>
>>       #ifndef copy_mc_to_user
>>       static inline unsigned long __must_check
>>       copy_mc_to_user(void *dst, const void *src, size_t cnt)
>>       {
>> 	    ...
>>       }
>>       #endif
>>
>> Then this definition will conflict with the implementation of x86/powerpc
>> and cause compilation errors as follow:
>>
>> Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
> 
> I don't understand, what does it fix really ? What was the
> (existing/real) bug introduced by that patch and that your are fixing ?
> 
> If those defined had been expected and missing, we would have had a
> build failure. If you have one, can you describe it ?

There will be a build failure after patch 3 is added, which is a little
confusing for a reader of this commit in isolation.
In the next version, I will put this patch after patch 3.

Thanks,
Tong.
> 
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>    arch/powerpc/include/asm/uaccess.h | 1 +
>>    arch/x86/include/asm/uaccess.h     | 1 +
>>    2 files changed, 2 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
>> index 9b82b38ff867..58dbe8e2e318 100644
>> --- a/arch/powerpc/include/asm/uaccess.h
>> +++ b/arch/powerpc/include/asm/uaccess.h
>> @@ -358,6 +358,7 @@ copy_mc_to_user(void __user *to, const void *from, unsigned long n)
>>    
>>    	return n;
>>    }
>> +#define copy_mc_to_user copy_mc_to_user
>>    #endif
>>    
>>    extern long __copy_from_user_flushcache(void *dst, const void __user *src,
>> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
>> index f78e2b3501a1..e18c5f098025 100644
>> --- a/arch/x86/include/asm/uaccess.h
>> +++ b/arch/x86/include/asm/uaccess.h
>> @@ -415,6 +415,7 @@ copy_mc_to_kernel(void *to, const void *from, unsigned len);
>>    
>>    unsigned long __must_check
>>    copy_mc_to_user(void *to, const void *from, unsigned len);
>> +#define copy_mc_to_user copy_mc_to_user
>>    #endif
>>    
>>    /*

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
  2022-04-20  3:04   ` Tong Tiangen
  (?)
@ 2022-05-04 10:26     ` Catalin Marinas
  -1 siblings, 0 replies; 96+ messages in thread
From: Catalin Marinas @ 2022-05-04 10:26 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
> Add copy_{to, from}_user() to machine check safe.
> 
> If copy fail due to hardware memory error, only the relevant processes are
> affected, so killing the user process and isolate the user page with
> hardware memory errors is a more reasonable choice than kernel panic.

Just to make sure I understand - we can only recover if the fault is in
a user page. That is, for a copy_from_user(), we can only handle the
faults in the source address, not the destination.

> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
> index 34e317907524..480cc5ac0a8d 100644
> --- a/arch/arm64/lib/copy_from_user.S
> +++ b/arch/arm64/lib/copy_from_user.S
> @@ -25,7 +25,7 @@
>  	.endm
>  
>  	.macro strb1 reg, ptr, val
> -	strb \reg, [\ptr], \val
> +	USER_MC(9998f, strb \reg, [\ptr], \val)
>  	.endm

So if I got the above correctly, why do we need an exception table entry
for the store to the kernel address?

-- 
Catalin

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: (subset) [PATCH -next v4 0/7]arm64: add machine check safe support
  2022-04-20  3:04 ` Tong Tiangen
  (?)
@ 2022-05-04 19:58   ` Catalin Marinas
  -1 siblings, 0 replies; 96+ messages in thread
From: Catalin Marinas @ 2022-05-04 19:58 UTC (permalink / raw)
  To: Dave Hansen, Ingo Molnar, James Morse, Paul Mackerras,
	Robin Murphy, H . Peter Anvin, Benjamin Herrenschmidt,
	Thomas Gleixner, Michael Ellerman, Tong Tiangen, Will Deacon,
	Andrew Morton, Mark Rutland, Alexander Viro, x86,
	Borislav Petkov
  Cc: Kefeng Wang, linux-arm-kernel, linux-kernel, Xie XiuQi,
	Guohanjun, linux-mm, linuxppc-dev

On Wed, 20 Apr 2022 03:04:11 +0000, Tong Tiangen wrote:
> With the increase of memory capacity and density, the probability of
> memory error increases. The increasing size and density of server RAM
> in the data center and cloud have shown increased uncorrectable memory
> errors.
> 
> Currently, the kernel has a mechanism to recover from hardware memory
> errors. This patchset provides an new recovery mechanism.
> 
> [...]

Applied to arm64 (for-next/misc), thanks!

[2/7] arm64: fix types in copy_highpage()
      https://git.kernel.org/arm64/c/921d161f15d6

-- 
Catalin


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user
  2022-05-03  1:06       ` Tong Tiangen
@ 2022-05-05  1:21         ` Kefeng Wang
  -1 siblings, 0 replies; 96+ messages in thread
From: Kefeng Wang @ 2022-05-05  1:21 UTC (permalink / raw)
  To: Tong Tiangen, Christophe Leroy, Mark Rutland, James Morse,
	Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Robin Murphy, Dave Hansen, Catalin Marinas, Will Deacon,
	Alexander Viro, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, x86, H . Peter Anvin
  Cc: Xie XiuQi, linux-kernel, linux-mm, Guohanjun, linuxppc-dev,
	linux-arm-kernel


On 2022/5/3 9:06, Tong Tiangen wrote:
>
>
> On 2022/5/2 22:24, Christophe Leroy wrote:
>>
>>
>> Le 20/04/2022 à 05:04, Tong Tiangen a écrit :
>>> x86/powerpc has it's implementation of copy_mc_to_user but not use 
>>> #define
>>> to declare.
>>>
>>> This may cause problems, for example, if other architectures open
>>> CONFIG_ARCH_HAS_COPY_MC, but want to use copy_mc_to_user() outside the
>>> architecture, the code add to include/linux/uaddess.h is as follows:
>>>
>>>       #ifndef copy_mc_to_user
>>>       static inline unsigned long __must_check
>>>       copy_mc_to_user(void *dst, const void *src, size_t cnt)
>>>       {
>>>         ...
>>>       }
>>>       #endif
>>>
>>> Then this definition will conflict with the implementation of 
>>> x86/powerpc
>>> and cause compilation errors as follow:
>>>
>>> Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to 
>>> copy_mc_to_{user, kernel}()")
>>
>> I don't understand, what does it fix really ? What was the
>> (existing/real) bug introduced by that patch and that your are fixing ?
>>
>> If those defined had been expected and missing, we would have had a
>> build failure. If you have one, can you describe it ?
>
It could prevent future problems when patch 3 is introduced, and yes, for
now this patch won't fix any issue. We could drop the Fixes tag and update
the changelog.


> There will be build failure after patch 3 is added, there is a little
> confusing for a reader of this commit in isolation.
> In the next version, I will put this patch after patch 3.
This is an alternative.
>
> Thanks,
> Tong.
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
  2022-05-04 10:26     ` Catalin Marinas
  (?)
@ 2022-05-05  6:39       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-05  6:39 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/4 18:26, Catalin Marinas wrote:
> On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
>> Add copy_{to, from}_user() to machine check safe.
>>
>> If copy fail due to hardware memory error, only the relevant processes are
>> affected, so killing the user process and isolate the user page with
>> hardware memory errors is a more reasonable choice than kernel panic.
> 
> Just to make sure I understand - we can only recover if the fault is in
> a user page. That is, for a copy_from_user(), we can only handle the
> faults in the source address, not the destination.

At the beginning, I also thought we could only recover if the fault is in
a user page.
After a discussion with Mark [1], I think that no matter whether it is a user
page or a kernel page, as long as the access is triggered by a user process,
only the related processes will be affected. According to this understanding,
it seems that all uaccess can be recovered.

[1]https://patchwork.kernel.org/project/linux-arm-kernel/patch/20220406091311.3354723-6-tongtiangen@huawei.com/

Thanks,
Tong.

> 
>> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
>> index 34e317907524..480cc5ac0a8d 100644
>> --- a/arch/arm64/lib/copy_from_user.S
>> +++ b/arch/arm64/lib/copy_from_user.S
>> @@ -25,7 +25,7 @@
>>   	.endm
>>   
>>   	.macro strb1 reg, ptr, val
>> -	strb \reg, [\ptr], \val
>> +	USER_MC(9998f, strb \reg, [\ptr], \val)
>>   	.endm
> 
> So if I got the above correctly, why do we need an exception table entry
> for the store to the kernel address?
> 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
  2022-05-05  6:39       ` Tong Tiangen
  (?)
@ 2022-05-05 13:41         ` Catalin Marinas
  -1 siblings, 0 replies; 96+ messages in thread
From: Catalin Marinas @ 2022-05-05 13:41 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Thu, May 05, 2022 at 02:39:43PM +0800, Tong Tiangen wrote:
> On 2022/5/4 18:26, Catalin Marinas wrote:
> > On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
> > > Add copy_{to, from}_user() to machine check safe.
> > > 
> > > If copy fail due to hardware memory error, only the relevant processes are
> > > affected, so killing the user process and isolate the user page with
> > > hardware memory errors is a more reasonable choice than kernel panic.
> > 
> > Just to make sure I understand - we can only recover if the fault is in
> > a user page. That is, for a copy_from_user(), we can only handle the
> > faults in the source address, not the destination.
> 
> At the beginning, I also thought we can only recover if the fault is in a
> user page.
> After discussion with a Mark[1], I think no matter user page or kernel page,
> as long as it is triggered by the user process, only related processes will
> be affected. According to this
> understanding, it seems that all uaccess can be recovered.
> 
> [1]https://patchwork.kernel.org/project/linux-arm-kernel/patch/20220406091311.3354723-6-tongtiangen@huawei.com/

We can indeed safely skip this copy and return an error just like
pretending there was a user page fault. However, my point was more
around the "isolate the user page with hardware memory errors". If the
fault is on a kernel address, there's not much you can do about it. You'll
likely trigger it later when you try to access that address (maybe it
was freed and re-allocated). Do we hope we won't get the same error
again on that kernel address?

-- 
Catalin

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
  2022-05-05 13:41         ` Catalin Marinas
  (?)
@ 2022-05-05 14:33           ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-05 14:33 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mark Rutland, James Morse, Andrew Morton, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Robin Murphy, Dave Hansen,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/5 21:41, Catalin Marinas wrote:
> On Thu, May 05, 2022 at 02:39:43PM +0800, Tong Tiangen wrote:
>> On 2022/5/4 18:26, Catalin Marinas wrote:
>>> On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
>>>> Add copy_{to, from}_user() to machine check safe.
>>>>
>>>> If copy fail due to hardware memory error, only the relevant processes are
>>>> affected, so killing the user process and isolate the user page with
>>>> hardware memory errors is a more reasonable choice than kernel panic.
>>>
>>> Just to make sure I understand - we can only recover if the fault is in
>>> a user page. That is, for a copy_from_user(), we can only handle the
>>> faults in the source address, not the destination.
>>
>> At the beginning, I also thought we can only recover if the fault is in a
>> user page.
>> After discussion with a Mark[1], I think no matter user page or kernel page,
>> as long as it is triggered by the user process, only related processes will
>> be affected. According to this
>> understanding, it seems that all uaccess can be recovered.
>>
>> [1]https://patchwork.kernel.org/project/linux-arm-kernel/patch/20220406091311.3354723-6-tongtiangen@huawei.com/
> 
> We can indeed safely skip this copy and return an error just like
> pretending there was a user page fault. However, my point was more
> around the "isolate the user page with hardware memory errors". If the
> fault is on a kernel address, there's not much you can do about. You'll
> likely trigger it later when you try to access that address (maybe it
> was freed and re-allocated). Do we hope we won't get the same error
> again on that kernel address?

I think the page with the memory error will be isolated by memory_failure().
Generally, the isolation will succeed; if the isolation fails (we need to find
out why), then the same error may be triggered again later.

Thanks.

> 
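
As a point of reference, the isolation step I am referring to is the generic
memory-failure handling; roughly, the GHES code hands the poisoned PFN over
like this (a sketch of my reading of that path, with a made-up helper name,
not code from this series):

#include <linux/mm.h>

static void example_report_poisoned_pfn(unsigned long pfn)
{
	/*
	 * memory_failure_queue() schedules memory_failure(), which tries to
	 * unmap and isolate the page; MF_ACTION_REQUIRED marks the error as
	 * consumed rather than merely detected.
	 */
	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
}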

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 3/7] arm64: add support for machine check error safe
  2022-04-20  3:04   ` Tong Tiangen
  (?)
@ 2022-05-13 15:26     ` Mark Rutland
  -1 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-13 15:26 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Wed, Apr 20, 2022 at 03:04:14AM +0000, Tong Tiangen wrote:
> During the processing of arm64 kernel hardware memory errors(do_sea()), if
> the errors is consumed in the kernel, the current processing is panic.
> However, it is not optimal.
> 
> Take uaccess for example, if the uaccess operation fails due to memory
> error, only the user process will be affected, kill the user process
> and isolate the user page with hardware memory errors is a better choice.

Conceptually, I'm fine with the idea of constraining what we do for a
true uaccess, but I don't like the implementation of this at all, and I
think we first need to clean up the arm64 extable usage to clearly
distinguish a uaccess from another access.

> This patch only enable machine error check framework, it add exception
> fixup before kernel panic in do_sea() and only limit the consumption of
> hardware memory errors in kernel mode triggered by user mode processes.
> If fixup successful, panic can be avoided.
> 
> Consistent with PPC/x86, it is implemented by CONFIG_ARCH_HAS_COPY_MC.
> 
> Also add copy_mc_to_user() in include/linux/uaccess.h, this helper is
> called when CONFIG_ARCH_HAS_COPOY_MC is open.
> 
> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>  arch/arm64/Kconfig               |  1 +
>  arch/arm64/include/asm/extable.h |  1 +
>  arch/arm64/mm/extable.c          | 17 +++++++++++++++++
>  arch/arm64/mm/fault.c            | 27 ++++++++++++++++++++++++++-
>  include/linux/uaccess.h          |  9 +++++++++
>  5 files changed, 54 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index d9325dd95eba..012e38309955 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -19,6 +19,7 @@ config ARM64
>  	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
>  	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
>  	select ARCH_HAS_CACHE_LINE_SIZE
> +	select ARCH_HAS_COPY_MC if ACPI_APEI_GHES
>  	select ARCH_HAS_CURRENT_STACK_POINTER
>  	select ARCH_HAS_DEBUG_VIRTUAL
>  	select ARCH_HAS_DEBUG_VM_PGTABLE
> diff --git a/arch/arm64/include/asm/extable.h b/arch/arm64/include/asm/extable.h
> index 72b0e71cc3de..f80ebd0addfd 100644
> --- a/arch/arm64/include/asm/extable.h
> +++ b/arch/arm64/include/asm/extable.h
> @@ -46,4 +46,5 @@ bool ex_handler_bpf(const struct exception_table_entry *ex,
>  #endif /* !CONFIG_BPF_JIT */
>  
>  bool fixup_exception(struct pt_regs *regs);
> +bool fixup_exception_mc(struct pt_regs *regs);
>  #endif
> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
> index 489455309695..4f0083a550d4 100644
> --- a/arch/arm64/mm/extable.c
> +++ b/arch/arm64/mm/extable.c
> @@ -9,6 +9,7 @@
>  
>  #include <asm/asm-extable.h>
>  #include <asm/ptrace.h>
> +#include <asm/esr.h>
>  
>  static inline unsigned long
>  get_ex_fixup(const struct exception_table_entry *ex)
> @@ -84,3 +85,19 @@ bool fixup_exception(struct pt_regs *regs)
>  
>  	BUG();
>  }
> +
> +bool fixup_exception_mc(struct pt_regs *regs)
> +{
> +	const struct exception_table_entry *ex;
> +
> +	ex = search_exception_tables(instruction_pointer(regs));
> +	if (!ex)
> +		return false;
> +
> +	/*
> +	 * This is not complete, More Machine check safe extable type can
> +	 * be processed here.
> +	 */
> +
> +	return false;
> +}

This is at best misnamed; it doesn't actually apply the fixup, it just
searches for one.
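
For this patch taken on its own, that means the helper is equivalent to the
following (an illustrative reduction, not a suggested change; patch 4 later
fills in the switch over the extable types):

bool fixup_exception_mc(struct pt_regs *regs)
{
	/* no extable type is handled yet, so nothing is ever fixed up */
	return false;
}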

> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 77341b160aca..a9e6fb1999d1 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -695,6 +695,29 @@ static int do_bad(unsigned long far, unsigned int esr, struct pt_regs *regs)
>  	return 1; /* "fault" */
>  }
>  
> +static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
> +				     struct pt_regs *regs, int sig, int code)
> +{
> +	if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
> +		return false;
> +
> +	if (user_mode(regs) || !current->mm)
> +		return false;
> +
> +	if (apei_claim_sea(regs) < 0)
> +		return false;
> +
> +	if (!fixup_exception_mc(regs))
> +		return false;
> +
> +	set_thread_esr(0, esr);
> +
> +	arm64_force_sig_fault(sig, code, addr,
> +		"Uncorrected hardware memory error in kernel-access\n");
> +
> +	return true;
> +}
> +
>  static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs)
>  {
>  	const struct fault_info *inf;
> @@ -720,7 +743,9 @@ static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs)
>  		 */
>  		siaddr  = untagged_addr(far);
>  	}
> -	arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
> +
> +	if (!arm64_do_kernel_sea(siaddr, esr, regs, inf->sig, inf->code))
> +		arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
>  
>  	return 0;
>  }
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 546179418ffa..884661b29c17 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -174,6 +174,15 @@ copy_mc_to_kernel(void *dst, const void *src, size_t cnt)
>  }
>  #endif
>  
> +#ifndef copy_mc_to_user
> +static inline unsigned long __must_check
> +copy_mc_to_user(void *dst, const void *src, size_t cnt)
> +{
> +	check_object_size(src, cnt, true);
> +	return raw_copy_to_user(dst, src, cnt);
> +}
> +#endif

Why do we need a special copy_mc_to_user() ?

Why are we not making *every* true uaccess recoverable? That way the
regular copy_to_user() would just work.

Thanks,
Mark.
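
To illustrate the caller-side consequence of that suggestion (a sketch under
the assumption that every genuine uaccess gets a recoverable fixup; the
function name here is made up):

#include <linux/errno.h>
#include <linux/uaccess.h>

static ssize_t example_copy_result_to_user(void __user *ubuf,
					   const void *kbuf, size_t len)
{
	/*
	 * A memory error consumed during the copy would then look just like
	 * an ordinary fault from the caller's point of view: a short copy
	 * reported as -EFAULT, with the task itself killed by the SEA path.
	 */
	if (copy_to_user(ubuf, kbuf, len))
		return -EFAULT;

	return len;
}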

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
@ 2022-05-13 15:31     ` Mark Rutland
  0 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-13 15:31 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
> Add copy_{to, from}_user() to machine check safe.
> 
> If the copy fails due to a hardware memory error, only the relevant
> processes are affected, so killing the user process and isolating the user
> page with hardware memory errors is a more reasonable choice than a kernel
> panic.
> 
> Add new extable type EX_TYPE_UACCESS_MC which can be used for uaccess that
> can be recovered from hardware memory errors.

I don't understand why we need this.

If we apply EX_TYPE_UACCESS consistently to *all* user accesses, and
*only* to user accesses, that would *always* indicate that we can
recover, and that seems much simpler to deal with.

Today we use EX_TYPE_UACCESS_ERR_ZERO for kernel accesses in a couple of
cases, which we should clean up, and we use EX_TYPE_FIXUP for a couple
of user accesses, but those could easily be converted over.
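
As a rough illustration (EX_TYPE_UACCESS here is hypothetical, i.e. the
single type suggested above, not something the tree defines today), the
machine check path could then be as simple as:

	bool fixup_exception_mc(struct pt_regs *regs)
	{
		const struct exception_table_entry *ex;

		ex = search_exception_tables(instruction_pointer(regs));
		if (!ex)
			return false;

		/* any annotated user access is recoverable by definition */
		if (ex->type == EX_TYPE_UACCESS)
			return ex_handler_fixup(ex, regs);

		return false;
	}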

> The x16 register is used to save the fixup type in copy_xxx_user, which
> uses the extable type EX_TYPE_UACCESS_MC.

Why x16?

How is this intended to be consumed, and why is that behaviour different
from any *other* fault?

Mark.

> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>  arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
>  arch/arm64/include/asm/asm-uaccess.h | 15 ++++++++++-----
>  arch/arm64/lib/copy_from_user.S      | 18 +++++++++++-------
>  arch/arm64/lib/copy_to_user.S        | 18 +++++++++++-------
>  arch/arm64/mm/extable.c              | 18 ++++++++++++++----
>  5 files changed, 60 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
> index c39f2437e08e..75b2c00e9523 100644
> --- a/arch/arm64/include/asm/asm-extable.h
> +++ b/arch/arm64/include/asm/asm-extable.h
> @@ -2,12 +2,18 @@
>  #ifndef __ASM_ASM_EXTABLE_H
>  #define __ASM_ASM_EXTABLE_H
>  
> +#define FIXUP_TYPE_NORMAL		0
> +#define FIXUP_TYPE_MC			1
> +
>  #define EX_TYPE_NONE			0
>  #define EX_TYPE_FIXUP			1
>  #define EX_TYPE_BPF			2
>  #define EX_TYPE_UACCESS_ERR_ZERO	3
>  #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD	4
>  
> +/* _MC indicates that can fixup from machine check errors */
> +#define EX_TYPE_UACCESS_MC		5
> +
>  #ifdef __ASSEMBLY__
>  
>  #define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
> @@ -27,6 +33,14 @@
>  	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0)
>  	.endm
>  
> +/*
> + * Create an exception table entry for `insn`, which will branch to `fixup`
> + * when an unhandled fault(include sea fault) is taken.
> + */
> +	.macro          _asm_extable_uaccess_mc, insn, fixup
> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
> +	.endm
> +
>  /*
>   * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>   * do nothing.
> diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
> index 0557af834e03..6c23c138e1fc 100644
> --- a/arch/arm64/include/asm/asm-uaccess.h
> +++ b/arch/arm64/include/asm/asm-uaccess.h
> @@ -63,6 +63,11 @@ alternative_else_nop_endif
>  9999:	x;					\
>  	_asm_extable	9999b, l
>  
> +
> +#define USER_MC(l, x...)			\
> +9999:	x;					\
> +	_asm_extable_uaccess_mc	9999b, l
> +
>  /*
>   * Generate the assembly for LDTR/STTR with exception table entries.
>   * This is complicated as there is no post-increment or pair versions of the
> @@ -73,8 +78,8 @@ alternative_else_nop_endif
>  8889:		ldtr	\reg2, [\addr, #8];
>  		add	\addr, \addr, \post_inc;
>  
> -		_asm_extable	8888b,\l;
> -		_asm_extable	8889b,\l;
> +		_asm_extable_uaccess_mc	8888b, \l;
> +		_asm_extable_uaccess_mc	8889b, \l;
>  	.endm
>  
>  	.macro user_stp l, reg1, reg2, addr, post_inc
> @@ -82,14 +87,14 @@ alternative_else_nop_endif
>  8889:		sttr	\reg2, [\addr, #8];
>  		add	\addr, \addr, \post_inc;
>  
> -		_asm_extable	8888b,\l;
> -		_asm_extable	8889b,\l;
> +		_asm_extable_uaccess_mc	8888b,\l;
> +		_asm_extable_uaccess_mc	8889b,\l;
>  	.endm
>  
>  	.macro user_ldst l, inst, reg, addr, post_inc
>  8888:		\inst		\reg, [\addr];
>  		add		\addr, \addr, \post_inc;
>  
> -		_asm_extable	8888b,\l;
> +		_asm_extable_uaccess_mc	8888b, \l;
>  	.endm
>  #endif
> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
> index 34e317907524..480cc5ac0a8d 100644
> --- a/arch/arm64/lib/copy_from_user.S
> +++ b/arch/arm64/lib/copy_from_user.S
> @@ -25,7 +25,7 @@
>  	.endm
>  
>  	.macro strb1 reg, ptr, val
> -	strb \reg, [\ptr], \val
> +	USER_MC(9998f, strb \reg, [\ptr], \val)
>  	.endm
>  
>  	.macro ldrh1 reg, ptr, val
> @@ -33,7 +33,7 @@
>  	.endm
>  
>  	.macro strh1 reg, ptr, val
> -	strh \reg, [\ptr], \val
> +	USER_MC(9998f, strh \reg, [\ptr], \val)
>  	.endm
>  
>  	.macro ldr1 reg, ptr, val
> @@ -41,7 +41,7 @@
>  	.endm
>  
>  	.macro str1 reg, ptr, val
> -	str \reg, [\ptr], \val
> +	USER_MC(9998f, str \reg, [\ptr], \val)
>  	.endm
>  
>  	.macro ldp1 reg1, reg2, ptr, val
> @@ -49,11 +49,12 @@
>  	.endm
>  
>  	.macro stp1 reg1, reg2, ptr, val
> -	stp \reg1, \reg2, [\ptr], \val
> +	USER_MC(9998f, stp \reg1, \reg2, [\ptr], \val)
>  	.endm
>  
> -end	.req	x5
> -srcin	.req	x15
> +end		.req	x5
> +srcin		.req	x15
> +fixup_type	.req	x16
>  SYM_FUNC_START(__arch_copy_from_user)
>  	add	end, x0, x2
>  	mov	srcin, x1
> @@ -62,7 +63,10 @@ SYM_FUNC_START(__arch_copy_from_user)
>  	ret
>  
>  	// Exception fixups
> -9997:	cmp	dst, dstin
> +	// x16: fixup type written by ex_handler_uaccess_mc
> +9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
> +	b.eq	9998f
> +	cmp	dst, dstin
>  	b.ne	9998f
>  	// Before being absolutely sure we couldn't copy anything, try harder
>  USER(9998f, ldtrb tmp1w, [srcin])
> diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
> index 802231772608..021a7d27b3a4 100644
> --- a/arch/arm64/lib/copy_to_user.S
> +++ b/arch/arm64/lib/copy_to_user.S
> @@ -20,7 +20,7 @@
>   *	x0 - bytes not copied
>   */
>  	.macro ldrb1 reg, ptr, val
> -	ldrb  \reg, [\ptr], \val
> +	USER_MC(9998f, ldrb  \reg, [\ptr], \val)
>  	.endm
>  
>  	.macro strb1 reg, ptr, val
> @@ -28,7 +28,7 @@
>  	.endm
>  
>  	.macro ldrh1 reg, ptr, val
> -	ldrh  \reg, [\ptr], \val
> +	USER_MC(9998f, ldrh  \reg, [\ptr], \val)
>  	.endm
>  
>  	.macro strh1 reg, ptr, val
> @@ -36,7 +36,7 @@
>  	.endm
>  
>  	.macro ldr1 reg, ptr, val
> -	ldr \reg, [\ptr], \val
> +	USER_MC(9998f, ldr \reg, [\ptr], \val)
>  	.endm
>  
>  	.macro str1 reg, ptr, val
> @@ -44,15 +44,16 @@
>  	.endm
>  
>  	.macro ldp1 reg1, reg2, ptr, val
> -	ldp \reg1, \reg2, [\ptr], \val
> +	USER_MC(9998f, ldp \reg1, \reg2, [\ptr], \val)
>  	.endm
>  
>  	.macro stp1 reg1, reg2, ptr, val
>  	user_stp 9997f, \reg1, \reg2, \ptr, \val
>  	.endm
>  
> -end	.req	x5
> -srcin	.req	x15
> +end		.req	x5
> +srcin		.req	x15
> +fixup_type	.req	x16
>  SYM_FUNC_START(__arch_copy_to_user)
>  	add	end, x0, x2
>  	mov	srcin, x1
> @@ -61,7 +62,10 @@ SYM_FUNC_START(__arch_copy_to_user)
>  	ret
>  
>  	// Exception fixups
> -9997:	cmp	dst, dstin
> +	// x16: fixup type written by ex_handler_uaccess_mc
> +9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
> +	b.eq	9998f
> +	cmp	dst, dstin
>  	b.ne	9998f
>  	// Before being absolutely sure we couldn't copy anything, try harder
>  	ldrb	tmp1w, [srcin]
> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
> index 4f0083a550d4..525876c3ebf4 100644
> --- a/arch/arm64/mm/extable.c
> +++ b/arch/arm64/mm/extable.c
> @@ -24,6 +24,14 @@ static bool ex_handler_fixup(const struct exception_table_entry *ex,
>  	return true;
>  }
>  
> +static bool ex_handler_uaccess_type(const struct exception_table_entry *ex,
> +			     struct pt_regs *regs,
> +			     unsigned long fixup_type)
> +{
> +	regs->regs[16] = fixup_type;
> +	return ex_handler_fixup(ex, regs);
> +}
> +
>  static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
>  					struct pt_regs *regs)
>  {
> @@ -75,6 +83,8 @@ bool fixup_exception(struct pt_regs *regs)
>  	switch (ex->type) {
>  	case EX_TYPE_FIXUP:
>  		return ex_handler_fixup(ex, regs);
> +	case EX_TYPE_UACCESS_MC:
> +		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_NORMAL);
>  	case EX_TYPE_BPF:
>  		return ex_handler_bpf(ex, regs);
>  	case EX_TYPE_UACCESS_ERR_ZERO:
> @@ -94,10 +104,10 @@ bool fixup_exception_mc(struct pt_regs *regs)
>  	if (!ex)
>  		return false;
>  
> -	/*
> -	 * This is not complete, More Machine check safe extable type can
> -	 * be processed here.
> -	 */
> +	switch (ex->type) {
> +	case EX_TYPE_UACCESS_MC:
> +		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
> +	}
>  
>  	return false;
>  }
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 5/7] arm64: mte: Clean up user tag accessors
  2022-04-20  3:04   ` Tong Tiangen
@ 2022-05-13 15:36     ` Mark Rutland
  -1 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-13 15:36 UTC (permalink / raw)
  To: Tong Tiangen, Catalin Marinas
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Will Deacon,
	Alexander Viro, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, x86, H . Peter Anvin, linuxppc-dev,
	linux-arm-kernel, linux-kernel, linux-mm, Kefeng Wang, Xie XiuQi,
	Guohanjun

On Wed, Apr 20, 2022 at 03:04:16AM +0000, Tong Tiangen wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> Invoking user_ldst to explicitly add a post-increment of 0 is silly.
> Just use a normal USER() annotation and save the redundant instruction.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>

When posting someone else's patch, you need to add your own
Signed-off-by tag. Please see:

  https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin
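
Concretely, the expected trailer chain when relaying Robin's patch would
look something like this (illustration only):

	Signed-off-by: Robin Murphy <robin.murphy@arm.com>
	Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>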

That said, the patch itself looks sane, and matches its original posting
at:

  https://lore.kernel.org/linux-arm-kernel/38c6d4b5-a3db-5c3e-02e7-39875edb3476@arm.com/

So:

  Acked-by: Mark Rutland <mark.rutland@arm.com>

Catalin, are you happy to pick up this patch as a cleanup?

Thanks,
Mark.

> ---
>  arch/arm64/lib/mte.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index 8590af3c98c0..eeb9e45bcce8 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -93,7 +93,7 @@ SYM_FUNC_START(mte_copy_tags_from_user)
>  	mov	x3, x1
>  	cbz	x2, 2f
>  1:
> -	user_ldst 2f, ldtrb, w4, x1, 0
> +USER(2f, ldtrb	w4, [x1])
>  	lsl	x4, x4, #MTE_TAG_SHIFT
>  	stg	x4, [x0], #MTE_GRANULE_SIZE
>  	add	x1, x1, #1
> @@ -120,7 +120,7 @@ SYM_FUNC_START(mte_copy_tags_to_user)
>  1:
>  	ldg	x4, [x1]
>  	ubfx	x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE
> -	user_ldst 2f, sttrb, w4, x0, 0
> +USER(2f, sttrb	w4, [x0])
>  	add	x0, x0, #1
>  	add	x1, x1, #MTE_GRANULE_SIZE
>  	subs	x2, x2, #1
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 6/7] arm64: add {get, put}_user to machine check safe
  2022-04-20  3:04   ` Tong Tiangen
@ 2022-05-13 15:39     ` Mark Rutland
  -1 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-13 15:39 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Wed, Apr 20, 2022 at 03:04:17AM +0000, Tong Tiangen wrote:
> Add {get, put}_user() to machine check safe.
> 
> If a get/put fails due to a hardware memory error, only the relevant
> processes are affected, so killing the user process and isolating the user
> page with hardware memory errors is a more reasonable choice than a kernel
> panic.
> 
> Add a new extable type EX_TYPE_UACCESS_MC_ERR_ZERO which can be used for
> uaccess that can be recovered from hardware memory errors. The difference
> from EX_TYPE_UACCESS_MC is that this type additionally encodes two target
> registers: one to receive the error code and one to be zeroed.

Why does this need to be in any way distinct from the existing
EX_TYPE_UACCESS_ERR_ZERO ?

Other than the case where we currently (ab)use that for
copy_{to,from}_kernel_nofault(), where do we *not* want to use
EX_TYPE_UACCESS_ERR_ZERO and *not* recover from a memory error?
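
i.e. (sketch only) couldn't fixup_exception_mc() simply grow:

	switch (ex->type) {
	case EX_TYPE_UACCESS_ERR_ZERO:
		/* the existing handler already does the err/zero fixup */
		return ex_handler_uaccess_err_zero(ex, regs);
	}

... without get_user()/put_user() needing a new annotation at all?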

Thanks,
Mark.

> 
> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>  arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
>  arch/arm64/include/asm/uaccess.h     |  4 ++--
>  arch/arm64/mm/extable.c              |  4 ++++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
> index 75b2c00e9523..80410899a9ad 100644
> --- a/arch/arm64/include/asm/asm-extable.h
> +++ b/arch/arm64/include/asm/asm-extable.h
> @@ -13,6 +13,7 @@
>  
>  /* _MC indicates that can fixup from machine check errors */
>  #define EX_TYPE_UACCESS_MC		5
> +#define EX_TYPE_UACCESS_MC_ERR_ZERO	6
>  
>  #ifdef __ASSEMBLY__
>  
> @@ -78,6 +79,15 @@
>  #define EX_DATA_REG(reg, gpr)						\
>  	"((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")"
>  
> +#define _ASM_EXTABLE_UACCESS_MC_ERR_ZERO(insn, fixup, err, zero)		\
> +	__DEFINE_ASM_GPR_NUMS							\
> +	__ASM_EXTABLE_RAW(#insn, #fixup,					\
> +			  __stringify(EX_TYPE_UACCESS_MC_ERR_ZERO),		\
> +			  "("							\
> +			    EX_DATA_REG(ERR, err) " | "				\
> +			    EX_DATA_REG(ZERO, zero)				\
> +			  ")")
> +
>  #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero)		\
>  	__DEFINE_ASM_GPR_NUMS						\
>  	__ASM_EXTABLE_RAW(#insn, #fixup, 				\
> @@ -90,6 +100,10 @@
>  #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err)			\
>  	_ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr)
>  
> +
> +#define _ASM_EXTABLE_UACCESS_MC_ERR(insn, fixup, err)			\
> +	_ASM_EXTABLE_UACCESS_MC_ERR_ZERO(insn, fixup, err, wzr)
> +
>  #define EX_DATA_REG_DATA_SHIFT	0
>  #define EX_DATA_REG_DATA	GENMASK(4, 0)
>  #define EX_DATA_REG_ADDR_SHIFT	5
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index e8dce0cc5eaa..e41b47df48b0 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -236,7 +236,7 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
>  	asm volatile(							\
>  	"1:	" load "	" reg "1, [%2]\n"			\
>  	"2:\n"								\
> -	_ASM_EXTABLE_UACCESS_ERR_ZERO(1b, 2b, %w0, %w1)			\
> +	_ASM_EXTABLE_UACCESS_MC_ERR_ZERO(1b, 2b, %w0, %w1)		\
>  	: "+r" (err), "=&r" (x)						\
>  	: "r" (addr))
>  
> @@ -325,7 +325,7 @@ do {									\
>  	asm volatile(							\
>  	"1:	" store "	" reg "1, [%2]\n"			\
>  	"2:\n"								\
> -	_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)				\
> +	_ASM_EXTABLE_UACCESS_MC_ERR(1b, 2b, %w0)			\
>  	: "+r" (err)							\
>  	: "r" (x), "r" (addr))
>  
> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
> index 525876c3ebf4..1023ccdb2f89 100644
> --- a/arch/arm64/mm/extable.c
> +++ b/arch/arm64/mm/extable.c
> @@ -88,6 +88,7 @@ bool fixup_exception(struct pt_regs *regs)
>  	case EX_TYPE_BPF:
>  		return ex_handler_bpf(ex, regs);
>  	case EX_TYPE_UACCESS_ERR_ZERO:
> +	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>  		return ex_handler_uaccess_err_zero(ex, regs);
>  	case EX_TYPE_LOAD_UNALIGNED_ZEROPAD:
>  		return ex_handler_load_unaligned_zeropad(ex, regs);
> @@ -107,6 +108,9 @@ bool fixup_exception_mc(struct pt_regs *regs)
>  	switch (ex->type) {
>  	case EX_TYPE_UACCESS_MC:
>  		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
> +	case EX_TYPE_UACCESS_MC_ERR_ZERO:
> +		return ex_handler_uaccess_err_zero(ex, regs);
> +
>  	}
>  
>  	return false;
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 7/7] arm64: add cow to machine check safe
  2022-04-20  3:04   ` Tong Tiangen
@ 2022-05-13 15:44     ` Mark Rutland
  -1 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-13 15:44 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Wed, Apr 20, 2022 at 03:04:18AM +0000, Tong Tiangen wrote:
> In the cow (copy on write) processing, the data of the user process is
> copied. When a hardware memory error is encountered during the copy, only
> the relevant processes are affected, so killing the user process and
> isolating the user page with hardware memory errors is a more reasonable
> choice than a kernel panic.

There are plenty of other places we'll access user pages via a kernel
alias (e.g. when performing IO), so why is this special?

To be clear, I am not entirely averse to this, but it seems like this is
being done because it's easy to do rather than necessarily being all
that useful, and I'm not keen on having to duplicate a bunch of logic
for this.

> Add a new helper copy_page_mc() which provides a machine-check-safe page
> copy implementation. At present it is only used in cow; in future we can
> expand it to more scenarios. As long as the consequences of a page copy
> failure are not fatal (eg: only a user process is affected), we can use
> this helper.
> 
> The copy_page_mc() in copy_page_mc.S largely borrows from copy_page() in
> copy_page.S; the main difference is that copy_page_mc() adds an extable
> entry to every load/store insn to support machine check safe, largely to
> keep the patch simple. If needed, those optimizations can be folded in.
> 
> Add a new extable type EX_TYPE_COPY_PAGE_MC which is used in copy_page_mc().
> 
> This type is only processed in fixup_exception_mc(). The reason is that
> copy_page_mc() is consistent with copy_page() except that machine check
> safety is considered, and copy_page() does not need to consider exception
> fixup.
> 
> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>  arch/arm64/include/asm/asm-extable.h |  5 ++
>  arch/arm64/include/asm/page.h        | 10 ++++
>  arch/arm64/lib/Makefile              |  2 +
>  arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
>  arch/arm64/mm/copypage.c             | 36 ++++++++++--
>  arch/arm64/mm/extable.c              |  2 +
>  include/linux/highmem.h              |  8 +++
>  mm/memory.c                          |  2 +-
>  8 files changed, 144 insertions(+), 7 deletions(-)
>  create mode 100644 arch/arm64/lib/copy_page_mc.S
> 
> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
> index 80410899a9ad..74c056ddae15 100644
> --- a/arch/arm64/include/asm/asm-extable.h
> +++ b/arch/arm64/include/asm/asm-extable.h
> @@ -14,6 +14,7 @@
>  /* _MC indicates that can fixup from machine check errors */
>  #define EX_TYPE_UACCESS_MC		5
>  #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
> +#define EX_TYPE_COPY_PAGE_MC		7
>  
>  #ifdef __ASSEMBLY__
>  
> @@ -42,6 +43,10 @@
>  	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
>  	.endm
>  
> +	.macro          _asm_extable_copy_page_mc, insn, fixup
> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
> +	.endm
> +
>  /*
>   * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>   * do nothing.
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 993a27ea6f54..832571a7dddb 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
>  void copy_highpage(struct page *to, struct page *from);
>  #define __HAVE_ARCH_COPY_HIGHPAGE
>  
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +extern void copy_page_mc(void *to, const void *from);
> +void copy_highpage_mc(struct page *to, struct page *from);
> +#define __HAVE_ARCH_COPY_HIGHPAGE_MC
> +
> +void copy_user_highpage_mc(struct page *to, struct page *from,
> +		unsigned long vaddr, struct vm_area_struct *vma);
> +#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
> +#endif
> +
>  struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>  						unsigned long vaddr);
>  #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
> diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
> index 29490be2546b..0d9f292ef68a 100644
> --- a/arch/arm64/lib/Makefile
> +++ b/arch/arm64/lib/Makefile
> @@ -15,6 +15,8 @@ endif
>  
>  lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
>  
> +lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
> +
>  obj-$(CONFIG_CRC32) += crc32.o
>  
>  obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
> new file mode 100644
> index 000000000000..655161363dcf
> --- /dev/null
> +++ b/arch/arm64/lib/copy_page_mc.S
> @@ -0,0 +1,86 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + */
> +
> +#include <linux/linkage.h>
> +#include <linux/const.h>
> +#include <asm/assembler.h>
> +#include <asm/page.h>
> +#include <asm/cpufeature.h>
> +#include <asm/alternative.h>
> +#include <asm/asm-extable.h>
> +
> +#define CPY_MC(l, x...)		\
> +9999:   x;			\
> +	_asm_extable_copy_page_mc    9999b, l
> +
> +/*
> + * Copy a page from src to dest (both are page aligned) with machine check
> + *
> + * Parameters:
> + *	x0 - dest
> + *	x1 - src
> + */
> +SYM_FUNC_START(__pi_copy_page_mc)
> +alternative_if ARM64_HAS_NO_HW_PREFETCH
> +	// Prefetch three cache lines ahead.
> +	prfm	pldl1strm, [x1, #128]
> +	prfm	pldl1strm, [x1, #256]
> +	prfm	pldl1strm, [x1, #384]
> +alternative_else_nop_endif
> +
> +CPY_MC(9998f, ldp	x2, x3, [x1])
> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
> +
> +	add	x0, x0, #256
> +	add	x1, x1, #128
> +1:
> +	tst	x0, #(PAGE_SIZE - 1)
> +
> +alternative_if ARM64_HAS_NO_HW_PREFETCH
> +	prfm	pldl1strm, [x1, #384]
> +alternative_else_nop_endif
> +
> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
> +CPY_MC(9998f, ldp	x2, x3, [x1])
> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
> +
> +	add	x0, x0, #128
> +	add	x1, x1, #128
> +
> +	b.ne	1b
> +
> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
> +
> +9998:	ret
> +
> +SYM_FUNC_END(__pi_copy_page_mc)
> +SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
> +EXPORT_SYMBOL(copy_page_mc)
> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> index 0dea80bf6de4..0f28edfcb234 100644
> --- a/arch/arm64/mm/copypage.c
> +++ b/arch/arm64/mm/copypage.c
> @@ -14,13 +14,8 @@
>  #include <asm/cpufeature.h>
>  #include <asm/mte.h>
>  
> -void copy_highpage(struct page *to, struct page *from)
> +static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
>  {
> -	void *kto = page_address(to);
> -	void *kfrom = page_address(from);
> -
> -	copy_page(kto, kfrom);
> -
>  	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
>  		set_bit(PG_mte_tagged, &to->flags);
>  		page_kasan_tag_reset(to);
> @@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
>  		mte_copy_page_tags(kto, kfrom);
>  	}
>  }
> +
> +void copy_highpage(struct page *to, struct page *from)
> +{
> +	void *kto = page_address(to);
> +	void *kfrom = page_address(from);
> +
> +	copy_page(kto, kfrom);
> +	do_mte(to, from, kto, kfrom);
> +}
>  EXPORT_SYMBOL(copy_highpage);
>  
>  void copy_user_highpage(struct page *to, struct page *from,
> @@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
>  	flush_dcache_page(to);
>  }
>  EXPORT_SYMBOL_GPL(copy_user_highpage);
> +
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +void copy_highpage_mc(struct page *to, struct page *from)
> +{
> +	void *kto = page_address(to);
> +	void *kfrom = page_address(from);
> +
> +	copy_page_mc(kto, kfrom);
> +	do_mte(to, from, kto, kfrom);
> +}
> +EXPORT_SYMBOL(copy_highpage_mc);

IIUC the do_mte() portion won't handle memory errors, so this isn't
actually going to recover safely.
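
To make the whole path recoverable, the tag copy would need the same
treatment as the data copy. A rough sketch (mte_copy_page_tags_mc() is
hypothetical and would need its tag load/store loop annotated in the same
way copy_page_mc() annotates its loads/stores; the smp_wmb() from do_mte()
is elided for brevity):

	void copy_highpage_mc(struct page *to, struct page *from)
	{
		void *kto = page_address(to);
		void *kfrom = page_address(from);

		copy_page_mc(kto, kfrom);

		if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
			set_bit(PG_mte_tagged, &to->flags);
			page_kasan_tag_reset(to);
			/* hypothetical: tag copy with machine-check-safe extable entries */
			mte_copy_page_tags_mc(kto, kfrom);
		}
	}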

Thanks,
Mark.

> +
> +void copy_user_highpage_mc(struct page *to, struct page *from,
> +			unsigned long vaddr, struct vm_area_struct *vma)
> +{
> +	copy_highpage_mc(to, from);
> +	flush_dcache_page(to);
> +}
> +EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
> +#endif
> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
> index 1023ccdb2f89..4c882d36dd64 100644
> --- a/arch/arm64/mm/extable.c
> +++ b/arch/arm64/mm/extable.c
> @@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
>  		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>  	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>  		return ex_handler_uaccess_err_zero(ex, regs);
> +	case EX_TYPE_COPY_PAGE_MC:
> +		return ex_handler_fixup(ex, regs);
>  
>  	}
>  
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 39bb9b47fa9c..a9dbf331b038 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
>  
>  #endif
>  
> +#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
> +#define copy_user_highpage_mc copy_user_highpage
> +#endif
> +
>  #ifndef __HAVE_ARCH_COPY_HIGHPAGE
>  
>  static inline void copy_highpage(struct page *to, struct page *from)
> @@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
>  
>  #endif
>  
> +#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
> +#define cop_highpage_mc copy_highpage
> +#endif
> +
>  static inline void memcpy_page(struct page *dst_page, size_t dst_off,
>  			       struct page *src_page, size_t src_off,
>  			       size_t len)
> diff --git a/mm/memory.c b/mm/memory.c
> index 76e3af9639d9..d5f62234152d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
>  	unsigned long addr = vmf->address;
>  
>  	if (likely(src)) {
> -		copy_user_highpage(dst, src, addr, vma);
> +		copy_user_highpage_mc(dst, src, addr, vma);
>  		return true;
>  	}
>  
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 7/7] arm64: add cow to machine check safe
@ 2022-05-13 15:44     ` Mark Rutland
  0 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-13 15:44 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Wed, Apr 20, 2022 at 03:04:18AM +0000, Tong Tiangen wrote:
> In the cow (copy on write) processing, the data of the user process is
> copied. When a hardware memory error is encountered during the copy, only
> the relevant processes are affected, so killing the user process and
> isolating the user page with hardware memory errors is a more reasonable
> choice than a kernel panic.

There are plenty of other places we'll access user pages via a kernel
alias (e.g. when performing IO), so why is this special?

To be clear, I am not entirely averse to this, but it seems like this is
being done because it's easy to do rather than necessarily being all
that useful, and I'm not keen on having to duplicate a bunch of logic
for this.

> Add a new helper copy_page_mc() which provides a machine-check-safe page
> copy implementation. At present it is only used in cow; in future we can
> expand it to more scenarios. As long as the consequences of a page copy
> failure are not fatal (eg: only a user process is affected), we can use
> this helper.
> 
> The copy_page_mc() in copy_page_mc.S largely borrows from copy_page() in
> copy_page.S; the main difference is that copy_page_mc() adds an extable
> entry to every load/store insn to support machine check safe, largely to
> keep the patch simple. If needed, those optimizations can be folded in.
> 
> Add a new extable type EX_TYPE_COPY_PAGE_MC which is used in copy_page_mc().
> 
> This type is only processed in fixup_exception_mc(). The reason is that
> copy_page_mc() is consistent with copy_page() except that machine check
> safety is considered, and copy_page() does not need to consider exception
> fixup.
> 
> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
> ---
>  arch/arm64/include/asm/asm-extable.h |  5 ++
>  arch/arm64/include/asm/page.h        | 10 ++++
>  arch/arm64/lib/Makefile              |  2 +
>  arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
>  arch/arm64/mm/copypage.c             | 36 ++++++++++--
>  arch/arm64/mm/extable.c              |  2 +
>  include/linux/highmem.h              |  8 +++
>  mm/memory.c                          |  2 +-
>  8 files changed, 144 insertions(+), 7 deletions(-)
>  create mode 100644 arch/arm64/lib/copy_page_mc.S
> 
> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
> index 80410899a9ad..74c056ddae15 100644
> --- a/arch/arm64/include/asm/asm-extable.h
> +++ b/arch/arm64/include/asm/asm-extable.h
> @@ -14,6 +14,7 @@
>  /* _MC indicates that can fixup from machine check errors */
>  #define EX_TYPE_UACCESS_MC		5
>  #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
> +#define EX_TYPE_COPY_PAGE_MC		7
>  
>  #ifdef __ASSEMBLY__
>  
> @@ -42,6 +43,10 @@
>  	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
>  	.endm
>  
> +	.macro          _asm_extable_copy_page_mc, insn, fixup
> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
> +	.endm
> +
>  /*
>   * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>   * do nothing.
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 993a27ea6f54..832571a7dddb 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
>  void copy_highpage(struct page *to, struct page *from);
>  #define __HAVE_ARCH_COPY_HIGHPAGE
>  
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +extern void copy_page_mc(void *to, const void *from);
> +void copy_highpage_mc(struct page *to, struct page *from);
> +#define __HAVE_ARCH_COPY_HIGHPAGE_MC
> +
> +void copy_user_highpage_mc(struct page *to, struct page *from,
> +		unsigned long vaddr, struct vm_area_struct *vma);
> +#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
> +#endif
> +
>  struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>  						unsigned long vaddr);
>  #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
> diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
> index 29490be2546b..0d9f292ef68a 100644
> --- a/arch/arm64/lib/Makefile
> +++ b/arch/arm64/lib/Makefile
> @@ -15,6 +15,8 @@ endif
>  
>  lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
>  
> +lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
> +
>  obj-$(CONFIG_CRC32) += crc32.o
>  
>  obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
> new file mode 100644
> index 000000000000..655161363dcf
> --- /dev/null
> +++ b/arch/arm64/lib/copy_page_mc.S
> @@ -0,0 +1,86 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + */
> +
> +#include <linux/linkage.h>
> +#include <linux/const.h>
> +#include <asm/assembler.h>
> +#include <asm/page.h>
> +#include <asm/cpufeature.h>
> +#include <asm/alternative.h>
> +#include <asm/asm-extable.h>
> +
> +#define CPY_MC(l, x...)		\
> +9999:   x;			\
> +	_asm_extable_copy_page_mc    9999b, l
> +
> +/*
> + * Copy a page from src to dest (both are page aligned) with machine check
> + *
> + * Parameters:
> + *	x0 - dest
> + *	x1 - src
> + */
> +SYM_FUNC_START(__pi_copy_page_mc)
> +alternative_if ARM64_HAS_NO_HW_PREFETCH
> +	// Prefetch three cache lines ahead.
> +	prfm	pldl1strm, [x1, #128]
> +	prfm	pldl1strm, [x1, #256]
> +	prfm	pldl1strm, [x1, #384]
> +alternative_else_nop_endif
> +
> +CPY_MC(9998f, ldp	x2, x3, [x1])
> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
> +
> +	add	x0, x0, #256
> +	add	x1, x1, #128
> +1:
> +	tst	x0, #(PAGE_SIZE - 1)
> +
> +alternative_if ARM64_HAS_NO_HW_PREFETCH
> +	prfm	pldl1strm, [x1, #384]
> +alternative_else_nop_endif
> +
> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
> +CPY_MC(9998f, ldp	x2, x3, [x1])
> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
> +
> +	add	x0, x0, #128
> +	add	x1, x1, #128
> +
> +	b.ne	1b
> +
> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
> +
> +9998:	ret
> +
> +SYM_FUNC_END(__pi_copy_page_mc)
> +SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
> +EXPORT_SYMBOL(copy_page_mc)
> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> index 0dea80bf6de4..0f28edfcb234 100644
> --- a/arch/arm64/mm/copypage.c
> +++ b/arch/arm64/mm/copypage.c
> @@ -14,13 +14,8 @@
>  #include <asm/cpufeature.h>
>  #include <asm/mte.h>
>  
> -void copy_highpage(struct page *to, struct page *from)
> +static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
>  {
> -	void *kto = page_address(to);
> -	void *kfrom = page_address(from);
> -
> -	copy_page(kto, kfrom);
> -
>  	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
>  		set_bit(PG_mte_tagged, &to->flags);
>  		page_kasan_tag_reset(to);
> @@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
>  		mte_copy_page_tags(kto, kfrom);
>  	}
>  }
> +
> +void copy_highpage(struct page *to, struct page *from)
> +{
> +	void *kto = page_address(to);
> +	void *kfrom = page_address(from);
> +
> +	copy_page(kto, kfrom);
> +	do_mte(to, from, kto, kfrom);
> +}
>  EXPORT_SYMBOL(copy_highpage);
>  
>  void copy_user_highpage(struct page *to, struct page *from,
> @@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
>  	flush_dcache_page(to);
>  }
>  EXPORT_SYMBOL_GPL(copy_user_highpage);
> +
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +void copy_highpage_mc(struct page *to, struct page *from)
> +{
> +	void *kto = page_address(to);
> +	void *kfrom = page_address(from);
> +
> +	copy_page_mc(kto, kfrom);
> +	do_mte(to, from, kto, kfrom);
> +}
> +EXPORT_SYMBOL(copy_highpage_mc);

IIUC the do_mte() portion won't handle memory errors, so this isn't
actually going to recover safely.

Thanks,
Mark.
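
A minimal sketch of the direction this implies, purely for illustration
and not part of the posted series: the tag copy would also need to be
fallible (e.g. a hypothetical mte_copy_page_tags_mc() with extable entries
on its tag loads), and copy_highpage_mc() would propagate the failure
instead of returning void. All names and return types below are
assumptions.

	/* Sketch only: assumes copy_page_mc() and mte_copy_page_tags_mc()
	 * are reworked to return non-zero on a consumed memory error. */
	int copy_highpage_mc(struct page *to, struct page *from)
	{
		void *kto = page_address(to);
		void *kfrom = page_address(from);

		if (copy_page_mc(kto, kfrom))
			return -EFAULT;

		if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
			set_bit(PG_mte_tagged, &to->flags);
			page_kasan_tag_reset(to);
			/* tag loads need their own machine-check extable entries */
			if (mte_copy_page_tags_mc(kto, kfrom))
				return -EFAULT;
		}

		return 0;
	}

The cow path would then treat a non-zero return like any other failed
copy and kill only the affected task.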

> +
> +void copy_user_highpage_mc(struct page *to, struct page *from,
> +			unsigned long vaddr, struct vm_area_struct *vma)
> +{
> +	copy_highpage_mc(to, from);
> +	flush_dcache_page(to);
> +}
> +EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
> +#endif
> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
> index 1023ccdb2f89..4c882d36dd64 100644
> --- a/arch/arm64/mm/extable.c
> +++ b/arch/arm64/mm/extable.c
> @@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
>  		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>  	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>  		return ex_handler_uaccess_err_zero(ex, regs);
> +	case EX_TYPE_COPY_PAGE_MC:
> +		return ex_handler_fixup(ex, regs);
>  
>  	}
>  
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 39bb9b47fa9c..a9dbf331b038 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
>  
>  #endif
>  
> +#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
> +#define copy_user_highpage_mc copy_user_highpage
> +#endif
> +
>  #ifndef __HAVE_ARCH_COPY_HIGHPAGE
>  
>  static inline void copy_highpage(struct page *to, struct page *from)
> @@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
>  
>  #endif
>  
> +#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
> +#define copy_highpage_mc copy_highpage
> +#endif
> +
>  static inline void memcpy_page(struct page *dst_page, size_t dst_off,
>  			       struct page *src_page, size_t src_off,
>  			       size_t len)
> diff --git a/mm/memory.c b/mm/memory.c
> index 76e3af9639d9..d5f62234152d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
>  	unsigned long addr = vmf->address;
>  
>  	if (likely(src)) {
> -		copy_user_highpage(dst, src, addr, vma);
> +		copy_user_highpage_mc(dst, src, addr, vma);
>  		return true;
>  	}
>  
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: (subset) [PATCH -next v4 0/7]arm64: add machine check safe support
  2022-04-20  3:04 ` Tong Tiangen
  (?)
@ 2022-05-16 18:45   ` Catalin Marinas
  -1 siblings, 0 replies; 96+ messages in thread
From: Catalin Marinas @ 2022-05-16 18:45 UTC (permalink / raw)
  To: H . Peter Anvin, Mark Rutland, Alexander Viro, Will Deacon,
	Michael Ellerman, Ingo Molnar, Paul Mackerras, Tong Tiangen,
	Thomas Gleixner, Borislav Petkov, James Morse, Robin Murphy,
	Benjamin Herrenschmidt, Andrew Morton, x86, Dave Hansen
  Cc: Guohanjun, linux-arm-kernel, linux-kernel, Kefeng Wang,
	Xie XiuQi, linuxppc-dev, linux-mm

On Wed, 20 Apr 2022 03:04:11 +0000, Tong Tiangen wrote:
> With the increase of memory capacity and density, the probability of
> memory error increases. The increasing size and density of server RAM
> in the data center and cloud have shown increased uncorrectable memory
> errors.
> 
> Currently, the kernel has a mechanism to recover from hardware memory
> errors. This patchset provides a new recovery mechanism.
> 
> [...]

Applied to arm64 (for-next/misc), thanks!

[5/7] arm64: mte: Clean up user tag accessors
      https://git.kernel.org/arm64/c/b4d6bb38f9dc

-- 
Catalin


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 3/7] arm64: add support for machine check error safe
  2022-05-13 15:26     ` Mark Rutland
  (?)
@ 2022-05-19  6:29       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-19  6:29 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/13 23:26, Mark Rutland wrote:
> On Wed, Apr 20, 2022 at 03:04:14AM +0000, Tong Tiangen wrote:
>> During the processing of arm64 kernel hardware memory errors (do_sea()), if
>> the error is consumed in the kernel, the current handling is to panic.
>> However, that is not optimal.
>>
>> Take uaccess for example: if the uaccess operation fails due to a memory
>> error, only the user process will be affected, so killing the user process
>> and isolating the user page with hardware memory errors is a better choice.
> 
> Conceptually, I'm fine with the idea of constraining what we do for a
> true uaccess, but I don't like the implementation of this at all, and I
> think we first need to clean up the arm64 extable usage to clearly
> distinguish a uaccess from another access.

OK, using EX_TYPE_UACCESS and treating that extable type as recoverable is
more reasonable.

For EX_TYPE_UACCESS_ERR_ZERO, today we use it for kernel accesses in a
couple of cases, such as
get_user/futex/__user_cache_maint()/__user_swpX_asm(). Is your suggestion
that get_user continues to use EX_TYPE_UACCESS_ERR_ZERO and the other cases
use a new type EX_TYPE_FIXUP_ERR_ZERO?

Thanks,
Tong.
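
For reference, a rough sketch of what the machine-check fixup could look
like once only genuine uaccess extable types are treated as recoverable
(this assumes kernel-internal accesses are moved off
EX_TYPE_UACCESS_ERR_ZERO onto a separate type, as discussed above):

	bool fixup_exception_mc(struct pt_regs *regs)
	{
		const struct exception_table_entry *ex;

		ex = search_exception_tables(instruction_pointer(regs));
		if (!ex)
			return false;

		switch (ex->type) {
		case EX_TYPE_UACCESS_ERR_ZERO:	/* by then a true uaccess only */
			return ex_handler_uaccess_err_zero(ex, regs);
		}

		return false;
	}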

> 
>> This patch only enables the machine check error framework: it adds exception
>> fixup before the kernel panic in do_sea(), and only covers the consumption of
>> hardware memory errors in kernel mode triggered by user mode processes.
>> If the fixup succeeds, the panic can be avoided.
>>
>> Consistent with PPC/x86, it is implemented via CONFIG_ARCH_HAS_COPY_MC.
>>
>> Also add copy_mc_to_user() in include/linux/uaccess.h; this helper is
>> used when CONFIG_ARCH_HAS_COPY_MC is enabled.
>>
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/arm64/Kconfig               |  1 +
>>   arch/arm64/include/asm/extable.h |  1 +
>>   arch/arm64/mm/extable.c          | 17 +++++++++++++++++
>>   arch/arm64/mm/fault.c            | 27 ++++++++++++++++++++++++++-
>>   include/linux/uaccess.h          |  9 +++++++++
>>   5 files changed, 54 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index d9325dd95eba..012e38309955 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -19,6 +19,7 @@ config ARM64
>>   	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
>>   	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
>>   	select ARCH_HAS_CACHE_LINE_SIZE
>> +	select ARCH_HAS_COPY_MC if ACPI_APEI_GHES
>>   	select ARCH_HAS_CURRENT_STACK_POINTER
>>   	select ARCH_HAS_DEBUG_VIRTUAL
>>   	select ARCH_HAS_DEBUG_VM_PGTABLE
>> diff --git a/arch/arm64/include/asm/extable.h b/arch/arm64/include/asm/extable.h
>> index 72b0e71cc3de..f80ebd0addfd 100644
>> --- a/arch/arm64/include/asm/extable.h
>> +++ b/arch/arm64/include/asm/extable.h
>> @@ -46,4 +46,5 @@ bool ex_handler_bpf(const struct exception_table_entry *ex,
>>   #endif /* !CONFIG_BPF_JIT */
>>   
>>   bool fixup_exception(struct pt_regs *regs);
>> +bool fixup_exception_mc(struct pt_regs *regs);
>>   #endif
>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
>> index 489455309695..4f0083a550d4 100644
>> --- a/arch/arm64/mm/extable.c
>> +++ b/arch/arm64/mm/extable.c
>> @@ -9,6 +9,7 @@
>>   
>>   #include <asm/asm-extable.h>
>>   #include <asm/ptrace.h>
>> +#include <asm/esr.h>
>>   
>>   static inline unsigned long
>>   get_ex_fixup(const struct exception_table_entry *ex)
>> @@ -84,3 +85,19 @@ bool fixup_exception(struct pt_regs *regs)
>>   
>>   	BUG();
>>   }
>> +
>> +bool fixup_exception_mc(struct pt_regs *regs)
>> +{
>> +	const struct exception_table_entry *ex;
>> +
>> +	ex = search_exception_tables(instruction_pointer(regs));
>> +	if (!ex)
>> +		return false;
>> +
>> +	/*
>> +	 * This is not complete. More machine check safe extable types can
>> +	 * be processed here.
>> +	 */
>> +
>> +	return false;
>> +}
> 
> This is at best misnamed; it doesn't actually apply the fixup, it just
> searches for one.

Yeah, you're right about the current logic, so I added a comment to explain
the scenarios that will be added later.

> 
>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>> index 77341b160aca..a9e6fb1999d1 100644
>> --- a/arch/arm64/mm/fault.c
>> +++ b/arch/arm64/mm/fault.c
>> @@ -695,6 +695,29 @@ static int do_bad(unsigned long far, unsigned int esr, struct pt_regs *regs)
>>   	return 1; /* "fault" */
>>   }
>>   
>> +static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
>> +				     struct pt_regs *regs, int sig, int code)
>> +{
>> +	if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
>> +		return false;
>> +
>> +	if (user_mode(regs) || !current->mm)
>> +		return false;
>> +
>> +	if (apei_claim_sea(regs) < 0)
>> +		return false;
>> +
>> +	if (!fixup_exception_mc(regs))
>> +		return false;
>> +
>> +	set_thread_esr(0, esr);
>> +
>> +	arm64_force_sig_fault(sig, code, addr,
>> +		"Uncorrected hardware memory error in kernel-access\n");
>> +
>> +	return true;
>> +}
>> +
>>   static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs)
>>   {
>>   	const struct fault_info *inf;
>> @@ -720,7 +743,9 @@ static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs)
>>   		 */
>>   		siaddr  = untagged_addr(far);
>>   	}
>> -	arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
>> +
>> +	if (!arm64_do_kernel_sea(siaddr, esr, regs, inf->sig, inf->code))
>> +		arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
>>   
>>   	return 0;
>>   }
>> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
>> index 546179418ffa..884661b29c17 100644
>> --- a/include/linux/uaccess.h
>> +++ b/include/linux/uaccess.h
>> @@ -174,6 +174,15 @@ copy_mc_to_kernel(void *dst, const void *src, size_t cnt)
>>   }
>>   #endif
>>   
>> +#ifndef copy_mc_to_user
>> +static inline unsigned long __must_check
>> +copy_mc_to_user(void *dst, const void *src, size_t cnt)
>> +{
>> +	check_object_size(src, cnt, true);
>> +	return raw_copy_to_user(dst, src, cnt);
>> +}
>> +#endif
> 
> Why do we need a special copy_mc_to_user() ?
> 
> Why are we not making *every* true uaccess recoverable? That way the
> regular copy_to_user() would just work.

Agreed, will fix in the next version.

Thanks,
Tong.

> 
> Thanks,
> Mark.
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
  2022-05-13 15:31     ` Mark Rutland
  (?)
@ 2022-05-19  6:53       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-19  6:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/13 23:31, Mark Rutland wrote:
> On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
>> Add machine check safe support to copy_{to, from}_user().
>>
>> If the copy fails due to a hardware memory error, only the relevant process is
>> affected, so killing the user process and isolating the user page with
>> hardware memory errors is a more reasonable choice than a kernel panic.
>>
>> Add a new extable type, EX_TYPE_UACCESS_MC, which can be used for uaccesses
>> that can be recovered from hardware memory errors.
> 
> I don't understand why we need this.
> 
> If we apply EX_TYPE_UACCESS consistently to *all* user accesses, and
> *only* to user accesses, that would *always* indicate that we can
> recover, and that seems much simpler to deal with.
> 
> Today we use EX_TYPE_UACCESS_ERR_ZERO for kernel accesses in a couple of
> cases, which we should clean up, and we use EX_TYPE_FIXUP for a couple
> of user accesses, but those could easily be converted over.
> 
>> The x16 register is used to save the fixup type in copy_xxx_user, which
>> uses extable type EX_TYPE_UACCESS_MC.

This is discussed in patch 3/7.

> 
> Why x16?
> 
> How is this intended to be consumed, and why is that behaviour different
> from any *other* fault?
> 
> Mark.

This is to distinguish it from EX_TYPE_FIXUP: if that exception is triggered,
the fixup path needs to retry the copy byte by byte, but if the exception is
triggered by a machine check, the data does not need to be copied again.

So we need somewhere to store the exception type; therefore x16, which is not
currently used in copy_from/to_user, is selected.

Would it be better to use exception_table_entry->data to pass the register
that needs to be set?

Thanks,
Tong.
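
One possible shape for that, sketched here only to make the idea concrete:
the field name EX_DATA_REG_MC and the FIELD_GET-based decode are assumptions,
loosely following the existing EX_DATA_REG_ERR/EX_DATA_REG_ZERO pattern.

	/* Sketch only; needs <linux/bitfield.h> for FIELD_GET(). */
	static bool ex_handler_uaccess_mc(const struct exception_table_entry *ex,
					  struct pt_regs *regs,
					  unsigned long fixup_type)
	{
		int reg = FIELD_GET(EX_DATA_REG_MC, ex->data);	/* assumed field */

		/* the asm side is expected to encode a real GPR here */
		regs->regs[reg] = fixup_type;
		regs->pc = get_ex_fixup(ex);
		return true;
	}

The assembler macros would then encode the chosen register in the entry's
data word instead of hardcoding x16.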


> 
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
>>   arch/arm64/include/asm/asm-uaccess.h | 15 ++++++++++-----
>>   arch/arm64/lib/copy_from_user.S      | 18 +++++++++++-------
>>   arch/arm64/lib/copy_to_user.S        | 18 +++++++++++-------
>>   arch/arm64/mm/extable.c              | 18 ++++++++++++++----
>>   5 files changed, 60 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
>> index c39f2437e08e..75b2c00e9523 100644
>> --- a/arch/arm64/include/asm/asm-extable.h
>> +++ b/arch/arm64/include/asm/asm-extable.h
>> @@ -2,12 +2,18 @@
>>   #ifndef __ASM_ASM_EXTABLE_H
>>   #define __ASM_ASM_EXTABLE_H
>>   
>> +#define FIXUP_TYPE_NORMAL		0
>> +#define FIXUP_TYPE_MC			1
>> +
>>   #define EX_TYPE_NONE			0
>>   #define EX_TYPE_FIXUP			1
>>   #define EX_TYPE_BPF			2
>>   #define EX_TYPE_UACCESS_ERR_ZERO	3
>>   #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD	4
>>   
>> +/* _MC indicates that can fixup from machine check errors */
>> +#define EX_TYPE_UACCESS_MC		5
>> +
>>   #ifdef __ASSEMBLY__
>>   
>>   #define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
>> @@ -27,6 +33,14 @@
>>   	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0)
>>   	.endm
>>   
>> +/*
>> + * Create an exception table entry for `insn`, which will branch to `fixup`
>> + * when an unhandled fault(include sea fault) is taken.
>> + */
>> +	.macro          _asm_extable_uaccess_mc, insn, fixup
>> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
>> +	.endm
>> +
>>   /*
>>    * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>>    * do nothing.
>> diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
>> index 0557af834e03..6c23c138e1fc 100644
>> --- a/arch/arm64/include/asm/asm-uaccess.h
>> +++ b/arch/arm64/include/asm/asm-uaccess.h
>> @@ -63,6 +63,11 @@ alternative_else_nop_endif
>>   9999:	x;					\
>>   	_asm_extable	9999b, l
>>   
>> +
>> +#define USER_MC(l, x...)			\
>> +9999:	x;					\
>> +	_asm_extable_uaccess_mc	9999b, l
>> +
>>   /*
>>    * Generate the assembly for LDTR/STTR with exception table entries.
>>    * This is complicated as there is no post-increment or pair versions of the
>> @@ -73,8 +78,8 @@ alternative_else_nop_endif
>>   8889:		ldtr	\reg2, [\addr, #8];
>>   		add	\addr, \addr, \post_inc;
>>   
>> -		_asm_extable	8888b,\l;
>> -		_asm_extable	8889b,\l;
>> +		_asm_extable_uaccess_mc	8888b, \l;
>> +		_asm_extable_uaccess_mc	8889b, \l;
>>   	.endm
>>   
>>   	.macro user_stp l, reg1, reg2, addr, post_inc
>> @@ -82,14 +87,14 @@ alternative_else_nop_endif
>>   8889:		sttr	\reg2, [\addr, #8];
>>   		add	\addr, \addr, \post_inc;
>>   
>> -		_asm_extable	8888b,\l;
>> -		_asm_extable	8889b,\l;
>> +		_asm_extable_uaccess_mc	8888b,\l;
>> +		_asm_extable_uaccess_mc	8889b,\l;
>>   	.endm
>>   
>>   	.macro user_ldst l, inst, reg, addr, post_inc
>>   8888:		\inst		\reg, [\addr];
>>   		add		\addr, \addr, \post_inc;
>>   
>> -		_asm_extable	8888b,\l;
>> +		_asm_extable_uaccess_mc	8888b, \l;
>>   	.endm
>>   #endif
>> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
>> index 34e317907524..480cc5ac0a8d 100644
>> --- a/arch/arm64/lib/copy_from_user.S
>> +++ b/arch/arm64/lib/copy_from_user.S
>> @@ -25,7 +25,7 @@
>>   	.endm
>>   
>>   	.macro strb1 reg, ptr, val
>> -	strb \reg, [\ptr], \val
>> +	USER_MC(9998f, strb \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro ldrh1 reg, ptr, val
>> @@ -33,7 +33,7 @@
>>   	.endm
>>   
>>   	.macro strh1 reg, ptr, val
>> -	strh \reg, [\ptr], \val
>> +	USER_MC(9998f, strh \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro ldr1 reg, ptr, val
>> @@ -41,7 +41,7 @@
>>   	.endm
>>   
>>   	.macro str1 reg, ptr, val
>> -	str \reg, [\ptr], \val
>> +	USER_MC(9998f, str \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro ldp1 reg1, reg2, ptr, val
>> @@ -49,11 +49,12 @@
>>   	.endm
>>   
>>   	.macro stp1 reg1, reg2, ptr, val
>> -	stp \reg1, \reg2, [\ptr], \val
>> +	USER_MC(9998f, stp \reg1, \reg2, [\ptr], \val)
>>   	.endm
>>   
>> -end	.req	x5
>> -srcin	.req	x15
>> +end		.req	x5
>> +srcin		.req	x15
>> +fixup_type	.req	x16
>>   SYM_FUNC_START(__arch_copy_from_user)
>>   	add	end, x0, x2
>>   	mov	srcin, x1
>> @@ -62,7 +63,10 @@ SYM_FUNC_START(__arch_copy_from_user)
>>   	ret
>>   
>>   	// Exception fixups
>> -9997:	cmp	dst, dstin
>> +	// x16: fixup type written by ex_handler_uaccess_mc
>> +9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
>> +	b.eq	9998f
>> +	cmp	dst, dstin
>>   	b.ne	9998f
>>   	// Before being absolutely sure we couldn't copy anything, try harder
>>   USER(9998f, ldtrb tmp1w, [srcin])
>> diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
>> index 802231772608..021a7d27b3a4 100644
>> --- a/arch/arm64/lib/copy_to_user.S
>> +++ b/arch/arm64/lib/copy_to_user.S
>> @@ -20,7 +20,7 @@
>>    *	x0 - bytes not copied
>>    */
>>   	.macro ldrb1 reg, ptr, val
>> -	ldrb  \reg, [\ptr], \val
>> +	USER_MC(9998f, ldrb  \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro strb1 reg, ptr, val
>> @@ -28,7 +28,7 @@
>>   	.endm
>>   
>>   	.macro ldrh1 reg, ptr, val
>> -	ldrh  \reg, [\ptr], \val
>> +	USER_MC(9998f, ldrh  \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro strh1 reg, ptr, val
>> @@ -36,7 +36,7 @@
>>   	.endm
>>   
>>   	.macro ldr1 reg, ptr, val
>> -	ldr \reg, [\ptr], \val
>> +	USER_MC(9998f, ldr \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro str1 reg, ptr, val
>> @@ -44,15 +44,16 @@
>>   	.endm
>>   
>>   	.macro ldp1 reg1, reg2, ptr, val
>> -	ldp \reg1, \reg2, [\ptr], \val
>> +	USER_MC(9998f, ldp \reg1, \reg2, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro stp1 reg1, reg2, ptr, val
>>   	user_stp 9997f, \reg1, \reg2, \ptr, \val
>>   	.endm
>>   
>> -end	.req	x5
>> -srcin	.req	x15
>> +end		.req	x5
>> +srcin		.req	x15
>> +fixup_type	.req	x16
>>   SYM_FUNC_START(__arch_copy_to_user)
>>   	add	end, x0, x2
>>   	mov	srcin, x1
>> @@ -61,7 +62,10 @@ SYM_FUNC_START(__arch_copy_to_user)
>>   	ret
>>   
>>   	// Exception fixups
>> -9997:	cmp	dst, dstin
>> +	// x16: fixup type written by ex_handler_uaccess_mc
>> +9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
>> +	b.eq	9998f
>> +	cmp	dst, dstin
>>   	b.ne	9998f
>>   	// Before being absolutely sure we couldn't copy anything, try harder
>>   	ldrb	tmp1w, [srcin]
>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
>> index 4f0083a550d4..525876c3ebf4 100644
>> --- a/arch/arm64/mm/extable.c
>> +++ b/arch/arm64/mm/extable.c
>> @@ -24,6 +24,14 @@ static bool ex_handler_fixup(const struct exception_table_entry *ex,
>>   	return true;
>>   }
>>   
>> +static bool ex_handler_uaccess_type(const struct exception_table_entry *ex,
>> +			     struct pt_regs *regs,
>> +			     unsigned long fixup_type)
>> +{
>> +	regs->regs[16] = fixup_type;
>> +	return ex_handler_fixup(ex, regs);
>> +}
>> +
>>   static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
>>   					struct pt_regs *regs)
>>   {
>> @@ -75,6 +83,8 @@ bool fixup_exception(struct pt_regs *regs)
>>   	switch (ex->type) {
>>   	case EX_TYPE_FIXUP:
>>   		return ex_handler_fixup(ex, regs);
>> +	case EX_TYPE_UACCESS_MC:
>> +		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_NORMAL);
>>   	case EX_TYPE_BPF:
>>   		return ex_handler_bpf(ex, regs);
>>   	case EX_TYPE_UACCESS_ERR_ZERO:
>> @@ -94,10 +104,10 @@ bool fixup_exception_mc(struct pt_regs *regs)
>>   	if (!ex)
>>   		return false;
>>   
>> -	/*
>> -	 * This is not complete, More Machine check safe extable type can
>> -	 * be processed here.
>> -	 */
>> +	switch (ex->type) {
>> +	case EX_TYPE_UACCESS_MC:
>> +		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>> +	}
>>   
>>   	return false;
>>   }
>> -- 
>> 2.25.1
>>
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe
@ 2022-05-19  6:53       ` Tong Tiangen
  0 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-19  6:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/13 23:31, Mark Rutland wrote:
> On Wed, Apr 20, 2022 at 03:04:15AM +0000, Tong Tiangen wrote:
>> Add machine check safe support to copy_{to, from}_user().
>>
>> If the copy fails due to a hardware memory error, only the relevant process is
>> affected, so killing the user process and isolating the user page with
>> hardware memory errors is a more reasonable choice than a kernel panic.
>>
>> Add a new extable type, EX_TYPE_UACCESS_MC, which can be used for uaccesses
>> that can be recovered from hardware memory errors.
> 
> I don't understand why we need this.
> 
> If we apply EX_TYPE_UACCESS consistently to *all* user accesses, and
> *only* to user accesses, that would *always* indicate that we can
> recover, and that seems much simpler to deal with.
> 
> Today we use EX_TYPE_UACCESS_ERR_ZERO for kernel accesses in a couple of
> cases, which we should clean up, and we use EX_TYPE_FIXUP for a couple
> of user accesses, but those could easily be converted over.
> 
>> The x16 register is used to save the fixup type in copy_xxx_user, which
>> uses extable type EX_TYPE_UACCESS_MC.

This is discussed in patch 3/7.

> 
> Why x16?
> 
> How is this intended to be consumed, and why is that behaviour different
> from any *other* fault?
> 
> Mark.

This is to distinguish it from EX_TYPE_FIXUP: if that exception is triggered,
the fixup path needs to retry the copy byte by byte, but if the exception is
triggered by a machine check, the data does not need to be copied again.

So we need somewhere to store the exception type; therefore x16, which is not
currently used in copy_from/to_user, is selected.

Would it be better to use exception_table_entry->data to pass the register
that needs to be set?

Thanks,
Tong.


> 
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
>>   arch/arm64/include/asm/asm-uaccess.h | 15 ++++++++++-----
>>   arch/arm64/lib/copy_from_user.S      | 18 +++++++++++-------
>>   arch/arm64/lib/copy_to_user.S        | 18 +++++++++++-------
>>   arch/arm64/mm/extable.c              | 18 ++++++++++++++----
>>   5 files changed, 60 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
>> index c39f2437e08e..75b2c00e9523 100644
>> --- a/arch/arm64/include/asm/asm-extable.h
>> +++ b/arch/arm64/include/asm/asm-extable.h
>> @@ -2,12 +2,18 @@
>>   #ifndef __ASM_ASM_EXTABLE_H
>>   #define __ASM_ASM_EXTABLE_H
>>   
>> +#define FIXUP_TYPE_NORMAL		0
>> +#define FIXUP_TYPE_MC			1
>> +
>>   #define EX_TYPE_NONE			0
>>   #define EX_TYPE_FIXUP			1
>>   #define EX_TYPE_BPF			2
>>   #define EX_TYPE_UACCESS_ERR_ZERO	3
>>   #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD	4
>>   
>> +/* _MC indicates that can fixup from machine check errors */
>> +#define EX_TYPE_UACCESS_MC		5
>> +
>>   #ifdef __ASSEMBLY__
>>   
>>   #define __ASM_EXTABLE_RAW(insn, fixup, type, data)	\
>> @@ -27,6 +33,14 @@
>>   	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0)
>>   	.endm
>>   
>> +/*
>> + * Create an exception table entry for `insn`, which will branch to `fixup`
>> + * when an unhandled fault(include sea fault) is taken.
>> + */
>> +	.macro          _asm_extable_uaccess_mc, insn, fixup
>> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
>> +	.endm
>> +
>>   /*
>>    * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>>    * do nothing.
>> diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
>> index 0557af834e03..6c23c138e1fc 100644
>> --- a/arch/arm64/include/asm/asm-uaccess.h
>> +++ b/arch/arm64/include/asm/asm-uaccess.h
>> @@ -63,6 +63,11 @@ alternative_else_nop_endif
>>   9999:	x;					\
>>   	_asm_extable	9999b, l
>>   
>> +
>> +#define USER_MC(l, x...)			\
>> +9999:	x;					\
>> +	_asm_extable_uaccess_mc	9999b, l
>> +
>>   /*
>>    * Generate the assembly for LDTR/STTR with exception table entries.
>>    * This is complicated as there is no post-increment or pair versions of the
>> @@ -73,8 +78,8 @@ alternative_else_nop_endif
>>   8889:		ldtr	\reg2, [\addr, #8];
>>   		add	\addr, \addr, \post_inc;
>>   
>> -		_asm_extable	8888b,\l;
>> -		_asm_extable	8889b,\l;
>> +		_asm_extable_uaccess_mc	8888b, \l;
>> +		_asm_extable_uaccess_mc	8889b, \l;
>>   	.endm
>>   
>>   	.macro user_stp l, reg1, reg2, addr, post_inc
>> @@ -82,14 +87,14 @@ alternative_else_nop_endif
>>   8889:		sttr	\reg2, [\addr, #8];
>>   		add	\addr, \addr, \post_inc;
>>   
>> -		_asm_extable	8888b,\l;
>> -		_asm_extable	8889b,\l;
>> +		_asm_extable_uaccess_mc	8888b,\l;
>> +		_asm_extable_uaccess_mc	8889b,\l;
>>   	.endm
>>   
>>   	.macro user_ldst l, inst, reg, addr, post_inc
>>   8888:		\inst		\reg, [\addr];
>>   		add		\addr, \addr, \post_inc;
>>   
>> -		_asm_extable	8888b,\l;
>> +		_asm_extable_uaccess_mc	8888b, \l;
>>   	.endm
>>   #endif
>> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
>> index 34e317907524..480cc5ac0a8d 100644
>> --- a/arch/arm64/lib/copy_from_user.S
>> +++ b/arch/arm64/lib/copy_from_user.S
>> @@ -25,7 +25,7 @@
>>   	.endm
>>   
>>   	.macro strb1 reg, ptr, val
>> -	strb \reg, [\ptr], \val
>> +	USER_MC(9998f, strb \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro ldrh1 reg, ptr, val
>> @@ -33,7 +33,7 @@
>>   	.endm
>>   
>>   	.macro strh1 reg, ptr, val
>> -	strh \reg, [\ptr], \val
>> +	USER_MC(9998f, strh \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro ldr1 reg, ptr, val
>> @@ -41,7 +41,7 @@
>>   	.endm
>>   
>>   	.macro str1 reg, ptr, val
>> -	str \reg, [\ptr], \val
>> +	USER_MC(9998f, str \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro ldp1 reg1, reg2, ptr, val
>> @@ -49,11 +49,12 @@
>>   	.endm
>>   
>>   	.macro stp1 reg1, reg2, ptr, val
>> -	stp \reg1, \reg2, [\ptr], \val
>> +	USER_MC(9998f, stp \reg1, \reg2, [\ptr], \val)
>>   	.endm
>>   
>> -end	.req	x5
>> -srcin	.req	x15
>> +end		.req	x5
>> +srcin		.req	x15
>> +fixup_type	.req	x16
>>   SYM_FUNC_START(__arch_copy_from_user)
>>   	add	end, x0, x2
>>   	mov	srcin, x1
>> @@ -62,7 +63,10 @@ SYM_FUNC_START(__arch_copy_from_user)
>>   	ret
>>   
>>   	// Exception fixups
>> -9997:	cmp	dst, dstin
>> +	// x16: fixup type written by ex_handler_uaccess_mc
>> +9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
>> +	b.eq	9998f
>> +	cmp	dst, dstin
>>   	b.ne	9998f
>>   	// Before being absolutely sure we couldn't copy anything, try harder
>>   USER(9998f, ldtrb tmp1w, [srcin])
>> diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
>> index 802231772608..021a7d27b3a4 100644
>> --- a/arch/arm64/lib/copy_to_user.S
>> +++ b/arch/arm64/lib/copy_to_user.S
>> @@ -20,7 +20,7 @@
>>    *	x0 - bytes not copied
>>    */
>>   	.macro ldrb1 reg, ptr, val
>> -	ldrb  \reg, [\ptr], \val
>> +	USER_MC(9998f, ldrb  \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro strb1 reg, ptr, val
>> @@ -28,7 +28,7 @@
>>   	.endm
>>   
>>   	.macro ldrh1 reg, ptr, val
>> -	ldrh  \reg, [\ptr], \val
>> +	USER_MC(9998f, ldrh  \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro strh1 reg, ptr, val
>> @@ -36,7 +36,7 @@
>>   	.endm
>>   
>>   	.macro ldr1 reg, ptr, val
>> -	ldr \reg, [\ptr], \val
>> +	USER_MC(9998f, ldr \reg, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro str1 reg, ptr, val
>> @@ -44,15 +44,16 @@
>>   	.endm
>>   
>>   	.macro ldp1 reg1, reg2, ptr, val
>> -	ldp \reg1, \reg2, [\ptr], \val
>> +	USER_MC(9998f, ldp \reg1, \reg2, [\ptr], \val)
>>   	.endm
>>   
>>   	.macro stp1 reg1, reg2, ptr, val
>>   	user_stp 9997f, \reg1, \reg2, \ptr, \val
>>   	.endm
>>   
>> -end	.req	x5
>> -srcin	.req	x15
>> +end		.req	x5
>> +srcin		.req	x15
>> +fixup_type	.req	x16
>>   SYM_FUNC_START(__arch_copy_to_user)
>>   	add	end, x0, x2
>>   	mov	srcin, x1
>> @@ -61,7 +62,10 @@ SYM_FUNC_START(__arch_copy_to_user)
>>   	ret
>>   
>>   	// Exception fixups
>> -9997:	cmp	dst, dstin
>> +	// x16: fixup type written by ex_handler_uaccess_mc
>> +9997:	cmp 	fixup_type, #FIXUP_TYPE_MC
>> +	b.eq	9998f
>> +	cmp	dst, dstin
>>   	b.ne	9998f
>>   	// Before being absolutely sure we couldn't copy anything, try harder
>>   	ldrb	tmp1w, [srcin]
>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
>> index 4f0083a550d4..525876c3ebf4 100644
>> --- a/arch/arm64/mm/extable.c
>> +++ b/arch/arm64/mm/extable.c
>> @@ -24,6 +24,14 @@ static bool ex_handler_fixup(const struct exception_table_entry *ex,
>>   	return true;
>>   }
>>   
>> +static bool ex_handler_uaccess_type(const struct exception_table_entry *ex,
>> +			     struct pt_regs *regs,
>> +			     unsigned long fixup_type)
>> +{
>> +	regs->regs[16] = fixup_type;
>> +	return ex_handler_fixup(ex, regs);
>> +}
>> +
>>   static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
>>   					struct pt_regs *regs)
>>   {
>> @@ -75,6 +83,8 @@ bool fixup_exception(struct pt_regs *regs)
>>   	switch (ex->type) {
>>   	case EX_TYPE_FIXUP:
>>   		return ex_handler_fixup(ex, regs);
>> +	case EX_TYPE_UACCESS_MC:
>> +		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_NORMAL);
>>   	case EX_TYPE_BPF:
>>   		return ex_handler_bpf(ex, regs);
>>   	case EX_TYPE_UACCESS_ERR_ZERO:
>> @@ -94,10 +104,10 @@ bool fixup_exception_mc(struct pt_regs *regs)
>>   	if (!ex)
>>   		return false;
>>   
>> -	/*
>> -	 * This is not complete, More Machine check safe extable type can
>> -	 * be processed here.
>> -	 */
>> +	switch (ex->type) {
>> +	case EX_TYPE_UACCESS_MC:
>> +		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>> +	}
>>   
>>   	return false;
>>   }
>> -- 
>> 2.25.1
>>
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 6/7] arm64: add {get, put}_user to machine check safe
  2022-05-13 15:39     ` Mark Rutland
  (?)
@ 2022-05-19  7:09       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-19  7:09 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/13 23:39, Mark Rutland wrote:
> On Wed, Apr 20, 2022 at 03:04:17AM +0000, Tong Tiangen wrote:
>> Add {get, put}_user() to machine check safe.
>>
>> If get/put fail due to hardware memory error, only the relevant processes
>> are affected, so killing the user process and isolate the user page with
>> hardware memory errors is a more reasonable choice than kernel panic.
>>
>> Add new extable type EX_TYPE_UACCESS_MC_ERR_ZERO which can be used for
>> uaccess that can be recovered from hardware memory errors. The difference
>> from EX_TYPE_UACCESS_MC is that this type also sets additional two target
>> register which save error code and value needs to be set zero.
> 
> Why does this need to be in any way distinct from the existing
> EX_TYPE_UACCESS_ERR_ZERO ?
> 
> Other than the case where we currently (ab)use that for
> copy_{to,from}_kernel_nofault(), where do we *not* want to use
> EX_TYPE_UACCESS_ERR_ZERO and *not* recover from a memory error?
> 
> Thanks,
> Mark.

There are some cases (futex/__user_cache_maint()/__user_swpX_asm())
using EX_TYPE_UACCESS_ERR_ZERO, and for those it is not yet decided
whether they should be recoverable; let's discuss that in patch 3/7.
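
For reference, the alternative you describe would roughly reduce the
machine check path to the sketch below (assuming EX_TYPE_UACCESS_ERR_ZERO
is by then only applied to real user accesses; this is not what the
series currently does):

bool fixup_exception_mc(struct pt_regs *regs)
{
	const struct exception_table_entry *ex;

	ex = search_exception_tables(instruction_pointer(regs));
	if (!ex)
		return false;

	switch (ex->type) {
	case EX_TYPE_UACCESS_ERR_ZERO:
		/* A real uaccess: write -EFAULT/0 and branch to the fixup. */
		return ex_handler_uaccess_err_zero(ex, regs);
	}

	return false;
}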

Thanks,
Tong.

> 
>>
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/arm64/include/asm/asm-extable.h | 14 ++++++++++++++
>>   arch/arm64/include/asm/uaccess.h     |  4 ++--
>>   arch/arm64/mm/extable.c              |  4 ++++
>>   3 files changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
>> index 75b2c00e9523..80410899a9ad 100644
>> --- a/arch/arm64/include/asm/asm-extable.h
>> +++ b/arch/arm64/include/asm/asm-extable.h
>> @@ -13,6 +13,7 @@
>>   
>>   /* _MC indicates that can fixup from machine check errors */
>>   #define EX_TYPE_UACCESS_MC		5
>> +#define EX_TYPE_UACCESS_MC_ERR_ZERO	6
>>   
>>   #ifdef __ASSEMBLY__
>>   
>> @@ -78,6 +79,15 @@
>>   #define EX_DATA_REG(reg, gpr)						\
>>   	"((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")"
>>   
>> +#define _ASM_EXTABLE_UACCESS_MC_ERR_ZERO(insn, fixup, err, zero)		\
>> +	__DEFINE_ASM_GPR_NUMS							\
>> +	__ASM_EXTABLE_RAW(#insn, #fixup,					\
>> +			  __stringify(EX_TYPE_UACCESS_MC_ERR_ZERO),		\
>> +			  "("							\
>> +			    EX_DATA_REG(ERR, err) " | "				\
>> +			    EX_DATA_REG(ZERO, zero)				\
>> +			  ")")
>> +
>>   #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero)		\
>>   	__DEFINE_ASM_GPR_NUMS						\
>>   	__ASM_EXTABLE_RAW(#insn, #fixup, 				\
>> @@ -90,6 +100,10 @@
>>   #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err)			\
>>   	_ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr)
>>   
>> +
>> +#define _ASM_EXTABLE_UACCESS_MC_ERR(insn, fixup, err)			\
>> +	_ASM_EXTABLE_UACCESS_MC_ERR_ZERO(insn, fixup, err, wzr)
>> +
>>   #define EX_DATA_REG_DATA_SHIFT	0
>>   #define EX_DATA_REG_DATA	GENMASK(4, 0)
>>   #define EX_DATA_REG_ADDR_SHIFT	5
>> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
>> index e8dce0cc5eaa..e41b47df48b0 100644
>> --- a/arch/arm64/include/asm/uaccess.h
>> +++ b/arch/arm64/include/asm/uaccess.h
>> @@ -236,7 +236,7 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
>>   	asm volatile(							\
>>   	"1:	" load "	" reg "1, [%2]\n"			\
>>   	"2:\n"								\
>> -	_ASM_EXTABLE_UACCESS_ERR_ZERO(1b, 2b, %w0, %w1)			\
>> +	_ASM_EXTABLE_UACCESS_MC_ERR_ZERO(1b, 2b, %w0, %w1)		\
>>   	: "+r" (err), "=&r" (x)						\
>>   	: "r" (addr))
>>   
>> @@ -325,7 +325,7 @@ do {									\
>>   	asm volatile(							\
>>   	"1:	" store "	" reg "1, [%2]\n"			\
>>   	"2:\n"								\
>> -	_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)				\
>> +	_ASM_EXTABLE_UACCESS_MC_ERR(1b, 2b, %w0)			\
>>   	: "+r" (err)							\
>>   	: "r" (x), "r" (addr))
>>   
>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
>> index 525876c3ebf4..1023ccdb2f89 100644
>> --- a/arch/arm64/mm/extable.c
>> +++ b/arch/arm64/mm/extable.c
>> @@ -88,6 +88,7 @@ bool fixup_exception(struct pt_regs *regs)
>>   	case EX_TYPE_BPF:
>>   		return ex_handler_bpf(ex, regs);
>>   	case EX_TYPE_UACCESS_ERR_ZERO:
>> +	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>>   		return ex_handler_uaccess_err_zero(ex, regs);
>>   	case EX_TYPE_LOAD_UNALIGNED_ZEROPAD:
>>   		return ex_handler_load_unaligned_zeropad(ex, regs);
>> @@ -107,6 +108,9 @@ bool fixup_exception_mc(struct pt_regs *regs)
>>   	switch (ex->type) {
>>   	case EX_TYPE_UACCESS_MC:
>>   		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>> +	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>> +		return ex_handler_uaccess_err_zero(ex, regs);
>> +
>>   	}
>>   
>>   	return false;
>> -- 
>> 2.25.1
>>
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 7/7] arm64: add cow to machine check safe
  2022-05-13 15:44     ` Mark Rutland
  (?)
@ 2022-05-19 10:38       ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-19 10:38 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/13 23:44, Mark Rutland wrote:
> On Wed, Apr 20, 2022 at 03:04:18AM +0000, Tong Tiangen wrote:
>> In the cow(copy on write) processing, the data of the user process is
>> copied, when hardware memory error is encountered during copy, only the
>> relevant processes are affected, so killing the user process and isolate
>> the user page with hardware memory errors is a more reasonable choice than
>> kernel panic.
> 
> There are plenty of other places we'll access user pages via a kernel
> alias (e.g. when performing IO), so why is this special?
> 
> To be clear, I am not entirely averse to this, but it seems like this is
> being done because it's easy to do rather than necessarily being all
> that useful, and I'm not keen on having to duplicate a bunch of logic
> for this.

Yeah, there are lots of cases; COW is selected because it is the most
general one. In addition, this patch provides a machine-check-safe page
copy (copy_highpage_mc), so other valuable cases can be converted on
top of it step by step[1].

[1]https://lore.kernel.org/all/20220429000947.2172219-1-jiaqiyan@google.com/T/

Thanks,
Tong.

> 
>> Add new helper copy_page_mc() which provide a page copy implementation with
>> machine check safe. At present, only used in cow. In future, we can expand
>> more scenes. As long as the consequences of page copy failure are not
>> fatal(eg: only affect user process), we can use this helper.
>>
>> The copy_page_mc() in copy_page_mc.S is largely borrows from copy_page()
>> in copy_page.S and the main difference is copy_page_mc() add extable entry
>> to every load/store insn to support machine check safe. largely to keep the
>> patch simple. If needed those optimizations can be folded in.
>>
>> Add new extable type EX_TYPE_COPY_PAGE_MC which used in copy_page_mc().
>>
>> This type only be processed in fixup_exception_mc(), The reason is that
>> copy_page_mc() is consistent with copy_page() except machine check safe is
>> considered, and copy_page() do not need to consider exception fixup.
>>
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/arm64/include/asm/asm-extable.h |  5 ++
>>   arch/arm64/include/asm/page.h        | 10 ++++
>>   arch/arm64/lib/Makefile              |  2 +
>>   arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
>>   arch/arm64/mm/copypage.c             | 36 ++++++++++--
>>   arch/arm64/mm/extable.c              |  2 +
>>   include/linux/highmem.h              |  8 +++
>>   mm/memory.c                          |  2 +-
>>   8 files changed, 144 insertions(+), 7 deletions(-)
>>   create mode 100644 arch/arm64/lib/copy_page_mc.S
>>
>> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
>> index 80410899a9ad..74c056ddae15 100644
>> --- a/arch/arm64/include/asm/asm-extable.h
>> +++ b/arch/arm64/include/asm/asm-extable.h
>> @@ -14,6 +14,7 @@
>>   /* _MC indicates that can fixup from machine check errors */
>>   #define EX_TYPE_UACCESS_MC		5
>>   #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
>> +#define EX_TYPE_COPY_PAGE_MC		7
>>   
>>   #ifdef __ASSEMBLY__
>>   
>> @@ -42,6 +43,10 @@
>>   	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
>>   	.endm
>>   
>> +	.macro          _asm_extable_copy_page_mc, insn, fixup
>> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
>> +	.endm
>> +
>>   /*
>>    * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>>    * do nothing.
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 993a27ea6f54..832571a7dddb 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
>>   void copy_highpage(struct page *to, struct page *from);
>>   #define __HAVE_ARCH_COPY_HIGHPAGE
>>   
>> +#ifdef CONFIG_ARCH_HAS_COPY_MC
>> +extern void copy_page_mc(void *to, const void *from);
>> +void copy_highpage_mc(struct page *to, struct page *from);
>> +#define __HAVE_ARCH_COPY_HIGHPAGE_MC
>> +
>> +void copy_user_highpage_mc(struct page *to, struct page *from,
>> +		unsigned long vaddr, struct vm_area_struct *vma);
>> +#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
>> +#endif
>> +
>>   struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>>   						unsigned long vaddr);
>>   #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
>> diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
>> index 29490be2546b..0d9f292ef68a 100644
>> --- a/arch/arm64/lib/Makefile
>> +++ b/arch/arm64/lib/Makefile
>> @@ -15,6 +15,8 @@ endif
>>   
>>   lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
>>   
>> +lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
>> +
>>   obj-$(CONFIG_CRC32) += crc32.o
>>   
>>   obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
>> diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
>> new file mode 100644
>> index 000000000000..655161363dcf
>> --- /dev/null
>> +++ b/arch/arm64/lib/copy_page_mc.S
>> @@ -0,0 +1,86 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (C) 2012 ARM Ltd.
>> + */
>> +
>> +#include <linux/linkage.h>
>> +#include <linux/const.h>
>> +#include <asm/assembler.h>
>> +#include <asm/page.h>
>> +#include <asm/cpufeature.h>
>> +#include <asm/alternative.h>
>> +#include <asm/asm-extable.h>
>> +
>> +#define CPY_MC(l, x...)		\
>> +9999:   x;			\
>> +	_asm_extable_copy_page_mc    9999b, l
>> +
>> +/*
>> + * Copy a page from src to dest (both are page aligned) with machine check
>> + *
>> + * Parameters:
>> + *	x0 - dest
>> + *	x1 - src
>> + */
>> +SYM_FUNC_START(__pi_copy_page_mc)
>> +alternative_if ARM64_HAS_NO_HW_PREFETCH
>> +	// Prefetch three cache lines ahead.
>> +	prfm	pldl1strm, [x1, #128]
>> +	prfm	pldl1strm, [x1, #256]
>> +	prfm	pldl1strm, [x1, #384]
>> +alternative_else_nop_endif
>> +
>> +CPY_MC(9998f, ldp	x2, x3, [x1])
>> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
>> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
>> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
>> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
>> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
>> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
>> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
>> +
>> +	add	x0, x0, #256
>> +	add	x1, x1, #128
>> +1:
>> +	tst	x0, #(PAGE_SIZE - 1)
>> +
>> +alternative_if ARM64_HAS_NO_HW_PREFETCH
>> +	prfm	pldl1strm, [x1, #384]
>> +alternative_else_nop_endif
>> +
>> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
>> +CPY_MC(9998f, ldp	x2, x3, [x1])
>> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
>> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
>> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
>> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
>> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
>> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
>> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
>> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
>> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
>> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
>> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
>> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
>> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
>> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
>> +
>> +	add	x0, x0, #128
>> +	add	x1, x1, #128
>> +
>> +	b.ne	1b
>> +
>> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
>> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
>> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
>> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
>> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
>> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
>> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
>> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
>> +
>> +9998:	ret
>> +
>> +SYM_FUNC_END(__pi_copy_page_mc)
>> +SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
>> +EXPORT_SYMBOL(copy_page_mc)
>> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
>> index 0dea80bf6de4..0f28edfcb234 100644
>> --- a/arch/arm64/mm/copypage.c
>> +++ b/arch/arm64/mm/copypage.c
>> @@ -14,13 +14,8 @@
>>   #include <asm/cpufeature.h>
>>   #include <asm/mte.h>
>>   
>> -void copy_highpage(struct page *to, struct page *from)
>> +static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
>>   {
>> -	void *kto = page_address(to);
>> -	void *kfrom = page_address(from);
>> -
>> -	copy_page(kto, kfrom);
>> -
>>   	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
>>   		set_bit(PG_mte_tagged, &to->flags);
>>   		page_kasan_tag_reset(to);
>> @@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
>>   		mte_copy_page_tags(kto, kfrom);
>>   	}
>>   }
>> +
>> +void copy_highpage(struct page *to, struct page *from)
>> +{
>> +	void *kto = page_address(to);
>> +	void *kfrom = page_address(from);
>> +
>> +	copy_page(kto, kfrom);
>> +	do_mte(to, from, kto, kfrom);
>> +}
>>   EXPORT_SYMBOL(copy_highpage);
>>   
>>   void copy_user_highpage(struct page *to, struct page *from,
>> @@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
>>   	flush_dcache_page(to);
>>   }
>>   EXPORT_SYMBOL_GPL(copy_user_highpage);
>> +
>> +#ifdef CONFIG_ARCH_HAS_COPY_MC
>> +void copy_highpage_mc(struct page *to, struct page *from)
>> +{
>> +	void *kto = page_address(to);
>> +	void *kfrom = page_address(from);
>> +
>> +	copy_page_mc(kto, kfrom);
>> +	do_mte(to, from, kto, kfrom);
>> +}
>> +EXPORT_SYMBOL(copy_highpage_mc);
> 
> IIUC the do_mte() portion won't handle mermoy errors, so this isn't
> actually going to recover safely.
> 
> Thanks,
> Mark.

OK, I missed that; do_mte() needs to be handled as well.
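
A rough sketch of what that could look like — mte_copy_page_tags_mc()
below is only an assumed helper; its tag-copy loop would need the same
kind of machine-check-safe extable entries as copy_page_mc():

static void do_mte_mc(struct page *to, struct page *from,
		      void *kto, void *kfrom)
{
	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
		set_bit(PG_mte_tagged, &to->flags);
		page_kasan_tag_reset(to);
		/*
		 * Same ordering requirement as in copy_highpage(): the flag
		 * update must be visible before the tags are copied.
		 */
		smp_wmb();
		mte_copy_page_tags_mc(kto, kfrom);	/* assumed helper */
	}
}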

Thanks,
Tong.

> 
>> +
>> +void copy_user_highpage_mc(struct page *to, struct page *from,
>> +			unsigned long vaddr, struct vm_area_struct *vma)
>> +{
>> +	copy_highpage_mc(to, from);
>> +	flush_dcache_page(to);
>> +}
>> +EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
>> +#endif
>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
>> index 1023ccdb2f89..4c882d36dd64 100644
>> --- a/arch/arm64/mm/extable.c
>> +++ b/arch/arm64/mm/extable.c
>> @@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
>>   		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>>   	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>>   		return ex_handler_uaccess_err_zero(ex, regs);
>> +	case EX_TYPE_COPY_PAGE_MC:
>> +		return ex_handler_fixup(ex, regs);
>>   
>>   	}
>>   
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 39bb9b47fa9c..a9dbf331b038 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
>>   
>>   #endif
>>   
>> +#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
>> +#define copy_user_highpage_mc copy_user_highpage
>> +#endif
>> +
>>   #ifndef __HAVE_ARCH_COPY_HIGHPAGE
>>   
>>   static inline void copy_highpage(struct page *to, struct page *from)
>> @@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
>>   
>>   #endif
>>   
>> +#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
>> +#define cop_highpage_mc copy_highpage
>> +#endif
>> +
>>   static inline void memcpy_page(struct page *dst_page, size_t dst_off,
>>   			       struct page *src_page, size_t src_off,
>>   			       size_t len)
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 76e3af9639d9..d5f62234152d 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
>>   	unsigned long addr = vmf->address;
>>   
>>   	if (likely(src)) {
>> -		copy_user_highpage(dst, src, addr, vma);
>> +		copy_user_highpage_mc(dst, src, addr, vma);
>>   		return true;
>>   	}
>>   
>> -- 
>> 2.25.1
>>
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 7/7] arm64: add cow to machine check safe
@ 2022-05-19 10:38       ` Tong Tiangen
  0 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-19 10:38 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



On 2022/5/13 23:44, Mark Rutland wrote:
> On Wed, Apr 20, 2022 at 03:04:18AM +0000, Tong Tiangen wrote:
>> In the cow(copy on write) processing, the data of the user process is
>> copied, when hardware memory error is encountered during copy, only the
>> relevant processes are affected, so killing the user process and isolate
>> the user page with hardware memory errors is a more reasonable choice than
>> kernel panic.
> 
> There are plenty of other places we'll access user pages via a kernel
> alias (e.g. when performing IO), so why is this special?
> 
> To be clear, I am not entirely averse to this, but it seems like this is
> being done because it's easy to do rather than necessarily being all
> that useful, and I'm not keen on having to duplicate a bunch of logic
> for this.

Yeah, there are lots of cases; COW is selected because it is the most
general one. In addition, this patch provides a machine-check-safe page
copy (copy_highpage_mc), so other valuable cases can be converted on
top of it step by step[1].

[1]https://lore.kernel.org/all/20220429000947.2172219-1-jiaqiyan@google.com/T/

Thanks,
Tong.

> 
>> Add new helper copy_page_mc() which provide a page copy implementation with
>> machine check safe. At present, only used in cow. In future, we can expand
>> more scenes. As long as the consequences of page copy failure are not
>> fatal(eg: only affect user process), we can use this helper.
>>
>> The copy_page_mc() in copy_page_mc.S is largely borrows from copy_page()
>> in copy_page.S and the main difference is copy_page_mc() add extable entry
>> to every load/store insn to support machine check safe. largely to keep the
>> patch simple. If needed those optimizations can be folded in.
>>
>> Add new extable type EX_TYPE_COPY_PAGE_MC which used in copy_page_mc().
>>
>> This type only be processed in fixup_exception_mc(), The reason is that
>> copy_page_mc() is consistent with copy_page() except machine check safe is
>> considered, and copy_page() do not need to consider exception fixup.
>>
>> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
>> ---
>>   arch/arm64/include/asm/asm-extable.h |  5 ++
>>   arch/arm64/include/asm/page.h        | 10 ++++
>>   arch/arm64/lib/Makefile              |  2 +
>>   arch/arm64/lib/copy_page_mc.S        | 86 ++++++++++++++++++++++++++++
>>   arch/arm64/mm/copypage.c             | 36 ++++++++++--
>>   arch/arm64/mm/extable.c              |  2 +
>>   include/linux/highmem.h              |  8 +++
>>   mm/memory.c                          |  2 +-
>>   8 files changed, 144 insertions(+), 7 deletions(-)
>>   create mode 100644 arch/arm64/lib/copy_page_mc.S
>>
>> diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
>> index 80410899a9ad..74c056ddae15 100644
>> --- a/arch/arm64/include/asm/asm-extable.h
>> +++ b/arch/arm64/include/asm/asm-extable.h
>> @@ -14,6 +14,7 @@
>>   /* _MC indicates that can fixup from machine check errors */
>>   #define EX_TYPE_UACCESS_MC		5
>>   #define EX_TYPE_UACCESS_MC_ERR_ZERO	6
>> +#define EX_TYPE_COPY_PAGE_MC		7
>>   
>>   #ifdef __ASSEMBLY__
>>   
>> @@ -42,6 +43,10 @@
>>   	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_UACCESS_MC, 0)
>>   	.endm
>>   
>> +	.macro          _asm_extable_copy_page_mc, insn, fixup
>> +	__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_COPY_PAGE_MC, 0)
>> +	.endm
>> +
>>   /*
>>    * Create an exception table entry for `insn` if `fixup` is provided. Otherwise
>>    * do nothing.
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 993a27ea6f54..832571a7dddb 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -29,6 +29,16 @@ void copy_user_highpage(struct page *to, struct page *from,
>>   void copy_highpage(struct page *to, struct page *from);
>>   #define __HAVE_ARCH_COPY_HIGHPAGE
>>   
>> +#ifdef CONFIG_ARCH_HAS_COPY_MC
>> +extern void copy_page_mc(void *to, const void *from);
>> +void copy_highpage_mc(struct page *to, struct page *from);
>> +#define __HAVE_ARCH_COPY_HIGHPAGE_MC
>> +
>> +void copy_user_highpage_mc(struct page *to, struct page *from,
>> +		unsigned long vaddr, struct vm_area_struct *vma);
>> +#define __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
>> +#endif
>> +
>>   struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>>   						unsigned long vaddr);
>>   #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
>> diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
>> index 29490be2546b..0d9f292ef68a 100644
>> --- a/arch/arm64/lib/Makefile
>> +++ b/arch/arm64/lib/Makefile
>> @@ -15,6 +15,8 @@ endif
>>   
>>   lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
>>   
>> +lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_page_mc.o
>> +
>>   obj-$(CONFIG_CRC32) += crc32.o
>>   
>>   obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
>> diff --git a/arch/arm64/lib/copy_page_mc.S b/arch/arm64/lib/copy_page_mc.S
>> new file mode 100644
>> index 000000000000..655161363dcf
>> --- /dev/null
>> +++ b/arch/arm64/lib/copy_page_mc.S
>> @@ -0,0 +1,86 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (C) 2012 ARM Ltd.
>> + */
>> +
>> +#include <linux/linkage.h>
>> +#include <linux/const.h>
>> +#include <asm/assembler.h>
>> +#include <asm/page.h>
>> +#include <asm/cpufeature.h>
>> +#include <asm/alternative.h>
>> +#include <asm/asm-extable.h>
>> +
>> +#define CPY_MC(l, x...)		\
>> +9999:   x;			\
>> +	_asm_extable_copy_page_mc    9999b, l
>> +
>> +/*
>> + * Copy a page from src to dest (both are page aligned) with machine check
>> + *
>> + * Parameters:
>> + *	x0 - dest
>> + *	x1 - src
>> + */
>> +SYM_FUNC_START(__pi_copy_page_mc)
>> +alternative_if ARM64_HAS_NO_HW_PREFETCH
>> +	// Prefetch three cache lines ahead.
>> +	prfm	pldl1strm, [x1, #128]
>> +	prfm	pldl1strm, [x1, #256]
>> +	prfm	pldl1strm, [x1, #384]
>> +alternative_else_nop_endif
>> +
>> +CPY_MC(9998f, ldp	x2, x3, [x1])
>> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
>> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
>> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
>> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
>> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
>> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
>> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
>> +
>> +	add	x0, x0, #256
>> +	add	x1, x1, #128
>> +1:
>> +	tst	x0, #(PAGE_SIZE - 1)
>> +
>> +alternative_if ARM64_HAS_NO_HW_PREFETCH
>> +	prfm	pldl1strm, [x1, #384]
>> +alternative_else_nop_endif
>> +
>> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
>> +CPY_MC(9998f, ldp	x2, x3, [x1])
>> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
>> +CPY_MC(9998f, ldp	x4, x5, [x1, #16])
>> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
>> +CPY_MC(9998f, ldp	x6, x7, [x1, #32])
>> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
>> +CPY_MC(9998f, ldp	x8, x9, [x1, #48])
>> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
>> +CPY_MC(9998f, ldp	x10, x11, [x1, #64])
>> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
>> +CPY_MC(9998f, ldp	x12, x13, [x1, #80])
>> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
>> +CPY_MC(9998f, ldp	x14, x15, [x1, #96])
>> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
>> +CPY_MC(9998f, ldp	x16, x17, [x1, #112])
>> +
>> +	add	x0, x0, #128
>> +	add	x1, x1, #128
>> +
>> +	b.ne	1b
>> +
>> +CPY_MC(9998f, stnp	x2, x3, [x0, #-256])
>> +CPY_MC(9998f, stnp	x4, x5, [x0, #16 - 256])
>> +CPY_MC(9998f, stnp	x6, x7, [x0, #32 - 256])
>> +CPY_MC(9998f, stnp	x8, x9, [x0, #48 - 256])
>> +CPY_MC(9998f, stnp	x10, x11, [x0, #64 - 256])
>> +CPY_MC(9998f, stnp	x12, x13, [x0, #80 - 256])
>> +CPY_MC(9998f, stnp	x14, x15, [x0, #96 - 256])
>> +CPY_MC(9998f, stnp	x16, x17, [x0, #112 - 256])
>> +
>> +9998:	ret
>> +
>> +SYM_FUNC_END(__pi_copy_page_mc)
>> +SYM_FUNC_ALIAS(copy_page_mc, __pi_copy_page_mc)
>> +EXPORT_SYMBOL(copy_page_mc)
>> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
>> index 0dea80bf6de4..0f28edfcb234 100644
>> --- a/arch/arm64/mm/copypage.c
>> +++ b/arch/arm64/mm/copypage.c
>> @@ -14,13 +14,8 @@
>>   #include <asm/cpufeature.h>
>>   #include <asm/mte.h>
>>   
>> -void copy_highpage(struct page *to, struct page *from)
>> +static void do_mte(struct page *to, struct page *from, void *kto, void *kfrom)
>>   {
>> -	void *kto = page_address(to);
>> -	void *kfrom = page_address(from);
>> -
>> -	copy_page(kto, kfrom);
>> -
>>   	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
>>   		set_bit(PG_mte_tagged, &to->flags);
>>   		page_kasan_tag_reset(to);
>> @@ -35,6 +30,15 @@ void copy_highpage(struct page *to, struct page *from)
>>   		mte_copy_page_tags(kto, kfrom);
>>   	}
>>   }
>> +
>> +void copy_highpage(struct page *to, struct page *from)
>> +{
>> +	void *kto = page_address(to);
>> +	void *kfrom = page_address(from);
>> +
>> +	copy_page(kto, kfrom);
>> +	do_mte(to, from, kto, kfrom);
>> +}
>>   EXPORT_SYMBOL(copy_highpage);
>>   
>>   void copy_user_highpage(struct page *to, struct page *from,
>> @@ -44,3 +48,23 @@ void copy_user_highpage(struct page *to, struct page *from,
>>   	flush_dcache_page(to);
>>   }
>>   EXPORT_SYMBOL_GPL(copy_user_highpage);
>> +
>> +#ifdef CONFIG_ARCH_HAS_COPY_MC
>> +void copy_highpage_mc(struct page *to, struct page *from)
>> +{
>> +	void *kto = page_address(to);
>> +	void *kfrom = page_address(from);
>> +
>> +	copy_page_mc(kto, kfrom);
>> +	do_mte(to, from, kto, kfrom);
>> +}
>> +EXPORT_SYMBOL(copy_highpage_mc);
> 
> IIUC the do_mte() portion won't handle memory errors, so this isn't
> actually going to recover safely.
> 
> Thanks,
> Mark.

OK, I missed that; do_mte() needs to be handled as well.
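
A rough sketch of the direction (mte_copy_page_tags_mc() below is a
hypothetical helper that would need its own extable fixups and a return
value; it does not exist today):

static int do_mte_mc(struct page *to, struct page *from, void *kto, void *kfrom)
{
	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
		set_bit(PG_mte_tagged, &to->flags);
		page_kasan_tag_reset(to);
		/* barrier as in copy_highpage(): flags must be visible before tags */
		smp_wmb();
		/* hypothetical MC-safe tag copy, non-zero on a consumed memory error */
		if (mte_copy_page_tags_mc(kto, kfrom))
			return -EFAULT;
	}

	return 0;
}

copy_highpage_mc() would then have to propagate that error to its callers
instead of returning void.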

Thanks,
Tong.

> 
>> +
>> +void copy_user_highpage_mc(struct page *to, struct page *from,
>> +			unsigned long vaddr, struct vm_area_struct *vma)
>> +{
>> +	copy_highpage_mc(to, from);
>> +	flush_dcache_page(to);
>> +}
>> +EXPORT_SYMBOL_GPL(copy_user_highpage_mc);
>> +#endif
>> diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
>> index 1023ccdb2f89..4c882d36dd64 100644
>> --- a/arch/arm64/mm/extable.c
>> +++ b/arch/arm64/mm/extable.c
>> @@ -110,6 +110,8 @@ bool fixup_exception_mc(struct pt_regs *regs)
>>   		return ex_handler_uaccess_type(ex, regs, FIXUP_TYPE_MC);
>>   	case EX_TYPE_UACCESS_MC_ERR_ZERO:
>>   		return ex_handler_uaccess_err_zero(ex, regs);
>> +	case EX_TYPE_COPY_PAGE_MC:
>> +		return ex_handler_fixup(ex, regs);
>>   
>>   	}
>>   
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 39bb9b47fa9c..a9dbf331b038 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -283,6 +283,10 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
>>   
>>   #endif
>>   
>> +#ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE_MC
>> +#define copy_user_highpage_mc copy_user_highpage
>> +#endif
>> +
>>   #ifndef __HAVE_ARCH_COPY_HIGHPAGE
>>   
>>   static inline void copy_highpage(struct page *to, struct page *from)
>> @@ -298,6 +302,10 @@ static inline void copy_highpage(struct page *to, struct page *from)
>>   
>>   #endif
>>   
>> +#ifndef __HAVE_ARCH_COPY_HIGHPAGE_MC
>> +#define copy_highpage_mc copy_highpage
>> +#endif
>> +
>>   static inline void memcpy_page(struct page *dst_page, size_t dst_off,
>>   			       struct page *src_page, size_t src_off,
>>   			       size_t len)
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 76e3af9639d9..d5f62234152d 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2767,7 +2767,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
>>   	unsigned long addr = vmf->address;
>>   
>>   	if (likely(src)) {
>> -		copy_user_highpage(dst, src, addr, vma);
>> +		copy_user_highpage_mc(dst, src, addr, vma);
>>   		return true;
>>   	}
>>   
>> -- 
>> 2.25.1
>>
> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 3/7] arm64: add support for machine check error safe
  2022-05-19  6:29       ` Tong Tiangen
@ 2022-05-25  8:30         ` Mark Rutland
  -1 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-25  8:30 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Thu, May 19, 2022 at 02:29:54PM +0800, Tong Tiangen wrote:
> 
> 
> 在 2022/5/13 23:26, Mark Rutland 写道:
> > On Wed, Apr 20, 2022 at 03:04:14AM +0000, Tong Tiangen wrote:
> > > During the processing of arm64 kernel hardware memory errors(do_sea()), if
> > > the errors is consumed in the kernel, the current processing is panic.
> > > However, it is not optimal.
> > > 
> > > Take uaccess for example, if the uaccess operation fails due to memory
> > > error, only the user process will be affected, kill the user process
> > > and isolate the user page with hardware memory errors is a better choice.
> > 
> > Conceptually, I'm fine with the idea of constraining what we do for a
> > true uaccess, but I don't like the implementation of this at all, and I
> > think we first need to clean up the arm64 extable usage to clearly
> > distinguish a uaccess from another access.
> 
> OK,using EX_TYPE_UACCESS and this extable type could be recover, this is
> more reasonable.

Great.

> For EX_TYPE_UACCESS_ERR_ZERO, today we use it for kernel accesses in a
> couple of cases, such as
> get_user/futex/__user_cache_maint()/__user_swpX_asm(), 

Those are all user accesses.

However, __get_kernel_nofault() and __put_kernel_nofault() use
EX_TYPE_UACCESS_ERR_ZERO by way of __{get,put}_mem_asm(), so we'd need to
refactor that code to split the user/kernel cases higher up the callchain.

> your suggestion is:
> get_user continues to use EX_TYPE_UACCESS_ERR_ZERO and the other cases use
> new type EX_TYPE_FIXUP_ERR_ZERO?

Yes, that's the rough shape. We could name the latter EX_TYPE_KACCESS_ERR_ZERO
so that it is clearly analogous to EX_TYPE_UACCESS_ERR_ZERO, and with that I
suspect we could remove EX_TYPE_FIXUP.
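
To illustrate the direction (only a sketch, not the final shape), the machine
check fixup path could then recover nothing but entries known to be true user
accesses:

bool fixup_exception_mc(struct pt_regs *regs)
{
	const struct exception_table_entry *ex;

	ex = search_exception_tables(instruction_pointer(regs));
	if (!ex)
		return false;

	switch (ex->type) {
	case EX_TYPE_UACCESS_ERR_ZERO:
		/* a genuine uaccess: set the err/zero registers, branch to the fixup */
		return ex_handler_uaccess_err_zero(ex, regs);
	}

	/* any other (kernel) access consuming a memory error still panics */
	return false;
}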

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 3/7] arm64: add support for machine check error safe
  2022-05-25  8:30         ` Mark Rutland
@ 2022-05-26  3:36           ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-26  3:36 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



在 2022/5/25 16:30, Mark Rutland 写道:
> On Thu, May 19, 2022 at 02:29:54PM +0800, Tong Tiangen wrote:
>>
>>
>> 在 2022/5/13 23:26, Mark Rutland 写道:
>>> On Wed, Apr 20, 2022 at 03:04:14AM +0000, Tong Tiangen wrote:
>>>> During the processing of arm64 kernel hardware memory errors(do_sea()), if
>>>> the errors is consumed in the kernel, the current processing is panic.
>>>> However, it is not optimal.
>>>>
>>>> Take uaccess for example, if the uaccess operation fails due to memory
>>>> error, only the user process will be affected, kill the user process
>>>> and isolate the user page with hardware memory errors is a better choice.
>>>
>>> Conceptually, I'm fine with the idea of constraining what we do for a
>>> true uaccess, but I don't like the implementation of this at all, and I
>>> think we first need to clean up the arm64 extable usage to clearly
>>> distinguish a uaccess from another access.
>>
>> OK,using EX_TYPE_UACCESS and this extable type could be recover, this is
>> more reasonable.
> 
> Great.
> 
>> For EX_TYPE_UACCESS_ERR_ZERO, today we use it for kernel accesses in a
>> couple of cases, such as
>> get_user/futex/__user_cache_maint()/__user_swpX_asm(),
> 
> Those are all user accesses.
> 
> However, __get_kernel_nofault() and __put_kernel_nofault() use
> EX_TYPE_UACCESS_ERR_ZERO by way of __{get,put}_mem_asm(), so we'd need to
> refactor that code to split the user/kernel cases higher up the callchain.
> 
>> your suggestion is:
>> get_user continues to use EX_TYPE_UACCESS_ERR_ZERO and the other cases use
>> new type EX_TYPE_FIXUP_ERR_ZERO?
> 
> Yes, that's the rough shape. We could make the latter EX_TYPE_KACCESS_ERR_ZERO
> to be clearly analogous to EX_TYPE_UACCESS_ERR_ZERO, and with that I susepct we
> could remove EX_TYPE_FIXUP.
> 
> Thanks,
> Mark.
Following your suggestion, I think the definitions would look like this:

#define EX_TYPE_NONE                    0
#define EX_TYPE_FIXUP                   1    --> delete
#define EX_TYPE_BPF                     2
#define EX_TYPE_UACCESS_ERR_ZERO        3
#define EX_TYPE_LOAD_UNALIGNED_ZEROPAD  4
#define EX_TYPE_UACCESS		        xx   --> add
#define EX_TYPE_KACCESS_ERR_ZERO        xx   --> add
[the macro values given here are provisional]

There are two changes to make:

1. __get_kernel_nofault() and __put_kernel_nofault() switch to
EX_TYPE_KACCESS_ERR_ZERO; all other users keep EX_TYPE_UACCESS_ERR_ZERO
unchanged.

2. Delete EX_TYPE_FIXUP.

The others are straightforward. As for EX_TYPE_FIXUP, though, I think it needs
to be retained: _cond_extable (which emits EX_TYPE_FIXUP entries) is still in
use in assembler.h.

Thanks,
Tong.

> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 3/7] arm64: add support for machine check error safe
  2022-05-26  3:36           ` Tong Tiangen
@ 2022-05-26  9:50             ` Mark Rutland
  -1 siblings, 0 replies; 96+ messages in thread
From: Mark Rutland @ 2022-05-26  9:50 UTC (permalink / raw)
  To: Tong Tiangen
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun

On Thu, May 26, 2022 at 11:36:41AM +0800, Tong Tiangen wrote:
> 
> 
> 在 2022/5/25 16:30, Mark Rutland 写道:
> > On Thu, May 19, 2022 at 02:29:54PM +0800, Tong Tiangen wrote:
> > > 
> > > 
> > > 在 2022/5/13 23:26, Mark Rutland 写道:
> > > > On Wed, Apr 20, 2022 at 03:04:14AM +0000, Tong Tiangen wrote:
> > > > > During the processing of arm64 kernel hardware memory errors(do_sea()), if
> > > > > the errors is consumed in the kernel, the current processing is panic.
> > > > > However, it is not optimal.
> > > > > 
> > > > > Take uaccess for example, if the uaccess operation fails due to memory
> > > > > error, only the user process will be affected, kill the user process
> > > > > and isolate the user page with hardware memory errors is a better choice.
> > > > 
> > > > Conceptually, I'm fine with the idea of constraining what we do for a
> > > > true uaccess, but I don't like the implementation of this at all, and I
> > > > think we first need to clean up the arm64 extable usage to clearly
> > > > distinguish a uaccess from another access.
> > > 
> > > OK,using EX_TYPE_UACCESS and this extable type could be recover, this is
> > > more reasonable.
> > 
> > Great.
> > 
> > > For EX_TYPE_UACCESS_ERR_ZERO, today we use it for kernel accesses in a
> > > couple of cases, such as
> > > get_user/futex/__user_cache_maint()/__user_swpX_asm(),
> > 
> > Those are all user accesses.
> > 
> > However, __get_kernel_nofault() and __put_kernel_nofault() use
> > EX_TYPE_UACCESS_ERR_ZERO by way of __{get,put}_mem_asm(), so we'd need to
> > refactor that code to split the user/kernel cases higher up the callchain.
> > 
> > > your suggestion is:
> > > get_user continues to use EX_TYPE_UACCESS_ERR_ZERO and the other cases use
> > > new type EX_TYPE_FIXUP_ERR_ZERO?
> > 
> > Yes, that's the rough shape. We could make the latter EX_TYPE_KACCESS_ERR_ZERO
> > to be clearly analogous to EX_TYPE_UACCESS_ERR_ZERO, and with that I susepct we
> > could remove EX_TYPE_FIXUP.
> > 
> > Thanks,
> > Mark.
> According to your suggestion, i think the definition is like this:
> 
> #define EX_TYPE_NONE                    0
> #define EX_TYPE_FIXUP                   1    --> delete
> #define EX_TYPE_BPF                     2
> #define EX_TYPE_UACCESS_ERR_ZERO        3
> #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD  4
> #define EX_TYPE_UACCESS		        xx   --> add
> #define EX_TYPE_KACCESS_ERR_ZERO        xx   --> add
> [The value defined by the macro here is temporary]

Almost; you don't need to add EX_TYPE_UACCESS here, as you can use
EX_TYPE_UACCESS_ERR_ZERO for that.

We already have:

| #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err)		\
|         _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr)

... and we can add:

| #define _ASM_EXTABLE_UACCESS(insn, fixup)			\
|         _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, wzr, wzr)


... and maybe we should use 'xzr' rather than 'wzr' for clarity.

> There are two points to modify:
> 
> 1、_get_kernel_nofault() and __put_kernel_nofault()  using
> EX_TYPE_KACCESS_ERR_ZERO, Other positions using EX_TYPE_UACCESS_ERR_ZERO
> keep unchanged.

That sounds right to me. This will require refactoring __raw_{get,put}_mem()
and __{get,put}_mem_asm().
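
For instance, something along these lines (a sketch only, assuming an
_ASM_EXTABLE_KACCESS_ERR_ZERO counterpart gets added):

/* pass the extable flavour down from the callers */
#define __get_mem_asm(load, reg, x, addr, err, type)			\
	asm volatile(							\
	"1:	" load "	" reg "1, [%2]\n"			\
	"2:\n"								\
	_ASM_EXTABLE_##type##ACCESS_ERR_ZERO(1b, 2b, %w0, %w1)		\
	: "+r" (err), "=&r" (x)						\
	: "r" (addr))

__raw_get_user() would then pass U (EX_TYPE_UACCESS_ERR_ZERO) and
__get_kernel_nofault() would pass K (EX_TYPE_KACCESS_ERR_ZERO), with the
put side handled the same way.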

> 2、delete EX_TYPE_FIXUP.
> 
> There is no doubt about others. As for EX_TYPE_FIXUP, I think it needs to be
> retained, _cond_extable(EX_TYPE_FIXUP) is still in use in assembler.h.

We use _cond_extable for cache maintenance uaccesses, so those should be moved
over to EX_TYPE_UACCESS_ERR_ZERO. We can rename _cond_extable to
_cond_uaccess_extable for clarity.
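
Roughly (a sketch, assuming an asm-side _asm_extable_uaccess helper that emits
an EX_TYPE_UACCESS_ERR_ZERO entry with xzr for both registers):

	.macro		_cond_uaccess_extable, insn, fixup
	.ifnc		\fixup,
	_asm_extable_uaccess	\insn, \fixup
	.endif
	.endm

... with callers like dcache_by_line_op in assembler.h switched over to the
new name.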

That will require restructuring asm-extable.h a bit. If that turns out to be
painful I'm happy to take a look.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH -next v4 3/7] arm64: add support for machine check error safe
  2022-05-26  9:50             ` Mark Rutland
@ 2022-05-27  1:40               ` Tong Tiangen
  -1 siblings, 0 replies; 96+ messages in thread
From: Tong Tiangen @ 2022-05-27  1:40 UTC (permalink / raw)
  To: Mark Rutland
  Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
	Will Deacon, Alexander Viro, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, x86, H . Peter Anvin,
	linuxppc-dev, linux-arm-kernel, linux-kernel, linux-mm,
	Kefeng Wang, Xie XiuQi, Guohanjun



在 2022/5/26 17:50, Mark Rutland 写道:
> On Thu, May 26, 2022 at 11:36:41AM +0800, Tong Tiangen wrote:
>>
>>
>> 在 2022/5/25 16:30, Mark Rutland 写道:
>>> On Thu, May 19, 2022 at 02:29:54PM +0800, Tong Tiangen wrote:
>>>>
>>>>
>>>> 在 2022/5/13 23:26, Mark Rutland 写道:
>>>>> On Wed, Apr 20, 2022 at 03:04:14AM +0000, Tong Tiangen wrote:
>>>>>> During the processing of arm64 kernel hardware memory errors(do_sea()), if
>>>>>> the errors is consumed in the kernel, the current processing is panic.
>>>>>> However, it is not optimal.
>>>>>>
>>>>>> Take uaccess for example, if the uaccess operation fails due to memory
>>>>>> error, only the user process will be affected, kill the user process
>>>>>> and isolate the user page with hardware memory errors is a better choice.
>>>>>
>>>>> Conceptually, I'm fine with the idea of constraining what we do for a
>>>>> true uaccess, but I don't like the implementation of this at all, and I
>>>>> think we first need to clean up the arm64 extable usage to clearly
>>>>> distinguish a uaccess from another access.
>>>>
>>>> OK,using EX_TYPE_UACCESS and this extable type could be recover, this is
>>>> more reasonable.
>>>
>>> Great.
>>>
>>>> For EX_TYPE_UACCESS_ERR_ZERO, today we use it for kernel accesses in a
>>>> couple of cases, such as
>>>> get_user/futex/__user_cache_maint()/__user_swpX_asm(),
>>>
>>> Those are all user accesses.
>>>
>>> However, __get_kernel_nofault() and __put_kernel_nofault() use
>>> EX_TYPE_UACCESS_ERR_ZERO by way of __{get,put}_mem_asm(), so we'd need to
>>> refactor that code to split the user/kernel cases higher up the callchain.
>>>
>>>> your suggestion is:
>>>> get_user continues to use EX_TYPE_UACCESS_ERR_ZERO and the other cases use
>>>> new type EX_TYPE_FIXUP_ERR_ZERO?
>>>
>>> Yes, that's the rough shape. We could make the latter EX_TYPE_KACCESS_ERR_ZERO
>>> to be clearly analogous to EX_TYPE_UACCESS_ERR_ZERO, and with that I susepct we
>>> could remove EX_TYPE_FIXUP.
>>>
>>> Thanks,
>>> Mark.
>> According to your suggestion, i think the definition is like this:
>>
>> #define EX_TYPE_NONE                    0
>> #define EX_TYPE_FIXUP                   1    --> delete
>> #define EX_TYPE_BPF                     2
>> #define EX_TYPE_UACCESS_ERR_ZERO        3
>> #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD  4
>> #define EX_TYPE_UACCESS		        xx   --> add
>> #define EX_TYPE_KACCESS_ERR_ZERO        xx   --> add
>> [The value defined by the macro here is temporary]
> 
> Almost; you don't need to add EX_TYPE_UACCESS here, as you can use
> EX_TYPE_UACCESS_ERR_ZERO for that.
> 
> We already have:
> 
> | #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err)		\
> |         _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr)
> 
> ... and we can add:
> 
> | #define _ASM_EXTABLE_UACCESS(insn, fixup)			\
> |         _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, wzr, wzr)
> 
> 
> ... and maybe we should use 'xzr' rather than 'wzr' for clarity.
> 
>> There are two points to modify:
>>
>> 1、_get_kernel_nofault() and __put_kernel_nofault()  using
>> EX_TYPE_KACCESS_ERR_ZERO, Other positions using EX_TYPE_UACCESS_ERR_ZERO
>> keep unchanged.
> 
> That sounds right to me. This will require refactoring __raw_{get,put}_mem()
> and __{get,put}_mem_asm().
> 
>> 2、delete EX_TYPE_FIXUP.
>>
>> There is no doubt about others. As for EX_TYPE_FIXUP, I think it needs to be
>> retained, _cond_extable(EX_TYPE_FIXUP) is still in use in assembler.h.
> 
> We use _cond_extable for cache maintenance uaccesses, so those should be moved
> over to to EX_TYPE_UACCESS_ERR_ZERO. We can rename _cond_extable to
> _cond_uaccess_extable for clarity.
> 
> That will require restructuring asm-extable.h a bit. If that turns out to be
> painful I'm happy to take a look.
> 
> Thanks,
> Mark.

OK, I'll rework it along these lines over the next few days. Thanks a lot.

> .

^ permalink raw reply	[flat|nested] 96+ messages in thread

end of thread, other threads:[~2022-05-27  1:42 UTC | newest]

Thread overview: 96+ messages
2022-04-20  3:04 [PATCH -next v4 0/7]arm64: add machine check safe support Tong Tiangen
2022-04-20  3:04 ` Tong Tiangen
2022-04-20  3:04 ` Tong Tiangen
2022-04-20  3:04 ` [PATCH -next v4 1/7] x86, powerpc: fix function define in copy_mc_to_user Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-22  9:45   ` Michael Ellerman
2022-04-22  9:45     ` Michael Ellerman
2022-04-22  9:45     ` Michael Ellerman
2022-04-24  1:16     ` Tong Tiangen
2022-04-24  1:16       ` Tong Tiangen
2022-04-24  1:16       ` Tong Tiangen
2022-05-02 14:24   ` Christophe Leroy
2022-05-02 14:24     ` Christophe Leroy
2022-05-03  1:06     ` Tong Tiangen
2022-05-03  1:06       ` Tong Tiangen
2022-05-05  1:21       ` Kefeng Wang
2022-05-05  1:21         ` Kefeng Wang
2022-04-20  3:04 ` [PATCH -next v4 2/7] arm64: fix types in copy_highpage() Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04 ` [PATCH -next v4 3/7] arm64: add support for machine check error safe Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-05-13 15:26   ` Mark Rutland
2022-05-13 15:26     ` Mark Rutland
2022-05-13 15:26     ` Mark Rutland
2022-05-19  6:29     ` Tong Tiangen
2022-05-19  6:29       ` Tong Tiangen
2022-05-19  6:29       ` Tong Tiangen
2022-05-25  8:30       ` Mark Rutland
2022-05-25  8:30         ` Mark Rutland
2022-05-25  8:30         ` Mark Rutland
2022-05-26  3:36         ` Tong Tiangen
2022-05-26  3:36           ` Tong Tiangen
2022-05-26  3:36           ` Tong Tiangen
2022-05-26  9:50           ` Mark Rutland
2022-05-26  9:50             ` Mark Rutland
2022-05-26  9:50             ` Mark Rutland
2022-05-27  1:40             ` Tong Tiangen
2022-05-27  1:40               ` Tong Tiangen
2022-05-27  1:40               ` Tong Tiangen
2022-04-20  3:04 ` [PATCH -next v4 4/7] arm64: add copy_{to, from}_user to machine check safe Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-05-04 10:26   ` Catalin Marinas
2022-05-04 10:26     ` Catalin Marinas
2022-05-04 10:26     ` Catalin Marinas
2022-05-05  6:39     ` Tong Tiangen
2022-05-05  6:39       ` Tong Tiangen
2022-05-05  6:39       ` Tong Tiangen
2022-05-05 13:41       ` Catalin Marinas
2022-05-05 13:41         ` Catalin Marinas
2022-05-05 13:41         ` Catalin Marinas
2022-05-05 14:33         ` Tong Tiangen
2022-05-05 14:33           ` Tong Tiangen
2022-05-05 14:33           ` Tong Tiangen
2022-05-13 15:31   ` Mark Rutland
2022-05-13 15:31     ` Mark Rutland
2022-05-13 15:31     ` Mark Rutland
2022-05-19  6:53     ` Tong Tiangen
2022-05-19  6:53       ` Tong Tiangen
2022-05-19  6:53       ` Tong Tiangen
2022-04-20  3:04 ` [PATCH -next v4 5/7] arm64: mte: Clean up user tag accessors Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-05-13 15:36   ` Mark Rutland
2022-05-13 15:36     ` Mark Rutland
2022-05-13 15:36     ` Mark Rutland
2022-04-20  3:04 ` [PATCH -next v4 6/7] arm64: add {get, put}_user to machine check safe Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-05-13 15:39   ` Mark Rutland
2022-05-13 15:39     ` Mark Rutland
2022-05-13 15:39     ` Mark Rutland
2022-05-19  7:09     ` Tong Tiangen
2022-05-19  7:09       ` Tong Tiangen
2022-05-19  7:09       ` Tong Tiangen
2022-04-20  3:04 ` [PATCH -next v4 7/7] arm64: add cow " Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-04-20  3:04   ` Tong Tiangen
2022-05-13 15:44   ` Mark Rutland
2022-05-13 15:44     ` Mark Rutland
2022-05-13 15:44     ` Mark Rutland
2022-05-19 10:38     ` Tong Tiangen
2022-05-19 10:38       ` Tong Tiangen
2022-05-19 10:38       ` Tong Tiangen
2022-04-27  9:09 ` [PATCH -next v4 0/7]arm64: add machine check safe support Tong Tiangen
2022-04-27  9:09   ` Tong Tiangen
2022-04-27  9:09   ` Tong Tiangen
2022-05-04 19:58 ` (subset) " Catalin Marinas
2022-05-04 19:58   ` Catalin Marinas
2022-05-04 19:58   ` Catalin Marinas
2022-05-16 18:45 ` Catalin Marinas
2022-05-16 18:45   ` Catalin Marinas
2022-05-16 18:45   ` Catalin Marinas
