* [PATCH] x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-23 10:13 ` Seunghun Han
0 siblings, 0 replies; 12+ messages in thread
From: Seunghun Han @ 2018-02-23 10:13 UTC (permalink / raw)
To: Tony Luck, Borislav Petkov
Cc: linux-edac, linux-kernel, Greg Kroah-Hartman, Seunghun Han
I am Seunghun Han and a senior security researcher at National Security
Research Institute of South Korea.
I found a critical security issue which can make kernel panic in userspace.
After analyzing the issue carefully, I found that MCE driver in the kernel
has a problem which can be occurred in SMP environment.
The check_interval file in
/sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
global timer value for MCE polling. If it is changed by one CPU, MCE driver
in kernel calls mce_restart() function and broadcasts the event to other
CPUs to delete and restart MCE polling timer.
The __mcheck_cpu_init_timer() function which is called by mce_restart()
function initializes the mce_timer variable, and the "lock" in mce_timer is
also reinitialized. If more than one CPU write a specific value to
check_interval file concurrently, one can initialize the "lock" in mce_timer
while the others are handling "lock" in mce_timer. This problem causes some
synchronization errors such as kernel panic and kernel hang.
It is a critical security problem because the attacker can make kernel panic
by writing a value to the check_interval file in userspace, and it can be
used for Denial-of-Service (DoS) attack.
To fix this problem, I changed the __mcheck_cpu_init_timer() function to
reuse mce_timer instead of initializing it. The purpose of the function is
to restart the timer and it can be archived by calling
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
---
arch/x86/kernel/cpu/mcheck/mce.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3d8c573a9a27..d72f2f74f4d7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1771,7 +1771,6 @@ static void __mcheck_cpu_init_timer(void)
{
struct timer_list *t = this_cpu_ptr(&mce_timer);
- timer_setup(t, mce_timer_fn, TIMER_PINNED);
mce_start_timer(t);
}
@@ -2029,8 +2028,10 @@ static void mce_enable_ce(void *all)
return;
cmci_reenable();
cmci_recheck();
- if (all)
+ if (all) {
+ del_timer_sync(this_cpu_ptr(&mce_timer));
__mcheck_cpu_init_timer();
+ }
}
static struct bus_type mce_subsys = {
--
2.11.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-23 10:13 ` Seunghun Han
0 siblings, 0 replies; 12+ messages in thread
From: Seunghun Han @ 2018-02-23 10:13 UTC (permalink / raw)
To: Tony Luck, Borislav Petkov
Cc: linux-edac, linux-kernel, Greg Kroah-Hartman, Seunghun Han
I am Seunghun Han and a senior security researcher at National Security
Research Institute of South Korea.
I found a critical security issue which can make kernel panic in userspace.
After analyzing the issue carefully, I found that MCE driver in the kernel
has a problem which can be occurred in SMP environment.
The check_interval file in
/sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
global timer value for MCE polling. If it is changed by one CPU, MCE driver
in kernel calls mce_restart() function and broadcasts the event to other
CPUs to delete and restart MCE polling timer.
The __mcheck_cpu_init_timer() function which is called by mce_restart()
function initializes the mce_timer variable, and the "lock" in mce_timer is
also reinitialized. If more than one CPU write a specific value to
check_interval file concurrently, one can initialize the "lock" in mce_timer
while the others are handling "lock" in mce_timer. This problem causes some
synchronization errors such as kernel panic and kernel hang.
It is a critical security problem because the attacker can make kernel panic
by writing a value to the check_interval file in userspace, and it can be
used for Denial-of-Service (DoS) attack.
To fix this problem, I changed the __mcheck_cpu_init_timer() function to
reuse mce_timer instead of initializing it. The purpose of the function is
to restart the timer and it can be archived by calling
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
---
arch/x86/kernel/cpu/mcheck/mce.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3d8c573a9a27..d72f2f74f4d7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1771,7 +1771,6 @@ static void __mcheck_cpu_init_timer(void)
{
struct timer_list *t = this_cpu_ptr(&mce_timer);
- timer_setup(t, mce_timer_fn, TIMER_PINNED);
mce_start_timer(t);
}
@@ -2029,8 +2028,10 @@ static void mce_enable_ce(void *all)
return;
cmci_reenable();
cmci_recheck();
- if (all)
+ if (all) {
+ del_timer_sync(this_cpu_ptr(&mce_timer));
__mcheck_cpu_init_timer();
+ }
}
static struct bus_type mce_subsys = {
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-23 10:42 ` Borislav Petkov
0 siblings, 0 replies; 12+ messages in thread
From: Borislav Petkov @ 2018-02-23 10:42 UTC (permalink / raw)
To: Seunghun Han, Tony Luck; +Cc: linux-edac, linux-kernel, Greg Kroah-Hartman
On Fri, Feb 23, 2018 at 07:13:50PM +0900, Seunghun Han wrote:
> I am Seunghun Han and a senior security researcher at National Security
> Research Institute of South Korea.
>
> I found a critical security issue which can make kernel panic in userspace.
> After analyzing the issue carefully, I found that MCE driver in the kernel
> has a problem which can be occurred in SMP environment.
>
> The check_interval file in
> /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
> global timer value for MCE polling. If it is changed by one CPU, MCE driver
> in kernel calls mce_restart() function and broadcasts the event to other
Right, so I'm thinking that doing that per-CPU configuration doesn't
make a whole lot of sense. It is not something that needs to happen very
often and it is done globally anyway.
So what we should do here, IMO, is make mce_restart() grab a mutex and
thus serialize all those sysfs writes. It will naturally also slow down
any hammering from userspace which we should not allow anyway.
Tony, what do you think?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
^ permalink raw reply [flat|nested] 12+ messages in thread
* x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-23 10:42 ` Borislav Petkov
0 siblings, 0 replies; 12+ messages in thread
From: Borislav Petkov @ 2018-02-23 10:42 UTC (permalink / raw)
To: Seunghun Han, Tony Luck; +Cc: linux-edac, linux-kernel, Greg Kroah-Hartman
On Fri, Feb 23, 2018 at 07:13:50PM +0900, Seunghun Han wrote:
> I am Seunghun Han and a senior security researcher at National Security
> Research Institute of South Korea.
>
> I found a critical security issue which can make kernel panic in userspace.
> After analyzing the issue carefully, I found that MCE driver in the kernel
> has a problem which can be occurred in SMP environment.
>
> The check_interval file in
> /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
> global timer value for MCE polling. If it is changed by one CPU, MCE driver
> in kernel calls mce_restart() function and broadcasts the event to other
Right, so I'm thinking that doing that per-CPU configuration doesn't
make a whole lot of sense. It is not something that needs to happen very
often and it is done globally anyway.
So what we should do here, IMO, is make mce_restart() grab a mutex and
thus serialize all those sysfs writes. It will naturally also slow down
any hammering from userspace which we should not allow anyway.
Tony, what do you think?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-23 10:52 ` Greg Kroah-Hartman
0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2018-02-23 10:52 UTC (permalink / raw)
To: Seunghun Han; +Cc: Tony Luck, Borislav Petkov, linux-edac, linux-kernel
On Fri, Feb 23, 2018 at 07:13:50PM +0900, Seunghun Han wrote:
> I am Seunghun Han and a senior security researcher at National Security
> Research Institute of South Korea.
>
> I found a critical security issue which can make kernel panic in userspace.
> After analyzing the issue carefully, I found that MCE driver in the kernel
> has a problem which can be occurred in SMP environment.
>
> The check_interval file in
> /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
> global timer value for MCE polling. If it is changed by one CPU, MCE driver
> in kernel calls mce_restart() function and broadcasts the event to other
> CPUs to delete and restart MCE polling timer.
>
> The __mcheck_cpu_init_timer() function which is called by mce_restart()
> function initializes the mce_timer variable, and the "lock" in mce_timer is
> also reinitialized. If more than one CPU write a specific value to
> check_interval file concurrently, one can initialize the "lock" in mce_timer
> while the others are handling "lock" in mce_timer. This problem causes some
> synchronization errors such as kernel panic and kernel hang.
>
> It is a critical security problem because the attacker can make kernel panic
> by writing a value to the check_interval file in userspace, and it can be
> used for Denial-of-Service (DoS) attack.
As only root can write to that file, it's not that critical of an issue,
but yes, this is a problem. Nice find and fix.
>
> To fix this problem, I changed the __mcheck_cpu_init_timer() function to
> reuse mce_timer instead of initializing it. The purpose of the function is
> to restart the timer and it can be archived by calling
>
> Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-23 10:52 ` Greg Kroah-Hartman
0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2018-02-23 10:52 UTC (permalink / raw)
To: Seunghun Han; +Cc: Tony Luck, Borislav Petkov, linux-edac, linux-kernel
On Fri, Feb 23, 2018 at 07:13:50PM +0900, Seunghun Han wrote:
> I am Seunghun Han and a senior security researcher at National Security
> Research Institute of South Korea.
>
> I found a critical security issue which can make kernel panic in userspace.
> After analyzing the issue carefully, I found that MCE driver in the kernel
> has a problem which can be occurred in SMP environment.
>
> The check_interval file in
> /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
> global timer value for MCE polling. If it is changed by one CPU, MCE driver
> in kernel calls mce_restart() function and broadcasts the event to other
> CPUs to delete and restart MCE polling timer.
>
> The __mcheck_cpu_init_timer() function which is called by mce_restart()
> function initializes the mce_timer variable, and the "lock" in mce_timer is
> also reinitialized. If more than one CPU write a specific value to
> check_interval file concurrently, one can initialize the "lock" in mce_timer
> while the others are handling "lock" in mce_timer. This problem causes some
> synchronization errors such as kernel panic and kernel hang.
>
> It is a critical security problem because the attacker can make kernel panic
> by writing a value to the check_interval file in userspace, and it can be
> used for Denial-of-Service (DoS) attack.
As only root can write to that file, it's not that critical of an issue,
but yes, this is a problem. Nice find and fix.
>
> To fix this problem, I changed the __mcheck_cpu_init_timer() function to
> reuse mce_timer instead of initializing it. The purpose of the function is
> to restart the timer and it can be archived by calling
>
> Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-25 20:05 ` Seunghun Han
0 siblings, 0 replies; 12+ messages in thread
From: Seunghun Han @ 2018-02-25 20:05 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Tony Luck, Borislav Petkov, linux-edac, Linux Kernel Mailing List
Hello, Greg.
2018-02-23 19:52 GMT+09:00 Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
> On Fri, Feb 23, 2018 at 07:13:50PM +0900, Seunghun Han wrote:
>> I am Seunghun Han and a senior security researcher at National Security
>> Research Institute of South Korea.
>>
>> I found a critical security issue which can make kernel panic in userspace.
>> After analyzing the issue carefully, I found that MCE driver in the kernel
>> has a problem which can be occurred in SMP environment.
>>
>> The check_interval file in
>> /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
>> global timer value for MCE polling. If it is changed by one CPU, MCE driver
>> in kernel calls mce_restart() function and broadcasts the event to other
>> CPUs to delete and restart MCE polling timer.
>>
>> The __mcheck_cpu_init_timer() function which is called by mce_restart()
>> function initializes the mce_timer variable, and the "lock" in mce_timer is
>> also reinitialized. If more than one CPU write a specific value to
>> check_interval file concurrently, one can initialize the "lock" in mce_timer
>> while the others are handling "lock" in mce_timer. This problem causes some
>> synchronization errors such as kernel panic and kernel hang.
>>
>> It is a critical security problem because the attacker can make kernel panic
>> by writing a value to the check_interval file in userspace, and it can be
>> used for Denial-of-Service (DoS) attack.
>
> As only root can write to that file, it's not that critical of an issue,
> but yes, this is a problem. Nice find and fix.
I agree with your opinion.
Thank you for your advice.
Best regards.
Seunghun.
>>
>> To fix this problem, I changed the __mcheck_cpu_init_timer() function to
>> reuse mce_timer instead of initializing it. The purpose of the function is
>> to restart the timer and it can be archived by calling
>>
>> Signed-off-by: Seunghun Han <kkamagui@gmail.com>
>
> Cc: stable <stable@vger.kernel.org>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-25 20:05 ` Seunghun Han
0 siblings, 0 replies; 12+ messages in thread
From: Seunghun Han @ 2018-02-25 20:05 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Tony Luck, Borislav Petkov, linux-edac, Linux Kernel Mailing List
Hello, Greg.
2018-02-23 19:52 GMT+09:00 Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
> On Fri, Feb 23, 2018 at 07:13:50PM +0900, Seunghun Han wrote:
>> I am Seunghun Han and a senior security researcher at National Security
>> Research Institute of South Korea.
>>
>> I found a critical security issue which can make kernel panic in userspace.
>> After analyzing the issue carefully, I found that MCE driver in the kernel
>> has a problem which can be occurred in SMP environment.
>>
>> The check_interval file in
>> /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
>> global timer value for MCE polling. If it is changed by one CPU, MCE driver
>> in kernel calls mce_restart() function and broadcasts the event to other
>> CPUs to delete and restart MCE polling timer.
>>
>> The __mcheck_cpu_init_timer() function which is called by mce_restart()
>> function initializes the mce_timer variable, and the "lock" in mce_timer is
>> also reinitialized. If more than one CPU write a specific value to
>> check_interval file concurrently, one can initialize the "lock" in mce_timer
>> while the others are handling "lock" in mce_timer. This problem causes some
>> synchronization errors such as kernel panic and kernel hang.
>>
>> It is a critical security problem because the attacker can make kernel panic
>> by writing a value to the check_interval file in userspace, and it can be
>> used for Denial-of-Service (DoS) attack.
>
> As only root can write to that file, it's not that critical of an issue,
> but yes, this is a problem. Nice find and fix.
I agree with your opinion.
Thank you for your advice.
Best regards.
Seunghun.
>>
>> To fix this problem, I changed the __mcheck_cpu_init_timer() function to
>> reuse mce_timer instead of initializing it. The purpose of the function is
>> to restart the timer and it can be archived by calling
>>
>> Signed-off-by: Seunghun Han <kkamagui@gmail.com>
>
> Cc: stable <stable@vger.kernel.org>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-28 9:32 ` Borislav Petkov
0 siblings, 0 replies; 12+ messages in thread
From: Borislav Petkov @ 2018-02-28 9:32 UTC (permalink / raw)
To: Seunghun Han
Cc: Greg Kroah-Hartman, Tony Luck, linux-edac, Linux Kernel Mailing List
On Mon, Feb 26, 2018 at 05:05:04AM +0900, Seunghun Han wrote:
> >> It is a critical security problem because the attacker can make kernel panic
> >> by writing a value to the check_interval file in userspace, and it can be
> >> used for Denial-of-Service (DoS) attack.
> >
> > As only root can write to that file, it's not that critical of an issue,
> > but yes, this is a problem. Nice find and fix.
This is still the wrong fix. You need to:
1. check the old value of check_interval in store_int_with_restart() and
exit early if it is the same.
2. have mce_restart() grab a newly defined mutex, say, mce_sysfs_mutex
or so, which synchronizes all CPUs so that their timers get deleted and
reinitialized in the proper order.
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
^ permalink raw reply [flat|nested] 12+ messages in thread
* x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-28 9:32 ` Borislav Petkov
0 siblings, 0 replies; 12+ messages in thread
From: Borislav Petkov @ 2018-02-28 9:32 UTC (permalink / raw)
To: Seunghun Han
Cc: Greg Kroah-Hartman, Tony Luck, linux-edac, Linux Kernel Mailing List
On Mon, Feb 26, 2018 at 05:05:04AM +0900, Seunghun Han wrote:
> >> It is a critical security problem because the attacker can make kernel panic
> >> by writing a value to the check_interval file in userspace, and it can be
> >> used for Denial-of-Service (DoS) attack.
> >
> > As only root can write to that file, it's not that critical of an issue,
> > but yes, this is a problem. Nice find and fix.
This is still the wrong fix. You need to:
1. check the old value of check_interval in store_int_with_restart() and
exit early if it is the same.
2. have mce_restart() grab a newly defined mutex, say, mce_sysfs_mutex
or so, which synchronizes all CPUs so that their timers get deleted and
reinitialized in the proper order.
Thx.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-28 9:48 ` Seunghun Han
0 siblings, 0 replies; 12+ messages in thread
From: Seunghun Han @ 2018-02-28 9:48 UTC (permalink / raw)
To: Borislav Petkov
Cc: Greg Kroah-Hartman, Tony Luck, linux-edac, Linux Kernel Mailing List
Hello, Borislav.
2018-02-28 18:32 GMT+09:00 Borislav Petkov <bp@alien8.de>:
> On Mon, Feb 26, 2018 at 05:05:04AM +0900, Seunghun Han wrote:
>> >> It is a critical security problem because the attacker can make kernel panic
>> >> by writing a value to the check_interval file in userspace, and it can be
>> >> used for Denial-of-Service (DoS) attack.
>> >
>> > As only root can write to that file, it's not that critical of an issue,
>> > but yes, this is a problem. Nice find and fix.
>
> This is still the wrong fix. You need to:
>
> 1. check the old value of check_interval in store_int_with_restart() and
> exit early if it is the same.
>
> 2. have mce_restart() grab a newly defined mutex, say, mce_sysfs_mutex
> or so, which synchronizes all CPUs so that their timers get deleted and
> reinitialized in the proper order.
Thank you for your advice.
I will change my patch like that and send it again.
Best regard.
Seunghun.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
^ permalink raw reply [flat|nested] 12+ messages in thread
* x86: mce: fix kernel panic when check_interval is changed
@ 2018-02-28 9:48 ` Seunghun Han
0 siblings, 0 replies; 12+ messages in thread
From: Seunghun Han @ 2018-02-28 9:48 UTC (permalink / raw)
To: Borislav Petkov
Cc: Greg Kroah-Hartman, Tony Luck, linux-edac, Linux Kernel Mailing List
Hello, Borislav.
2018-02-28 18:32 GMT+09:00 Borislav Petkov <bp@alien8.de>:
> On Mon, Feb 26, 2018 at 05:05:04AM +0900, Seunghun Han wrote:
>> >> It is a critical security problem because the attacker can make kernel panic
>> >> by writing a value to the check_interval file in userspace, and it can be
>> >> used for Denial-of-Service (DoS) attack.
>> >
>> > As only root can write to that file, it's not that critical of an issue,
>> > but yes, this is a problem. Nice find and fix.
>
> This is still the wrong fix. You need to:
>
> 1. check the old value of check_interval in store_int_with_restart() and
> exit early if it is the same.
>
> 2. have mce_restart() grab a newly defined mutex, say, mce_sysfs_mutex
> or so, which synchronizes all CPUs so that their timers get deleted and
> reinitialized in the proper order.
Thank you for your advice.
I will change my patch like that and send it again.
Best regard.
Seunghun.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-02-28 9:48 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-23 10:13 [PATCH] x86: mce: fix kernel panic when check_interval is changed Seunghun Han
2018-02-23 10:13 ` Seunghun Han
2018-02-23 10:42 ` [PATCH] " Borislav Petkov
2018-02-23 10:42 ` Borislav Petkov
2018-02-23 10:52 ` [PATCH] " Greg Kroah-Hartman
2018-02-23 10:52 ` Greg Kroah-Hartman
2018-02-25 20:05 ` [PATCH] " Seunghun Han
2018-02-25 20:05 ` Seunghun Han
2018-02-28 9:32 ` [PATCH] " Borislav Petkov
2018-02-28 9:32 ` Borislav Petkov
2018-02-28 9:48 ` [PATCH] " Seunghun Han
2018-02-28 9:48 ` Seunghun Han
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.