* Getting interrupt every million cache misses
@ 2016-10-26 20:54 Pavel Machek
  2016-10-27  8:28 ` Peter Zijlstra
  0 siblings, 1 reply; 79+ messages in thread
From: Pavel Machek @ 2016-10-26 20:54 UTC (permalink / raw)
  To: acme, kernel list; +Cc: peterz, mingo, alexander.shishkin

Hi!

I'd like to get an interrupt every million cache misses... to do a
printk() or something like that. As far as I can tell, modern hardware
should allow me to do that. AFAICT performance events subsystem can do
something like that, but I can't figure out where the code is / what I
should call.

Can someone help?

Thanks,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Getting interrupt every million cache misses
  2016-10-26 20:54 Getting interrupt every million cache misses Pavel Machek
@ 2016-10-27  8:28 ` Peter Zijlstra
  2016-10-27  8:46   ` Pavel Machek
  2016-10-27  9:11   ` Pavel Machek
  0 siblings, 2 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-10-27  8:28 UTC (permalink / raw)
  To: Pavel Machek; +Cc: acme, kernel list, mingo, alexander.shishkin

On Wed, Oct 26, 2016 at 10:54:16PM +0200, Pavel Machek wrote:
> Hi!
> 
> I'd like to get an interrupt every million cache misses... to do a
> printk() or something like that. As far as I can tell, modern hardware
> should allow me to do that. AFAICT performance events subsystem can do
> something like that, but I can't figure out where the code is / what I
> should call.
> 
> Can someone help?

Can you go back one step and explain why you would want this? What use
is a printk() on every 1e6-th cache miss?

That is, why doesn't:

 $ perf record -e cache-misses -c 1000000 -a -- sleep 5

suffice?

* Re: Getting interrupt every million cache misses
  2016-10-27  8:28 ` Peter Zijlstra
@ 2016-10-27  8:46   ` Pavel Machek
  2016-10-27  9:15     ` Peter Zijlstra
  2016-10-27  9:11   ` Pavel Machek
  1 sibling, 1 reply; 79+ messages in thread
From: Pavel Machek @ 2016-10-27  8:46 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: acme, kernel list, mingo, alexander.shishkin

Hi!

> > I'd like to get an interrupt every million cache misses... to do a
> > printk() or something like that. As far as I can tell, modern hardware
> > should allow me to do that. AFAICT performance events subsystem can do
> > something like that, but I can't figure out where the code is / what I
> > should call.
> > 
> > Can someone help?
> 
> Can you go back one step and explain why you would want this? What use
> is a printk() on every 1e6-th cache miss?

First, thanks for quick reply.

And actually, printk() is not needed, udelay(50msec) is. The reason is
that DRAM becomes unreliable if about a million cache misses happen in
under 64msec -- so I'd like to slow the system down in such cases to
prevent the bug from biting me.

(Details are here
https://googleprojectzero.blogspot.cz/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
). The bug is exploitable to get local root; it is also exploitable to
gain code execution from JavaScript... so it is rather severe.

> That is, why doesn't:
> 
>  $ perf record -e cache-misses -c 1000000 -a -- sleep 5
> 
> suffice?

Thanks for the pointer... I'd really like to do this from the kernel, so
that I can "almost synchronously" stop execution when excessive
cache misses happen.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: Getting interrupt every million cache misses
  2016-10-27  8:28 ` Peter Zijlstra
  2016-10-27  8:46   ` Pavel Machek
@ 2016-10-27  9:11   ` Pavel Machek
  2016-10-27  9:33     ` Peter Zijlstra
  1 sibling, 1 reply; 79+ messages in thread
From: Pavel Machek @ 2016-10-27  9:11 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: acme, kernel list, mingo, alexander.shishkin

On Thu 2016-10-27 10:28:01, Peter Zijlstra wrote:
> On Wed, Oct 26, 2016 at 10:54:16PM +0200, Pavel Machek wrote:
> > Hi!
> > 
> > I'd like to get an interrupt every million cache misses... to do a
> > printk() or something like that. As far as I can tell, modern hardware
> > should allow me to do that. AFAICT performance events subsystem can do
> > something like that, but I can't figure out where the code is / what I
> > should call.
> > 
> > Can someone help?
> 
> Can you go back one step and explain why you would want this? What use
> is a printk() on every 1e6-th cache miss?
> 
> That is, why doesn't:
> 
>  $ perf record -e cache-misses -c 1000000 -a -- sleep 5
> 
> suffice?

How to work around rowhammer, break my system _and_ make kernel perf
maintainers scream at the same time: (:-) )

I think I got the place now. Let me try...

Thanks,
								Pavel


diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d31735f..ce83f5e 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1495,6 +1495,11 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 
 	perf_sample_event_took(finish_clock - start_clock);
 
+	/* Here */
+	{
+		udelay(58000);
+	}
+
 	return ret;
 }
 NOKPROBE_SYMBOL(perf_event_nmi_handler);


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: Getting interrupt every million cache misses
  2016-10-27  8:46   ` Pavel Machek
@ 2016-10-27  9:15     ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-10-27  9:15 UTC (permalink / raw)
  To: Pavel Machek; +Cc: acme, kernel list, mingo, alexander.shishkin

On Thu, Oct 27, 2016 at 10:46:38AM +0200, Pavel Machek wrote:

> And actually, printk() is not needed, udelay(50msec) is. The reason is
> that DRAM becomes unreliable if about a million cache misses happen in
> under 64msec -- so I'd like to slow the system down in such cases to
> prevent the bug from biting me.
> 
> (Details are here
> https://googleprojectzero.blogspot.cz/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
> ). The bug is exploitable to get local root; it is also exploitable to
> gain code execution from JavaScript... so it is rather severe.

Cute, a rowhammer defence.

So we can do in-kernel perf events too, see for example
kernel/watchdog.c:wd_hw_attr and its users.

I suppose you want PERF_COUNT_HW_CACHE_MISSES as config, although
depending on platform you could use better (u-arch specific) events.

* Re: Getting interrupt every million cache misses
  2016-10-27  9:11   ` Pavel Machek
@ 2016-10-27  9:33     ` Peter Zijlstra
  2016-10-27 20:40         ` [kernel-hardening] " Kees Cook
  0 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-10-27  9:33 UTC (permalink / raw)
  To: Pavel Machek; +Cc: acme, kernel list, mingo, alexander.shishkin

On Thu, Oct 27, 2016 at 11:11:04AM +0200, Pavel Machek wrote:
> How to work around rowhammer, break my system _and_ make kernel perf
> maintainers scream at the same time: (:-) )
> 
> I think I got the place now. Let me try...

Lol ;-)

> 
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index d31735f..ce83f5e 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -1495,6 +1495,11 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
>  
>  	perf_sample_event_took(finish_clock - start_clock);
>  
> +	/* Here */
> +	{
> +		udelay(58000);
> +	}
> +
>  	return ret;
>  }
>  NOKPROBE_SYMBOL(perf_event_nmi_handler);

Like you guessed, not quite ;-)


I think you want to register a custom overflow handler with your event.

So you get something like:


struct perf_event_attr rh_attr = {
	.type	= PERF_TYPE_HARDWARE,
	.config = PERF_COUNT_HW_CACHE_MISSES,
	.size	= sizeof(struct perf_event_attr),
	.pinned	= 1,
	.sample_period = 1000000,
};

static DEFINE_PER_CPU(struct perf_event *, rh_event);
static DEFINE_PER_CPU(u64, rh_timestamp);

static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
{
	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
	u64 now = ktime_get_mono_fast_ns();
	s64 delta = now - *ts;

	*ts = now;

	if (delta > 64 * NSEC_PER_USEC)
		udelay(58000);
}

__init int my_module_init()
{
	int cpu;

	/* XXX borken vs hotplug */

	for_each_online_cpu(cpu) {
		struct perf_event *event = per_cpu(event, cpu);

		event = perf_event_create_kernel_counter(&rh_attr, cpu, NULL, rh_overflow, NULL);
		if (!event)
			/* meh */
			;

	}
}

__exit void my_module_exit()
{
	int cpu;

	for_each_online_cpu(cpu) {
		struct perf_event *event = per_cpu(event, cpu);

		if (event)
			perf_event_release_kernel(event);
	}
}

* Re: Getting interrupt every million cache misses
  2016-10-27  9:33     ` Peter Zijlstra
@ 2016-10-27 20:40         ` Kees Cook
  0 siblings, 0 replies; 79+ messages in thread
From: Kees Cook @ 2016-10-27 20:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pavel Machek, Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin, kernel-hardening

On Thu, Oct 27, 2016 at 2:33 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Oct 27, 2016 at 11:11:04AM +0200, Pavel Machek wrote:
>> How to work around rowhammer, break my system _and_ make kernel perf
>> maintainers scream at the same time: (:-) )
>>
>> I think I got the place now. Let me try...
>
> Lol ;-)
>
>>
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index d31735f..ce83f5e 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -1495,6 +1495,11 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
>>
>>       perf_sample_event_took(finish_clock - start_clock);
>>
>> +     /* Here */
>> +     {
>> +             udelay(58000);
>> +     }
>> +
>>       return ret;
>>  }
>>  NOKPROBE_SYMBOL(perf_event_nmi_handler);
>
> Like you guess, not quite ;-)
>
>
> I think you want to register a custom overflow handler with your event.
>
> So you get something like:
>
>
> struct perf_event_attr rh_attr = {
>         .type   = PERF_TYPE_HARDWARE,
>         .config = PERF_COUNT_HW_CACHE_MISSES,
>         .size   = sizeof(struct perf_event_attr),
>         .pinned = 1,
>         .sample_period = 1000000,
> };
>
> static DEFINE_PER_CPU(struct perf_event *, rh_event);
> static DEFINE_PER_CPU(u64, rh_timestamp);
>
> static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> {
>         u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
>         u64 now = ktime_get_mono_fast_ns();
>         s64 delta = now - *ts;
>
>         *ts = now;
>
>         if (delta > 64 * NSEC_PER_USEC)
>                 udelay(58000);
> }
>
> __init int my_module_init()
> {
>         int cpu;
>
>         /* XXX borken vs hotplug */
>
>         for_each_online_cpu(cpu) {
>                 struct perf_event *event = per_cpu(event, cpu);
>
>                 event = perf_event_create_kernel_counter(&rh_attr, cpu, NULL, rh_overflow, NULL);
>                 if (!event)
>                         /* meh */
>                         ;
>
>         }
> }
>
> __exit void my_module_exit()
> {
>         int cpu;
>
>         for_each_online_cpu(cpu) {
>                 struct perf_event *event = per_cpu(event, cpu);
>
>                 if (event)
>                         perf_event_release_kernel(event);
>         }
> }

This is pretty cool. Are there workloads other than rowhammer that
could trip this, and if so, how bad would this delay be for them?

At the very least, this could be behind a CONFIG for people that don't
have a way to fix their RAM refresh timings, etc.

-Kees

-- 
Kees Cook
Nexus Security

* rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-27 20:40         ` [kernel-hardening] " Kees Cook
@ 2016-10-27 21:27           ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-27 21:27 UTC (permalink / raw)
  To: Kees Cook
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

Hi!

> >                 if (event)
> >                         perf_event_release_kernel(event);
> >         }
> > }
> 
> This is pretty cool. Are there workloads other than rowhammer that
> could trip this, and if so, how bad would this delay be for them?
> 
> At the very least, this could be behind a CONFIG for people that don't
> have a way to fix their RAM refresh timings, etc.

Yes, CONFIG_ is next.

Here's the patch, notice that I reversed the time handling logic -- it
should be correct now.

We can't tell cache misses on different addresses from cache misses on
the same address (rowhammer), so this will have false positives. But so
far, my machine seems to work.

Unfortunately, I don't have a machine suitable for testing nearby. Can
someone help with testing? [On the other hand... testing this is not
going to be easy. This will probably make the problem way harder to
reproduce in any case...]

I did run rowhammer, and yes, this did trigger and it was getting
delayed -- by a factor of 2. That is slightly low -- the delay should be
a factor of 8 to get guarantees, if I understand things correctly.

Oh and NMI gets quite angry, but that was to be expected.

[  112.476009] perf: interrupt took too long (23660454 > 23654965), lowering kernel.perf_event_max_sample_rate to 250
[  170.224007] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 55.844 msecs
[  191.872007] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 55.845 msecs

Best regards,
								Pavel

diff --git a/kernel/events/Makefile b/kernel/events/Makefile
index 2925188..130a185 100644
--- a/kernel/events/Makefile
+++ b/kernel/events/Makefile
@@ -2,7 +2,7 @@ ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_core.o = $(CC_FLAGS_FTRACE)
 endif
 
-obj-y := core.o ring_buffer.o callchain.o
+obj-y := core.o ring_buffer.o callchain.o nohammer.o
 
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
 obj-$(CONFIG_UPROBES) += uprobes.o
diff --git a/kernel/events/nohammer.c b/kernel/events/nohammer.c
new file mode 100644
index 0000000..01844d2
--- /dev/null
+++ b/kernel/events/nohammer.c
@@ -0,0 +1,66 @@
+/*
+ * Thanks to Peter Zijlstra <peterz@infradead.org>.
+ */
+
+#include <linux/perf_event.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+
+struct perf_event_attr rh_attr = {
+	.type	= PERF_TYPE_HARDWARE,
+	.config = PERF_COUNT_HW_CACHE_MISSES,
+	.size	= sizeof(struct perf_event_attr),
+	.pinned	= 1,
+	/* FIXME: it is 1000000 per cpu. */
+	.sample_period = 500000,
+};
+
+static DEFINE_PER_CPU(struct perf_event *, rh_event);
+static DEFINE_PER_CPU(u64, rh_timestamp);
+
+static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
+{
+	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
+	u64 now = ktime_get_mono_fast_ns();
+	s64 delta = now - *ts;
+
+	*ts = now;
+
+	/* FIXME msec per usec, reverse logic? */
+	if (delta < 64 * NSEC_PER_MSEC)
+		mdelay(56);
+}
+
+static __init int my_module_init(void)
+{
+	int cpu;
+
+	/* XXX borken vs hotplug */
+
+	for_each_online_cpu(cpu) {
+		struct perf_event *event = per_cpu(rh_event, cpu);
+
+		event = perf_event_create_kernel_counter(&rh_attr, cpu, NULL, rh_overflow, NULL);
+		if (!event)
+			pr_err("Not enough resources to initialize nohammer on cpu %d\n", cpu);
+		pr_info("Nohammer initialized on cpu %d\n", cpu);
+		
+	}
+	return 0;
+}
+
+static __exit void my_module_exit(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct perf_event *event = per_cpu(rh_event, cpu);
+
+		if (event)
+			perf_event_release_kernel(event);
+	}
+	return;
+}
+
+module_init(my_module_init);
+module_exit(my_module_exit);


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-27 21:27           ` [kernel-hardening] " Pavel Machek
@ 2016-10-28  7:07             ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2016-10-28  7:07 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening


* Pavel Machek <pavel@ucw.cz> wrote:

> +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
> +	u64 now = ktime_get_mono_fast_ns();
> +	s64 delta = now - *ts;
> +
> +	*ts = now;
> +
> +	/* FIXME msec per usec, reverse logic? */
> +	if (delta < 64 * NSEC_PER_MSEC)
> +		mdelay(56);
> +}

I'd suggest making the absolute delay sysctl-tunable, because 'wait 56 msecs' is
very magic, and do we know 100% that 56 msecs is what is needed everywhere?

Plus I'd also suggest exposing an 'NMI rowhammer delay count' in /proc/interrupts, 
to make it easier to debug this. (Perhaps only show the line if the count is 
nonzero.)

Finally, could we please also add a sysctl and Kconfig that allows this feature to 
be turned on/off, with the default bootup value determined by the Kconfig value 
(i.e. by the distribution)? Similar to CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE.

Thanks,

	Ingo

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  7:07             ` [kernel-hardening] " Ingo Molnar
@ 2016-10-28  8:50               ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28  8:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

On Fri 2016-10-28 09:07:01, Ingo Molnar wrote:
> 
> * Pavel Machek <pavel@ucw.cz> wrote:
> 
> > +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> > +{
> > +	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
> > +	u64 now = ktime_get_mono_fast_ns();
> > +	s64 delta = now - *ts;
> > +
> > +	*ts = now;
> > +
> > +	/* FIXME msec per usec, reverse logic? */
> > +	if (delta < 64 * NSEC_PER_MSEC)
> > +		mdelay(56);
> > +}
> 
> I'd suggest making the absolute delay sysctl-tunable, because 'wait 56 msecs' is
> very magic, and do we know 100% that 56 msecs is what is needed
> everywhere?

I agree this needs to be tunable (and with the other suggestions). But
this is actually not the most important tunable: the detection
threshold (rh_attr.sample_period) should be way more important.

And yes, this will all need to be tunable, somehow. But let's verify
that this works first :-).

Thanks and best regards,
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  8:50               ` [kernel-hardening] " Pavel Machek
@ 2016-10-28  8:59                 ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2016-10-28  8:59 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening


* Pavel Machek <pavel@ucw.cz> wrote:

> On Fri 2016-10-28 09:07:01, Ingo Molnar wrote:
> > 
> > * Pavel Machek <pavel@ucw.cz> wrote:
> > 
> > > +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> > > +{
> > > +	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
> > > +	u64 now = ktime_get_mono_fast_ns();
> > > +	s64 delta = now - *ts;
> > > +
> > > +	*ts = now;
> > > +
> > > +	/* FIXME msec per usec, reverse logic? */
> > > +	if (delta < 64 * NSEC_PER_MSEC)
> > > +		mdelay(56);
> > > +}
> > 
> > I'd suggest making the absolute delay sysctl tunable, because 'wait 56 msecs' is 
> > very magic, and do we know it 100% that 56 msecs is what is needed
> > everywhere?
> 
> I agree this needs to be tunable (and with the other suggestions). But
> this is actually not the most important tunable: the detection
> threshold (rh_attr.sample_period) should be way more important.
> 
> And yes, this will all need to be tunable, somehow. But lets verify
> that this works, first :-).

Yeah.

Btw., a 56 msecs NMI delay is pretty brutal in terms of latencies - it might
result in a smoother system to detect 100,000 cache misses and do a
~5.6 msecs delay instead?

(Assuming the shorter threshold does not trigger too often, of course.)

With all the tunables and statistics it would be possible to enumerate how 
frequently the protection mechanism kicks in during regular workloads.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  8:50               ` [kernel-hardening] " Pavel Machek
@ 2016-10-28  9:04                 ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-10-28  9:04 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Ingo Molnar, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

On Fri, Oct 28, 2016 at 10:50:39AM +0200, Pavel Machek wrote:
> On Fri 2016-10-28 09:07:01, Ingo Molnar wrote:
> > 
> > * Pavel Machek <pavel@ucw.cz> wrote:
> > 
> > > +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> > > +{
> > > +	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
> > > +	u64 now = ktime_get_mono_fast_ns();
> > > +	s64 delta = now - *ts;
> > > +
> > > +	*ts = now;
> > > +
> > > +	/* FIXME msec per usec, reverse logic? */
> > > +	if (delta < 64 * NSEC_PER_MSEC)
> > > +		mdelay(56);
> > > +}
> > 
> > I'd suggest making the absolute delay sysctl tunable, because 'wait 56 msecs' is 
> > very magic, and do we know it 100% that 56 msecs is what is needed
> > everywhere?
> 
> I agree this needs to be tunable (and with the other suggestions). But
> this is actually not the most important tunable: the detection
> threshold (rh_attr.sample_period) should be way more important.

So being totally ignorant of the detail of how rowhammer abuses the DDR
thing, would it make sense to trigger more often and delay shorter? Or
is there some minimal delay required for things to settle or something.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  9:04                 ` [kernel-hardening] " Peter Zijlstra
@ 2016-10-28  9:27                   ` Vegard Nossum
  -1 siblings, 0 replies; 79+ messages in thread
From: Vegard Nossum @ 2016-10-28  9:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pavel Machek, Ingo Molnar, Kees Cook, Arnaldo Carvalho de Melo,
	kernel list, Ingo Molnar, Alexander Shishkin, kernel-hardening

On 28 October 2016 at 11:04, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Oct 28, 2016 at 10:50:39AM +0200, Pavel Machek wrote:
>> On Fri 2016-10-28 09:07:01, Ingo Molnar wrote:
>> >
>> > * Pavel Machek <pavel@ucw.cz> wrote:
>> >
>> > > +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
>> > > +{
>> > > + u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
>> > > + u64 now = ktime_get_mono_fast_ns();
>> > > + s64 delta = now - *ts;
>> > > +
>> > > + *ts = now;
>> > > +
>> > > + /* FIXME msec per usec, reverse logic? */
>> > > + if (delta < 64 * NSEC_PER_MSEC)
>> > > +         mdelay(56);
>> > > +}
>> >
>> > I'd suggest making the absolute delay sysctl tunable, because 'wait 56 msecs' is
>> > very magic, and do we know it 100% that 56 msecs is what is needed
>> > everywhere?
>>
>> I agree this needs to be tunable (and with the other suggestions). But
>> this is actually not the most important tunable: the detection
>> threshold (rh_attr.sample_period) should be way more important.
>
> So being totally ignorant of the detail of how rowhammer abuses the DDR
> thing, would it make sense to trigger more often and delay shorter? Or
> is there some minimal delay required for things to settle or something.

Would it make sense to sample the counter on context switch, do some
accounting on a per-task cache miss counter, and slow down just the
single task(s) with a too high cache miss rate? That way there's no
global slowdown (which I assume would be the case here). The task's
slice of CPU would have to be taken into account because otherwise you
could have multiple cooperating tasks that each escape the limit but
taken together go above it.


Vegard

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  9:27                   ` [kernel-hardening] " Vegard Nossum
@ 2016-10-28  9:35                     ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2016-10-28  9:35 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Peter Zijlstra, Pavel Machek, Kees Cook,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin, kernel-hardening


* Vegard Nossum <vegard.nossum@gmail.com> wrote:

> Would it make sense to sample the counter on context switch, do some
> accounting on a per-task cache miss counter, and slow down just the
> single task(s) with a too high cache miss rate? That way there's no
> global slowdown (which I assume would be the case here). The task's
> slice of CPU would have to be taken into account because otherwise you
> could have multiple cooperating tasks that each escape the limit but
> taken together go above it.

Attackers could work this around by splitting the rowhammer workload between 
multiple threads/processes.

I.e. the problem is that the risk may come from any 'unprivileged user-space 
code', where the rowhammer workload might be spread over multiple threads, 
processes or even users.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  9:35                     ` [kernel-hardening] " Ingo Molnar
@ 2016-10-28  9:47                       ` Vegard Nossum
  -1 siblings, 0 replies; 79+ messages in thread
From: Vegard Nossum @ 2016-10-28  9:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Pavel Machek, Kees Cook,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin, kernel-hardening

On 28 October 2016 at 11:35, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Vegard Nossum <vegard.nossum@gmail.com> wrote:
>
>> Would it make sense to sample the counter on context switch, do some
>> accounting on a per-task cache miss counter, and slow down just the
>> single task(s) with a too high cache miss rate? That way there's no
>> global slowdown (which I assume would be the case here). The task's
>> slice of CPU would have to be taken into account because otherwise you
>> could have multiple cooperating tasks that each escape the limit but
>> taken together go above it.
>
> Attackers could work this around by splitting the rowhammer workload between
> multiple threads/processes.
>
> I.e. the problem is that the risk may come from any 'unprivileged user-space
> code', where the rowhammer workload might be spread over multiple threads,
> processes or even users.

That's why I emphasised the number of misses per CPU slice rather than
just the total number of misses. I assumed there must be at least one
task continuously hammering memory for a successful attack, in which
case it should be observable with as little as 1 slice of CPU (however
long that is), no?


Vegard

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-27 21:27           ` [kernel-hardening] " Pavel Machek
@ 2016-10-28  9:51             ` Mark Rutland
  -1 siblings, 0 replies; 79+ messages in thread
From: Mark Rutland @ 2016-10-28  9:51 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

Hi,

I missed the original, so I've lost some context.

Has this been tested on a system vulnerable to rowhammer, and if so, was
it reliable in mitigating the issue?

Which particular attack codebase was it tested against?

On Thu, Oct 27, 2016 at 11:27:47PM +0200, Pavel Machek wrote:
> --- /dev/null
> +++ b/kernel/events/nohammer.c
> @@ -0,0 +1,66 @@
> +/*
> + * Thanks to Peter Zijlstra <peterz@infradead.org>.
> + */
> +
> +#include <linux/perf_event.h>
> +#include <linux/module.h>
> +#include <linux/delay.h>
> +
> +struct perf_event_attr rh_attr = {
> +	.type	= PERF_TYPE_HARDWARE,
> +	.config = PERF_COUNT_HW_CACHE_MISSES,
> +	.size	= sizeof(struct perf_event_attr),
> +	.pinned	= 1,
> +	/* FIXME: it is 1000000 per cpu. */
> +	.sample_period = 500000,
> +};

I'm not sure that this is general enough to live in core code, because:

* there are existing ways around this (e.g. in the drammer case, using a
  non-cacheable mapping, which I don't believe would count as a cache
  miss).

  Given that, I'm very worried that this gives the false impression of
  protection in cases where a software workaround of this sort is
  insufficient or impossible.

* the precise semantics of performance counter events varies drastically
  across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
  one particular level of cache, and/or may not be implemented on all
  cores.

* On some implementations, it may be that the counters are not
  interchangeable, and for those this would take away
  PERF_COUNT_HW_CACHE_MISSES from existing users.

> +static DEFINE_PER_CPU(struct perf_event *, rh_event);
> +static DEFINE_PER_CPU(u64, rh_timestamp);
> +
> +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
> +	u64 now = ktime_get_mono_fast_ns();
> +	s64 delta = now - *ts;
> +
> +	*ts = now;
> +
> +	/* FIXME msec per usec, reverse logic? */
> +	if (delta < 64 * NSEC_PER_MSEC)
> +		mdelay(56);
> +}

If I round-robin my attack across CPUs, how much does this help?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  9:35                     ` [kernel-hardening] " Ingo Molnar
  (?)
  (?)
@ 2016-10-28  9:53                     ` Mark Rutland
  -1 siblings, 0 replies; 79+ messages in thread
From: Mark Rutland @ 2016-10-28  9:53 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Vegard Nossum, Peter Zijlstra, Pavel Machek, Kees Cook,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On Fri, Oct 28, 2016 at 11:35:47AM +0200, Ingo Molnar wrote:
> 
> * Vegard Nossum <vegard.nossum@gmail.com> wrote:
> 
> > Would it make sense to sample the counter on context switch, do some
> > accounting on a per-task cache miss counter, and slow down just the
> > single task(s) with a too high cache miss rate? That way there's no
> > global slowdown (which I assume would be the case here). The task's
> > slice of CPU would have to be taken into account because otherwise you
> > could have multiple cooperating tasks that each escape the limit but
> > taken together go above it.
> 
> Attackers could work this around by splitting the rowhammer workload between 
> multiple threads/processes.

With the proposed approach, they could split across multiple CPUs
instead, no?

... or was that covered in a prior thread?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  9:51             ` Mark Rutland
@ 2016-10-28 11:21               ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28 11:21 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 8025 bytes --]

Hi!

> I missed the original, so I've lost some context.

You can read it on lkml, but I guess you did not lose anything
important.

> Has this been tested on a system vulnerable to rowhammer, and if so, was
> it reliable in mitigating the issue?
> 
> Which particular attack codebase was it tested against?

I have rowhammer-test here:

commit 9824453fff76e0a3f5d1ac8200bc6c447c4fff57
Author: Mark Seaborn <mseaborn@chromium.org>

I do not have a vulnerable machine near me, so no "real" tests, but
I'm pretty sure the newer version will make the error no longer
reproducible. [Help welcome ;-)]

> > +struct perf_event_attr rh_attr = {
> > +	.type	= PERF_TYPE_HARDWARE,
> > +	.config = PERF_COUNT_HW_CACHE_MISSES,
> > +	.size	= sizeof(struct perf_event_attr),
> > +	.pinned	= 1,
> > +	/* FIXME: it is 1000000 per cpu. */
> > +	.sample_period = 500000,
> > +};
> 
> I'm not sure that this is general enough to live in core code, because:

Well, I'd like to postpone the debate about 'where does it live' to a
later stage. The problem is not arch-specific, and the solution is not
too arch-specific either. I believe we can use Kconfig to hide it from
users where it does not apply. Anyway, let's decide if it works and
where, first.

> * the precise semantics of performance counter events varies drastically
>   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
>   one particular level of cache, and/or may not be implemented on all
>   cores.

If it maps to one particular cache level, we are fine (or maybe will
trigger protection too often). If some cores are not counted, that's
bad.

> * On some implementations, it may be that the counters are not
>   interchangeable, and for those this would take away
>   PERF_COUNT_HW_CACHE_MISSES from existing users.

Yup. Note that with this kind of protection, one missing performance
counter is likely to be a small problem.

> > +	*ts = now;
> > +
> > +	/* FIXME msec per usec, reverse logic? */
> > +	if (delta < 64 * NSEC_PER_MSEC)
> > +		mdelay(56);
> > +}
> 
> If I round-robin my attack across CPUs, how much does this help?

See below for a new explanation. With 2 CPUs, we are fine. On monster
big.LITTLE 8-core machines, we'd probably trigger protection too
often.

								Pavel

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e24e981..c6ffcaf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -315,6 +315,7 @@ config PGTABLE_LEVELS
 
 source "init/Kconfig"
 source "kernel/Kconfig.freezer"
+source "kernel/events/Kconfig"
 
 menu "Processor type and features"
 
diff --git a/kernel/events/Kconfig b/kernel/events/Kconfig
new file mode 100644
index 0000000..7359427
--- /dev/null
+++ b/kernel/events/Kconfig
@@ -0,0 +1,9 @@
+config NOHAMMER
+        tristate "Rowhammer protection"
+        help
+	  Enable rowhammer attack prevention. Under attack, system
+	  performance will be degraded so much that the attack should
+	  not be feasible.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called nohammer.
diff --git a/kernel/events/Makefile b/kernel/events/Makefile
index 2925188..03a2785 100644
--- a/kernel/events/Makefile
+++ b/kernel/events/Makefile
@@ -4,6 +4,8 @@ endif
 
 obj-y := core.o ring_buffer.o callchain.o
 
+obj-$(CONFIG_NOHAMMER) += nohammer.o
+
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
 obj-$(CONFIG_UPROBES) += uprobes.o
 
diff --git a/kernel/events/nohammer.c b/kernel/events/nohammer.c
new file mode 100644
index 0000000..d96bacd
--- /dev/null
+++ b/kernel/events/nohammer.c
@@ -0,0 +1,140 @@
+/*
+ * Attempt to prevent rowhammer attack.
+ *
+ * On many new DRAM chips, repeated read access to nearby cells can cause
+ * victim cell to flip bits. Unfortunately, that can be used to gain root
+ * on affected machine, or to execute native code from javascript, escaping
+ * the sandbox.
+ *
+ * Fortunately, a lot of memory accesses are needed between DRAM refresh
+ * cycles. This is a rather unusual workload, so we can detect it and
+ * throttle the DRAM accesses before bit flips happen.
+ *
+ * Thanks to Peter Zijlstra <peterz@infradead.org>.
+ * Thanks to presentation at blackhat.
+ */
+
+#include <linux/perf_event.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+
+static struct perf_event_attr rh_attr = {
+	.type	= PERF_TYPE_HARDWARE,
+	.config = PERF_COUNT_HW_CACHE_MISSES,
+	.size	= sizeof(struct perf_event_attr),
+	.pinned	= 1,
+	.sample_period = 10000,
+};
+
+/*
+ * How often is the DRAM refreshed. Setting it too high is safe.
+ */
+static int dram_refresh_msec = 64;
+
+static DEFINE_PER_CPU(struct perf_event *, rh_event);
+static DEFINE_PER_CPU(u64, rh_timestamp);
+
+static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
+{
+	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
+	u64 now = ktime_get_mono_fast_ns();
+	s64 delta = now - *ts;
+
+	*ts = now;
+
+	if (delta < dram_refresh_msec * NSEC_PER_MSEC)
+		mdelay(dram_refresh_msec);
+}
+
+static __init int rh_module_init(void)
+{
+	int cpu;
+
+/*
+ * DRAM refresh is every 64 msec. That is not enough to prevent rowhammer.
+ * Some vendors doubled the refresh rate to 32 msec, that helps a lot, but
+ * does not close the attack completely. 8 msec refresh would probably do
+ * that on almost all chips.
+ *
+ * A Thinkpad X60 can produce circa 12,200,000 cache misses a second; that's
+ * 780,800 cache misses per 64 msec window.
+ *
+ * The X60 is from a generation that is not yet vulnerable to rowhammer, and
+ * is a pretty slow machine. That means that this limit is probably very
+ * safe on newer machines.
+ */
+	int cache_misses_per_second = 12200000;
+
+/*
+ * Maximum permitted utilization of DRAM. Setting this to f means that
+ * when more than 1/f of the maximum cache-miss performance is used, a
+ * delay will be inserted, which has a similar effect on rowhammer as
+ * refreshing memory f times more often.
+ *
+ * Setting this to 8 should prevent the rowhammer attack.
+ */
+	int dram_max_utilization_factor = 8;
+
+	/*
+	 * Hardware should be able to do approximately this many
+	 * misses per refresh
+	 */
+	int cache_miss_per_refresh = (cache_misses_per_second * dram_refresh_msec)/1000;
+
+	/*
+	 * So we do not want more than this many accesses to DRAM per
+	 * refresh.
+	 */
+	int cache_miss_limit = cache_miss_per_refresh / dram_max_utilization_factor;
+
+/*
+ * DRAM is shared between CPUs, but these performance counters are per-CPU.
+ */
+	int max_attacking_cpus = 2;
+
+	/*
+	 * We ignore counter overflows "too far away", but some of the
+	 * events might have actually occurred recently. Thus the additional
+	 * factor of 2.
+	 */
+
+	rh_attr.sample_period = cache_miss_limit / (2*max_attacking_cpus);
+
+	printk("Rowhammer protection limit is set to %d cache misses per %d msec\n",
+	       (int) rh_attr.sample_period, dram_refresh_msec);
+
+	/* XXX broken vs CPU hotplug */
+
+	for_each_online_cpu(cpu) {
+		struct perf_event *event;
+
+		event = perf_event_create_kernel_counter(&rh_attr, cpu, NULL, rh_overflow, NULL);
+		if (IS_ERR(event)) {
+			per_cpu(rh_event, cpu) = NULL;
+			pr_err("Not enough resources to initialize nohammer on cpu %d\n", cpu);
+			continue;
+		}
+		per_cpu(rh_event, cpu) = event;
+		pr_info("Nohammer initialized on cpu %d\n", cpu);
+	}
+	return 0;
+}
+
+static __exit void rh_module_exit(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct perf_event *event = per_cpu(rh_event, cpu);
+
+		if (event)
+			perf_event_release_kernel(event);
+	}
+	return;
+}
+
+module_init(rh_module_init);
+module_exit(rh_module_exit);
+
+MODULE_DESCRIPTION("Rowhammer protection");
+//MODULE_LICENSE("GPL v2+");
+MODULE_LICENSE("GPL");


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
@ 2016-10-28 11:21               ` Pavel Machek
  0 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28 11:21 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 8025 bytes --]

Hi!

> I missed the original, so I've lost some context.

You can read it on lkml, but I guess you did not lose anything
important.

> Has this been tested on a system vulnerable to rowhammer, and if so, was
> it reliable in mitigating the issue?
> 
> Which particular attack codebase was it tested against?

I have rowhammer-test here,

commit 9824453fff76e0a3f5d1ac8200bc6c447c4fff57
Author: Mark Seaborn <mseaborn@chromium.org>

. I do not have vulnerable machine near me, so no "real" tests, but
I'm pretty sure it will make the error no longer reproducible with the
newer version. [Help welcome ;-)]

> > +struct perf_event_attr rh_attr = {
> > +	.type	= PERF_TYPE_HARDWARE,
> > +	.config = PERF_COUNT_HW_CACHE_MISSES,
> > +	.size	= sizeof(struct perf_event_attr),
> > +	.pinned	= 1,
> > +	/* FIXME: it is 1000000 per cpu. */
> > +	.sample_period = 500000,
> > +};
> 
> I'm not sure that this is general enough to live in core code, because:

Well, I'd like to postpone debate 'where does it live' to the later
stage. The problem is not arch-specific, the solution is not too
arch-specific either. I believe we can use Kconfig to hide it from
users where it does not apply. Anyway, let's decide if it works and
where, first.

> * the precise semantics of performance counter events varies drastically
>   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
>   one particular level of cache, and/or may not be implemented on all
>   cores.

If it maps to one particular cache level, we are fine (or maybe will
trigger protection too often). If some cores are not counted, that's
bad.

> * On some implementations, it may be that the counters are not
>   interchangeable, and for those this would take away
>   PERF_COUNT_HW_CACHE_MISSES from existing users.

Yup. Note that with this kind of protection, one missing performance
counter is likely to be a small problem.

> > +	*ts = now;
> > +
> > +	/* FIXME msec per usec, reverse logic? */
> > +	if (delta < 64 * NSEC_PER_MSEC)
> > +		mdelay(56);
> > +}
> 
> If I round-robin my attack across CPUs, how much does this help?

See below for new explanation. With 2 CPUs, we are fine. On monster
big-little 8-core machines, we'd probably trigger protection too
often.

								Pavel

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e24e981..c6ffcaf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -315,6 +315,7 @@ config PGTABLE_LEVELS
 
 source "init/Kconfig"
 source "kernel/Kconfig.freezer"
+source "kernel/events/Kconfig"
 
 menu "Processor type and features"
 
diff --git a/kernel/events/Kconfig b/kernel/events/Kconfig
new file mode 100644
index 0000000..7359427
--- /dev/null
+++ b/kernel/events/Kconfig
@@ -0,0 +1,9 @@
+config NOHAMMER
+        tristate "Rowhammer protection"
+        help
+	  Enable rowhammer attack prevention. Will degrade system
+	  performance under attack so much that the attack should not
+	  be feasible.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called nohammer.
diff --git a/kernel/events/Makefile b/kernel/events/Makefile
index 2925188..03a2785 100644
--- a/kernel/events/Makefile
+++ b/kernel/events/Makefile
@@ -4,6 +4,8 @@ endif
 
 obj-y := core.o ring_buffer.o callchain.o
 
+obj-$(CONFIG_NOHAMMER) += nohammer.o
+
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
 obj-$(CONFIG_UPROBES) += uprobes.o
 
diff --git a/kernel/events/nohammer.c b/kernel/events/nohammer.c
new file mode 100644
index 0000000..d96bacd
--- /dev/null
+++ b/kernel/events/nohammer.c
@@ -0,0 +1,140 @@
+/*
+ * Attempt to prevent rowhammer attack.
+ *
+ * On many new DRAM chips, repeated read access to nearby cells can cause
+ * a victim cell to flip bits. Unfortunately, that can be used to gain root
+ * on the affected machine, or to execute native code from JavaScript,
+ * escaping the sandbox.
+ *
+ * Fortunately, a lot of memory accesses are needed between DRAM refresh
+ * cycles. This is a rather unusual workload, and we can detect it and
+ * prevent the DRAM accesses before bit flips happen.
+ *
+ * Thanks to Peter Zijlstra <peterz@infradead.org>.
+ * Thanks to presentation at blackhat.
+ */
+
+#include <linux/perf_event.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+
+static struct perf_event_attr rh_attr = {
+	.type	= PERF_TYPE_HARDWARE,
+	.config = PERF_COUNT_HW_CACHE_MISSES,
+	.size	= sizeof(struct perf_event_attr),
+	.pinned	= 1,
+	.sample_period = 10000,
+};
+
+/*
+ * How often is the DRAM refreshed. Setting it too high is safe.
+ */
+static int dram_refresh_msec = 64;
+
+static DEFINE_PER_CPU(struct perf_event *, rh_event);
+static DEFINE_PER_CPU(u64, rh_timestamp);
+
+static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
+{
+	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
+	u64 now = ktime_get_mono_fast_ns();
+	s64 delta = now - *ts;
+
+	*ts = now;
+
+	if (delta < dram_refresh_msec * NSEC_PER_MSEC)
+		mdelay(dram_refresh_msec);
+}
+
+static __init int rh_module_init(void)
+{
+	int cpu;
+
+/*
+ * DRAM refresh is every 64 msec. That is not enough to prevent rowhammer.
+ * Some vendors doubled the refresh rate to 32 msec, that helps a lot, but
+ * does not close the attack completely. 8 msec refresh would probably do
+ * that on almost all chips.
+ *
+ * Thinkpad X60 can produce circa 12,200,000 cache misses a second, that's
+ * 780,800 cache misses per 64 msec window.
+ *
+ * The X60 is from a generation that is not yet vulnerable to rowhammer, and
+ * is a pretty slow machine. That means that this limit is probably very
+ * safe on newer machines.
+ */
+	int cache_misses_per_second = 12200000;
+
+/*
+ * Maximum permitted utilization of DRAM. Setting this to f will mean that
+ * when more than 1/f of maximum cache-miss performance is used, a delay will
+ * be inserted, with a similar effect on rowhammer as refreshing memory
+ * f times more often.
+ *
+ * Setting this to 8 should prevent the rowhammer attack.
+ */
+	int dram_max_utilization_factor = 8;
+
+	/*
+	 * Hardware should be able to do approximately this many
+	 * misses per refresh
+	 */
+	int cache_miss_per_refresh = (cache_misses_per_second * dram_refresh_msec)/1000;
+
+	/*
+	 * So we do not want more than this many accesses to DRAM per
+	 * refresh.
+	 */
+	int cache_miss_limit = cache_miss_per_refresh / dram_max_utilization_factor;
+
+/*
+ * DRAM is shared between CPUs, but these performance counters are per-CPU.
+ */
+	int max_attacking_cpus = 2;
+
+	/*
+	 * We ignore counter overflows "too far away", but some of the
+	 * events might have actually occurred recently. Thus the additional
+	 * factor of 2.
+	 */
+
+	rh_attr.sample_period = cache_miss_limit / (2*max_attacking_cpus);
+
+	printk("Rowhammer protection limit is set to %d cache misses per %d msec\n",
+	       (int) rh_attr.sample_period, dram_refresh_msec);
+
+	/* XXX borken vs hotplug */
+
+	for_each_online_cpu(cpu) {
+		struct perf_event *event;
+
+		event = perf_event_create_kernel_counter(&rh_attr, cpu, NULL, rh_overflow, NULL);
+		if (IS_ERR(event)) {
+			pr_err("Not enough resources to initialize nohammer on cpu %d\n", cpu);
+			continue;
+		}
+		per_cpu(rh_event, cpu) = event;
+		pr_info("Nohammer initialized on cpu %d\n", cpu);
+	}
+	return 0;
+}
+
+static __exit void rh_module_exit(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct perf_event *event = per_cpu(rh_event, cpu);
+
+		if (event)
+			perf_event_release_kernel(event);
+	}
+	return;
+}
+
+module_init(rh_module_init);
+module_exit(rh_module_exit);
+
+MODULE_DESCRIPTION("Rowhammer protection");
+//MODULE_LICENSE("GPL v2+");
+MODULE_LICENSE("GPL");


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  9:04                 ` [kernel-hardening] " Peter Zijlstra
@ 2016-10-28 11:27                   ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28 11:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 987 bytes --]

Hi!

> > I agree this needs to be tunable (and with the other suggestions). But
> > this is actually not the most important tunable: the detection
> > threshold (rh_attr.sample_period) should be way more important.
> 
> So being totally ignorant of the detail of how rowhammer abuses the DDR
> thing, would it make sense to trigger more often and delay shorter? Or
> is there some minimal delay required for things to settle or
> something.

We can trigger more often and delay shorter, but it will mean that
protection will trigger with more false positives. I guess I'll play
with constants to see how big the effect is.

BTW...

[ 6267.180092] INFO: NMI handler (perf_event_nmi_handler) took too
long to run: 63.501 msecs

but I'm doing mdelay(64). 0.5 msec is not a big difference, but...

Best regards,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28  8:59                 ` [kernel-hardening] " Ingo Molnar
@ 2016-10-28 11:55                   ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28 11:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 1557 bytes --]

Hi!

> > I agree this needs to be tunable (and with the other suggestions). But
> > this is actually not the most important tunable: the detection
> > threshold (rh_attr.sample_period) should be way more important.
> > 
> > And yes, this will all need to be tunable, somehow. But lets verify
> > that this works, first :-).
> 
> Yeah.
> 
> Btw., a 56 msec NMI delay is pretty brutal in terms of latencies - it might
> result in a smoother system to detect 100,000 cache misses and do a
> ~5.6 msecs delay instead?
> 
> (Assuming the shorter threshold does not trigger too often, of
> course.)

Yeah, it is a brutal workaround for a nasty bug. The slowdown depends on the maximum permitted utilization:

+/*
+ * Maximum permitted utilization of DRAM. Setting this to f will mean that
+ * when more than 1/f of maximum cache-miss performance is used, delay will
+ * be inserted, and will have similar effect on rowhammer as refreshing memory
+ * f times more often.
+ *
+ * Setting this to 8 should prevent the rowhammer attack.
+ */
+       int dram_max_utilization_factor = 8;

|                               | no prot. | fact. 1 | fact. 2 | fact. 8 |
| linux-n900$ time ./mkit       | 1m35     | 1m47    | 2m07    | 6m37    |
| rowhammer-test (for 43200000) | 2.86     | 9.75    | 16.7307 | 59.3738 |

(With factor 1 and 2 cpu attacker, we don't guarantee any protection.)

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 11:21               ` Pavel Machek
@ 2016-10-28 14:05                 ` Mark Rutland
  -1 siblings, 0 replies; 79+ messages in thread
From: Mark Rutland @ 2016-10-28 14:05 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

Hi,

On Fri, Oct 28, 2016 at 01:21:36PM +0200, Pavel Machek wrote:
> > Has this been tested on a system vulnerable to rowhammer, and if so, was
> > it reliable in mitigating the issue?
> > 
> > Which particular attack codebase was it tested against?
> 
> I have rowhammer-test here,
> 
> commit 9824453fff76e0a3f5d1ac8200bc6c447c4fff57
> Author: Mark Seaborn <mseaborn@chromium.org>

... from which repo?

> I do not have vulnerable machine near me, so no "real" tests, but
> I'm pretty sure it will make the error no longer reproducible with the
> newer version. [Help welcome ;-)]

Even if we hope this works, I think we have to be very careful with that
kind of assertion. Until we have data as to its efficacy, I don't think
we should claim that this is an effective mitigation.

> > > +struct perf_event_attr rh_attr = {
> > > +	.type	= PERF_TYPE_HARDWARE,
> > > +	.config = PERF_COUNT_HW_CACHE_MISSES,
> > > +	.size	= sizeof(struct perf_event_attr),
> > > +	.pinned	= 1,
> > > +	/* FIXME: it is 1000000 per cpu. */
> > > +	.sample_period = 500000,
> > > +};
> > 
> > I'm not sure that this is general enough to live in core code, because:
> 
> Well, I'd like to postpone debate 'where does it live' to the later
> stage. The problem is not arch-specific, the solution is not too
> arch-specific either. I believe we can use Kconfig to hide it from
> users where it does not apply. Anyway, lets decide if it works and
> where, first.

You seem to have forgotten the drammer case here, which this would not
have protected against. I'm not sure, but I suspect that we could have
similar issues with mappings using other attributes (e.g. write-through),
as these would cause memory traffic without cache-miss events.

> > * the precise semantics of performance counter events varies drastically
> >   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
> >   one particular level of cache, and/or may not be implemented on all
> >   cores.
> 
> If it maps to one particular cache level, we are fine (or maybe will
> trigger protection too often). If some cores are not counted, that's bad.

Perhaps, but that depends on a number of implementation details. If "too
often" means "all the time", people will turn this off when they could
otherwise have been protected (e.g. if we can accurately monitor the
last level of cache).

> > * On some implementations, it may be that the counters are not
> >   interchangeable, and for those this would take away
> >   PERF_COUNT_HW_CACHE_MISSES from existing users.
> 
> Yup. Note that with this kind of protection, one missing performance
> counter is likely to be a small problem.

That depends. Who chooses when to turn this on? If it's down to the
distro, this can adversely affect users with perfectly safe DRAM.

> > > +	/* FIXME msec per usec, reverse logic? */
> > > +	if (delta < 64 * NSEC_PER_MSEC)
> > > +		mdelay(56);
> > > +}
> > 
> > If I round-robin my attack across CPUs, how much does this help?
> 
> See below for new explanation. With 2 CPUs, we are fine. On monster
> big-little 8-core machines, we'd probably trigger protection too
> often.

We see larger core counts in mobile devices these days. In China,
octa-core phones are popular, for example. Servers go much larger.

> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index e24e981..c6ffcaf 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -315,6 +315,7 @@ config PGTABLE_LEVELS
>  
>  source "init/Kconfig"
>  source "kernel/Kconfig.freezer"
> +source "kernel/events/Kconfig"
>  
>  menu "Processor type and features"
>  
> diff --git a/kernel/events/Kconfig b/kernel/events/Kconfig
> new file mode 100644
> index 0000000..7359427
> --- /dev/null
> +++ b/kernel/events/Kconfig
> @@ -0,0 +1,9 @@
> +config NOHAMMER
> +        tristate "Rowhammer protection"
> +        help
> +	  Enable rowhammer attack prevention. Will degrade system
> +	  performance under attack so much that attack should not
> +	  be feasible.


I think that this must make it clear that this is a best-effort approach
(i.e. it does not guarantee that an attack is not possible), and also
should make clear that said penalty may occur in other situations.

[...]

> +static struct perf_event_attr rh_attr = {
> +	.type	= PERF_TYPE_HARDWARE,
> +	.config = PERF_COUNT_HW_CACHE_MISSES,
> +	.size	= sizeof(struct perf_event_attr),
> +	.pinned	= 1,
> +	.sample_period = 10000,
> +};

What kind of overhead (just from taking the interrupt) will this come
with?

> +/*
> + * How often is the DRAM refreshed. Setting it too high is safe.
> + */

Stale comment? Given the check against delta below, this doesn't look to
be true.

> +static int dram_refresh_msec = 64;
> +
> +static DEFINE_PER_CPU(struct perf_event *, rh_event);
> +static DEFINE_PER_CPU(u64, rh_timestamp);
> +
> +static void rh_overflow(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	u64 *ts = this_cpu_ptr(&rh_timestamp); /* this is NMI context */
> +	u64 now = ktime_get_mono_fast_ns();
> +	s64 delta = now - *ts;
> +
> +	*ts = now;
> +
> +	if (delta < dram_refresh_msec * NSEC_PER_MSEC)
> +		mdelay(dram_refresh_msec);
> +}

[...]

> +/*
> + * DRAM is shared between CPUs, but these performance counters are per-CPU.
> + */
> +	int max_attacking_cpus = 2;

As above, many systems today have more than two CPUs. In the drammer
paper, it looks like the majority had four.

Thanks
Mark.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 14:05                 ` Mark Rutland
@ 2016-10-28 14:18                   ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-10-28 14:18 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Pavel Machek, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

On Fri, Oct 28, 2016 at 03:05:22PM +0100, Mark Rutland wrote:
> 
> > > * the precise semantics of performance counter events varies drastically
> > >   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
> > >   one particular level of cache, and/or may not be implemented on all
> > >   cores.
> > 
> > If it maps to one particular cache level, we are fine (or maybe will
> > trigger protection too often). If some cores are not counted, that's bad.
> 
> Perhaps, but that depends on a number of implementation details. If "too
> often" means "all the time", people will turn this off when they could
> otherwise have been protected (e.g. if we can accurately monitor the
> last level of cache).

Right, so one of the things mentioned in the paper is x86 NT stores.
Those are not cached and I'm not at all sure they're accounted in the
event we use for cache misses.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 14:05                 ` Mark Rutland
@ 2016-10-28 17:27                   ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28 17:27 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 5276 bytes --]

Hi!

> On Fri, Oct 28, 2016 at 01:21:36PM +0200, Pavel Machek wrote:
> > > Has this been tested on a system vulnerable to rowhammer, and if so, was
> > > it reliable in mitigating the issue?
> > > 
> > > Which particular attack codebase was it tested against?
> > 
> > I have rowhammer-test here,
> > 
> > commit 9824453fff76e0a3f5d1ac8200bc6c447c4fff57
> > Author: Mark Seaborn <mseaborn@chromium.org>
> 
> ... from which repo?

https://github.com/mseaborn/rowhammer-test.git

> > I do not have vulnerable machine near me, so no "real" tests, but
> > I'm pretty sure it will make the error no longer reproducible with the
> > newer version. [Help welcome ;-)]
> 
> Even if we hope this works, I think we have to be very careful with that
> kind of assertion. Until we have data as to its efficacy, I don't think
> we should claim that this is an effective mitigation.

On my hardware, rowhammer errors are not trivial to reproduce. It
takes time (minutes). I'm pretty sure this will be enough to stop the
exploit. If you have machines where rowhammer errors are really easy
to reproduce, testing on it would be welcome.

> > Well, I'd like to postpone debate 'where does it live' to the later
> > stage. The problem is not arch-specific, the solution is not too
> > arch-specific either. I believe we can use Kconfig to hide it from
> > users where it does not apply. Anyway, let's decide if it works and
> > where, first.
> 
> You seem to have forgotten the drammer case here, which this would not
> have protected against. I'm not sure, but I suspect that we could have
> similar issues with mappings using other attributes (e.g write-through),
> as these would cause the memory traffic without cache miss events.

Can you get me example code for x86 or x86-64? If this is trivial to
workaround using movnt or something like that, it would be good to
know.

I did not go through the drammer paper in too great detail. They have
some kind of DMA-able memory, and they abuse it to do direct writes?
So you can "simply" stop providing DMA-able memory to the userland,
right? [Ok, bye bye accelerated graphics, I guess. But living w/o
graphics acceleration is preferable to remote root...]

OTOH... the exploit that scares me most is javascript sandbox
escape. I should be able to stop that... and other JIT escape cases
where untrusted code does not have access to special instructions.

On x86, there seems to be a "DATA_MEM_REFS" performance counter; if
cache misses do not account for movnt, this one should. Will need checking.

> Perhaps, but that depends on a number of implementation details. If "too
> often" means "all the time", people will turn this off when they could
> otherwise have been protected (e.g. if we can accurately monitor the
> last level of cache).

Yup. Doing it well is preferable to doing it badly.

> > > * On some implementations, it may be that the counters are not
> > >   interchangeable, and for those this would take away
> > >   PERF_COUNT_HW_CACHE_MISSES from existing users.
> > 
> > Yup. Note that with this kind of protection, one missing performance
> > counter is likely to be a small problem.
> 
> That depends. Who chooses when to turn this on? If it's down to the
> distro, this can adversely affect users with perfectly safe DRAM.

You don't want this enabled on machines with working DRAM, there will
be performance impact.

> > > > +	/* FIXME msec per usec, reverse logic? */
> > > > +	if (delta < 64 * NSEC_PER_MSEC)
> > > > +		mdelay(56);
> > > > +}
> > > 
> > > If I round-robin my attack across CPUs, how much does this help?
> > 
> > See below for new explanation. With 2 CPUs, we are fine. On monster
> > big-little 8-core machines, we'd probably trigger protection too
> > often.
> 
> We see larger core counts in mobile devices these days. In China,
> octa-core phones are popular, for example. Servers go much larger.

Well, I can't help everyone :-(. On servers, there's ECC. On phones,
well, don't buy broken machines. This will work, but the performance
impact will not be nice.

> > +static struct perf_event_attr rh_attr = {
...
> > +	.sample_period = 10000,
> > +};
> 
> What kind of overhead (just from taking the interrupt) will this come
> with?

This is not used, see below.

> > +/*
> > + * How often is the DRAM refreshed. Setting it too high is safe.
> > + */
> 
> Stale comment? Given the check against delta below, this doesn't look to
> be true.

Thinko, actually. Too low is safe, AFAICT.

> > +/*
> > + * DRAM is shared between CPUs, but these performance counters are per-CPU.
> > + */
> > +	int max_attacking_cpus = 2;
> 
> As above, many systems today have more than two CPUs. In the drammmer
> paper, it looks like the majority had four.

We can set this automatically, and we should also take CPU hotplug
into account. But let's get it working first.

Actually, in the ARM case (Drammer), it may be better to stop the exploit
some other way. Turning off/redesigning GPU acceleration should work
there, right?

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 14:18                   ` Peter Zijlstra
@ 2016-10-28 18:30                     ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-28 18:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 2089 bytes --]

On Fri 2016-10-28 16:18:40, Peter Zijlstra wrote:
> On Fri, Oct 28, 2016 at 03:05:22PM +0100, Mark Rutland wrote:
> > 
> > > > * the precise semantics of performance counter events varies drastically
> > > >   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
> > > >   one particular level of cache, and/or may not be implemented on all
> > > >   cores.
> > > 
> > > If it maps to one particular cache level, we are fine (or maybe will
> > > trigger protection too often). If some cores are not counted, that's bad.
> > 
> > Perhaps, but that depends on a number of implementation details. If "too
> > often" means "all the time", people will turn this off when they could
> > otherwise have been protected (e.g. if we can accurately monitor the
> > last level of cache).
> 
> Right, so one of the things mentioned in the paper is x86 NT stores.
> Those are not cached and I'm not at all sure they're accounted in the
> event we use for cache misses.

Would you (or someone) have a pointer to a good documentation source on
available performance counters?

Rowhammer is normally done using reads (not writes), exploiting the fact
that you can modify memory just by reading it. But it may be possible
that writes have a similar effect, and that attacker cells can be far
enough from victim cells that it is a problem.

MOVNTDQA could be another problem, but hopefully that happens only on
memory types userland does not have access to.

Hmm, and according to a short test, movnt is not counted:

pavel@duo:/data/l/linux/tools$ sudo perf_3.16 stat --event=cache-misses ./a.out
^C./a.out: Interrupt

 Performance counter stats for './a.out':

            61,271      cache-misses

      11.605840031 seconds time elapsed

long long foo;

int main(void)
{
	foo = (long long)&foo;	/* foo holds its own address */
	while (1) {
		/* non-temporal store, bypassing the cache */
		asm volatile(
			"mov foo, %edi\n\t"
			"movnti %eax, (%edi)");
	}
}
							

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 18:30                     ` Pavel Machek
@ 2016-10-28 18:48                       ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-10-28 18:48 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Mark Rutland, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

On Fri, Oct 28, 2016 at 08:30:14PM +0200, Pavel Machek wrote:
> Would you (or someone) have pointer to good documentation source on
> available performance counters?

The Intel SDM has a section on them, and the AMD BIOS and Kernel
Developer's Guide does too.

That is, they contain lists of available counters for the various parts
from these vendors and that's pretty much all there is.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 17:27                   ` Pavel Machek
@ 2016-10-29 13:06                     ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-10-29 13:06 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Pavel Machek, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

I think that this idea to mitigate Rowhammer is not a good approach.

I wrote Rowhammer.js (we published a paper on that) and I had the first 
reproducible bit flips on DDR4 at both increased and default refresh 
rates (published in our DRAMA paper).

We have researched the number of cache misses induced by different 
applications in the past, and there are many applications that cause more 
cache misses than Rowhammer (published in our Flush+Flush paper); they 
just cause them on different rows.
Slowing down a system surely works, but you could also, as a mitigation 
just make this CPU core run at the lowest possible frequency. That would 
likely be more effective than the solution you suggest.

Now, every Rowhammer attack exploits not only the DRAM effects but also 
the way the operating system organizes memory.

Some papers exploit page deduplication and disabling page deduplication 
should be the default also for other reasons, such as information 
disclosure attacks. If page deduplication is disabled, attacks like 
Dedup est Machina and Flip Feng Shui are inherently not possible anymore.

Most other attacks target page tables (the Google exploit, Rowhammer.js, 
Drammer). Now in Rowhammer.js we suggested a very simple fix, that is 
just an extension of what Linux already does.
Unless out of memory, page tables and user pages are not placed in the 
same 2MB region. We suggested that this behavior should be more strict 
even in memory pressure situations. If the OS can only find a page table 
that resides in the same 2MB region as a user page, the request should 
fail instead and the process requesting it should go out of memory. More 
generally, the attack surface is gone if the OS never places a page 
table in proximity of less than 2MB to a user page.
That is a simple fix that does not cost any runtime performance. It 
mitigates all these scary attacks and won't even incur a memory cost in 
most situations.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 13:06                     ` Daniel Gruss
@ 2016-10-29 19:42                       ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-29 19:42 UTC (permalink / raw)
  To: Daniel Gruss
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

[-- Attachment #1: Type: text/plain, Size: 3444 bytes --]

Hi!

> I think that this idea to mitigate Rowhammer is not a good approach.

Well.. it does not have to be good if it is the best we have.

> I wrote Rowhammer.js (we published a paper on that) and I had the first
> reproducible bit flips on DDR4 at both, increased and default refresh rates
> (published in our DRAMA paper).

Congratulations. Now I'd like to take away your toys :-).

> We have researched the number of cache misses induced from different
> applications in the past and there are many applications that cause more
> cache misses than Rowhammer (published in our Flush+Flush paper) they just
> cause them on different rows.
> Slowing down a system surely works, but you could also, as a mitigation just
> make this CPU core run at the lowest possible frequency. That would likely
> be more effective than the solution you suggest.

Not in my testing. First, I'm not at all sure the lowest CPU speed would
make any difference at all (even a CPU at its lowest clock is way faster
than DRAM). Second, going to the lowest clock speed will reduce
performance anyway.

[But if you can test it and it works... it would be nice to know. It
is very simple to implement w/o kernel changes.] 

> Now, every Rowhammer attack exploits not only the DRAM effects but also the
> way the operating system organizes memory.
> 
> Some papers exploit page deduplication and disabling page deduplication
> should be the default also for other reasons, such as information disclosure
> attacks. If page deduplication is disabled, attacks like Dedup est Machina
> and Flip Feng Shui are inherently not possible anymore.

No, sorry, not going to play this particular whack-a-mole game. Linux
is designed for working hardware, and with bit flips, something is
going to break. (Does Flip Feng Shui really depend on dedup?) 

> Most other attacks target page tables (the Google exploit, Rowhammer.js,
> Drammer). Now in Rowhammer.js we suggested a very simple fix, that is just
> an extension of what Linux already does.
> Unless out of memory page tables and user pages are not placed in the same
> 2MB region. We suggested that this behavior should be more strict even in
> memory pressure situations. If the OS can only find a page table that
> resides in the same 2MB region as a user page, the request should fail
> instead and the process requesting it should go out of memory. More
> generally, the attack surface is gone if the OS never places a page table in
> proximity of less than 2MB to a user page.

But it will be nowhere near a complete fix, right?

Will fix user attacking kernel, but not user1 attacking user2. You
could put each "user" into separate 2MB region, but then you'd have to
track who needs to go where. (Same uid is not enough, probably "can
ptrace"?)

But more importantly....

That'll still let a remote server gain the permissions of a local user
running a web browser... using a javascript exploit, right? And that's
actually the attack that I find most scary. A local user to root exploit
is bad, but getting the permissions of the web browser from a remote web
server is very, very, very bad.

> That is a simple fix that does not cost any runtime performance.

Simple? Not really, I'm afraid. Feel free to try to implement it.

Best regards,

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 19:42                       ` Pavel Machek
@ 2016-10-29 20:05                         ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-10-29 20:05 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On 29.10.2016 21:42, Pavel Machek wrote:
> Congratulations. Now I'd like to take away your toys :-).

I would like you to do that, but I'm very confident you won't be 
successful the way you're starting ;)

> Not in my testing.

Have you tried music/video reencoding? Games? Anything that works with a 
decent amount of memory but not too much hard disk i/o?
The numbers are very clear there...

> First, I'm not at all sure lowest CPU speed would
> make any difference at all

It would. I've seen many bit flips, but none where the CPU operated in the 
lower frequency range.

> Second, going to lowest clock speed will reduce performance

As does the countermeasure you propose...

> No, sorry, not going to play this particular whack-a-mole game.

But you are already with the countermeasure you propose...

> Linux is designed for working hardware, and with bit flips, something is
> going to break. (Does Flip Feng Shui really depend on dedup?)

Deduplication should be disabled not because of bit flips but because of 
information leakage (deduplication attacks, cache side-channel attacks, ...)

Yes, Flip Feng Shui requires deduplication and does not work without.
Disabling deduplication is what the authors recommend as a countermeasure.

> But it will be nowhere near complete fix, right?
>
> Will fix user attacking kernel, but not user1 attacking user2. You
> could put each "user" into separate 2MB region, but then you'd have to
> track who needs go go where. (Same uid is not enough, probably "can
> ptrace"?)

Exactly. But preventing user2kernel is already a good start, and you 
would prevent that without any doubt and without any cost.

user2user is something else to think about and more complicated because 
you have shared libraries + copy on write --> same problems as 
deduplication. I think it might make sense to discuss whether separating 
by uids or even pids would be viable.

> That'll still let remote server gain permissions of local user running
> web server... using javascript exploit right?  And that's actually
> attack that I find most scary. Local user to root exploit is bad, but
> getting permissions of web browser from remote web server is very,
> very, very bad.

Rowhammer.js skips the browser... it goes from JS to full phys. memory 
access. Anyway, preventing Rowhammer from JS should be easy because even 
the slightest slow down should be enough to prevent any Rowhammer attack 
from JS.

>> That is a simple fix that does not cost any runtime performance.
>
> Simple? Not really, I'm afraid. Feel free to try to implement it.

I had a student who already implemented this in another OS, I'm 
confident it can be done in Linux as well...


Cheers,
Daniel

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 20:05                         ` Daniel Gruss
  (?)
@ 2016-10-29 20:14                         ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-10-29 20:14 UTC (permalink / raw)
  To: kernel-hardening

On 29.10.2016 22:05, Daniel Gruss wrote:
> I'm very confident you're not successful the way you['re] starting ;)

Just to give a reason for that: on many systems you can get bit flips 
with far fewer than 1 or 2 million cache misses.
https://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf reports as 
few as 139K accesses (-> cache misses) being sufficient on one of 
their systems, and on one of my systems I can get a bit flip with as 
few as 78K cache misses (double-sided hammering).

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 20:05                         ` Daniel Gruss
@ 2016-10-29 21:05                           ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-29 21:05 UTC (permalink / raw)
  To: Daniel Gruss
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

[-- Attachment #1: Type: text/plain, Size: 4032 bytes --]

Hi!

On Sat 2016-10-29 22:05:16, Daniel Gruss wrote:
> On 29.10.2016 21:42, Pavel Machek wrote:
> >Congratulations. Now I'd like to take away your toys :-).
> 
> I'm would like you to do that, but I'm very confident you're not successful
> the way your starting ;)

:-). Let's see.

> >Not in my testing.
> 
> Have you tried music/video reencoding? Games? Anything that works with a
> decent amount of memory but not too much hard disk i/o?
> Numbers are very clear there...

So far I did bzip2 and kernel compilation. I believe I can prevent
flips in rowhammer-test with bzip2 going from 4 seconds to 5
seconds... let me see.

If you have a simple test that you'd like me to try, speak up. Best if
it takes about 10 seconds to run.

> >First, I'm not at all sure lowest CPU speed would
> >make any difference at all
> 
> It would. I've seen many bitflips but none where the CPU operated in the
> lower frequency range.

Ok, let me try that. Problem is that the machine I'm testing on takes
20 minutes to produce bit flip...

> >Second, going to lowest clock speed will reduce performance
> 
> As does the countermeasure you propose...

Yes. But hopefully not quite _as_ drastically. (going to lowest clock
would make bzip2 go from 4 to 12 seconds or so, right?)

> Yes, Flip Feng Shui requires deduplication and does not work without.
> Disabling deduplication is what the authors recommend as a
> countermeasure.

Ok, Flip Feng Shui is easy, then. :-).

> >But it will be nowhere near complete fix, right?
> >
> >Will fix user attacking kernel, but not user1 attacking user2. You
> >could put each "user" into separate 2MB region, but then you'd have to
> >track who needs to go where. (Same uid is not enough, probably "can
> >ptrace"?)
> 
> Exactly. But preventing user2kernel is already a good start, and you would
> prevent that without any doubt and without any cost.

Well, it is only a good start if the result is mergeable and can be
used to prevent all the attacks we care about.

> >That'll still let remote server gain permissions of local user running
> >web server... using javascript exploit right?  And that's actually
> >attack that I find most scary. Local user to root exploit is bad, but
> >getting permissions of web browser from remote web server is very,
> >very, very bad.
> 
> Rowhammer.js skips the browser... it goes JS to full phys. memory access.
> Anyway, preventing Rowhammer from JS should be easy because even the
> slightest slow down should be enough to prevent any Rowhammer attack from
> JS.

Are you sure? How much slowdown is enough to prevent the attack? (And
can I get patched chromium? Patched JVM? Patched qemu?) Dunno... are
only just-in-time compilers affected? Or can I get, for example, a pdf
document that does all the wrong memory accesses during rendering,
triggering a buffer overrun in xpdf and arbitrary code execution?

Running userland on non-working machine is scary :-(.

Shall we introduce new syscall "get_mandatory_jit_slowdown()"?

I'd like a kernel patch that works around the rowhammer problem... in
the kernel. I'm willing to accept some slowdown (say from 4 to 6
seconds for common tasks). I'd prefer the solution to be contained in
the kernel, presenting a working (but slower) machine to userspace. I
believe I can do that.

> >>That is a simple fix that does not cost any runtime performance.
> >
> >Simple? Not really, I'm afraid. Feel free to try to implement it.
> 
> I had a student who already implemented this in another OS, I'm confident it
> can be done in Linux as well...

Well, I'm not saying it's impossible. But I'd like to see the
implementation. It's definitely more work than nohammer.c. An order of
magnitude more, at least.

But yes, it will help with side channel attacks, etc. So yes, I'd like
to see the patch.

Best regards,

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 21:05                           ` Pavel Machek
@ 2016-10-29 21:07                             ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-10-29 21:07 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On 29.10.2016 23:05, Pavel Machek wrote:
> So far I did bzip2 and kernel compilation. I believe I can prevent
> flips in rowhammer-test with bzip2 going from 4 seconds to 5
> seconds... let me see.

can you prevent bitflips in this one? 
https://github.com/IAIK/rowhammerjs/tree/master/native

> Ok, let me try that. Problem is that the machine I'm testing on takes
> 20 minutes to produce bit flip...

will be lots faster with my code above ;)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 21:07                             ` Daniel Gruss
@ 2016-10-29 21:45                               ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-29 21:45 UTC (permalink / raw)
  To: Daniel Gruss
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

[-- Attachment #1: Type: text/plain, Size: 1569 bytes --]

On Sat 2016-10-29 23:07:59, Daniel Gruss wrote:
> On 29.10.2016 23:05, Pavel Machek wrote:
> >So far I did bzip2 and kernel compilation. I believe I can prevent
> >flips in rowhammer-test with bzip2 going from 4 seconds to 5
> >seconds... let me see.
> 
> can you prevent bitflips in this one?
> https://github.com/IAIK/rowhammerjs/tree/master/native

Thanks for the pointer. Unfortunately, my test machine has 64-bit
kernel, but 32-bit userland, so I can't compile it:

g++ -g -pthread -std=c++11 -O3 -o rowhammer rowhammer.cc
rowhammer.cc: In function ‘int main(int, char**)’:
rowhammer.cc:243:57: error: inconsistent operand constraints in an ‘asm’
   asm volatile ("rdtscp" : "=a" (a), "=d" (d) : : "rcx");

I tried g++ -m64, but that does not seem to work here at all. I'll try
to find some way to compile it during the week.

(BTW any idea which version would be right for this cpu?

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
stepping        : 10

It's Wolfdale-3M according to Wikipedia... that seems older than
ivy/sandy/haswell/skylake, so I'll just use the generic version...?)

> >Ok, let me try that. Problem is that the machine I'm testing on takes
> >20 minutes to produce bit flip...
> 
> will be lots faster with my code above ;)

Yes, that will help :-).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 21:45                               ` Pavel Machek
@ 2016-10-29 21:49                                 ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-10-29 21:49 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On 29.10.2016 23:45, Pavel Machek wrote:
> ivy/sandy/haswell/skylake, so I'll just use the generic version...?)

Yes, generic might work, but I never tested it on anything that old...

On my system I have >30 bit flips per second (Ivy Bridge i5-3xxx) with 
the rowhammer-ivy test... sometimes even more than 100 per second...

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 21:49                                 ` Daniel Gruss
@ 2016-10-29 22:01                                   ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-29 22:01 UTC (permalink / raw)
  To: Daniel Gruss
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

[-- Attachment #1: Type: text/plain, Size: 692 bytes --]

On Sat 2016-10-29 23:49:57, Daniel Gruss wrote:
> On 29.10.2016 23:45, Pavel Machek wrote:
> >ivy/sandy/haswell/skylake, so I'll just use the generic version...?)
> 
> yes, generic might work, but i never tested it on anything that old...
> 
> on my system i have >30 bit flips per second (ivy bridge i5-3xxx) with the
> rowhammer-ivy test... sometimes even more than 100 per second...

Hmm, maybe I'm glad I don't have a new machine :-).

I assume you still get _some_ bitflips with generic "rowhammer"?

Best regards,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-29 22:01                                   ` Pavel Machek
@ 2016-10-29 22:02                                     ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-10-29 22:02 UTC (permalink / raw)
  To: Pavel Machek
  Cc: kernel-hardening, Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On 30.10.2016 00:01, Pavel Machek wrote:
> Hmm, maybe I'm glad I don't have a new machine :-).
>
> I assume you still get _some_ bitflips with generic "rowhammer"?

1 or 2 every 20-30 minutes...

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 14:05                 ` Mark Rutland
@ 2016-10-31  8:27                   ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-31  8:27 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 1917 bytes --]

Hi!

> On Fri, Oct 28, 2016 at 01:21:36PM +0200, Pavel Machek wrote:
> > > Has this been tested on a system vulnerable to rowhammer, and if so, was
> > > it reliable in mitigating the issue?
> > > 
> > > Which particular attack codebase was it tested against?
> > 
> > I have rowhammer-test here,
> > 
> > commit 9824453fff76e0a3f5d1ac8200bc6c447c4fff57
> > Author: Mark Seaborn <mseaborn@chromium.org>
> 
> ... from which repo?
> 
> > I do not have vulnerable machine near me, so no "real" tests, but
> > I'm pretty sure it will make the error no longer reproducible with the
> > newer version. [Help welcome ;-)]
> 
> Even if we hope this works, I think we have to be very careful with that
> kind of assertion. Until we have data as to its efficacy, I don't think
> we should claim that this is an effective mitigation.

Ok, so it turns out I was right. On my vulnerable machine, the bug is
normally reproducible in fewer than 500 iterations:

Iteration 432 (after 1013.31s)
  error at 0xda7cf280: got 0xffffffffffffffef
Iteration 446 (after 1102.56s)
  error at 0xec21ea00: got 0xffffffefffffffff
Iteration 206 (after 497.50s)
  error at 0xd07d1438: got 0xffffffffffffffdf
Iteration 409 (after 1350.96s)
  error at 0xbd3b9108: got 0xefffffffffffffff
Iteration 120 (after 326.08s)
  error at 0xe398c438: got 0xffffffffffffffdf

With nohammer, I'm at 2300 iterations, and still no faults.

Daniel Gruss <daniel@gruss.cc> claims he has an attack that can do 30
flips a second on modern hardware. I'm not going to buy broken
hardware just for a test. Code is at
https://github.com/IAIK/rowhammerjs/tree/master/native . Would someone
be willing to get it running on vulnerable machine and test kernel
patches?

Thanks,

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-31  8:27                   ` Pavel Machek
@ 2016-10-31 14:47                     ` Mark Rutland
  -1 siblings, 0 replies; 79+ messages in thread
From: Mark Rutland @ 2016-10-31 14:47 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

On Mon, Oct 31, 2016 at 09:27:05AM +0100, Pavel Machek wrote:
> > On Fri, Oct 28, 2016 at 01:21:36PM +0200, Pavel Machek wrote:
> > > > Has this been tested on a system vulnerable to rowhammer, and if so, was
> > > > it reliable in mitigating the issue?

> > > I do not have a vulnerable machine near me, so no "real" tests, but
> > > I'm pretty sure the newer version will make the error no longer
> > > reproducible. [Help welcome ;-)]
> > 
> > Even if we hope this works, I think we have to be very careful with that
> > kind of assertion. Until we have data as to its efficacy, I don't think
> > we should claim that this is an effective mitigation.
> 
> Ok, so it turns out I was right. On my vulnerable machine, the bug is
> normally reproducible in less than 500 iterations:

> With nohammer, I'm at 2300 iterations, and still no faults.

To be quite frank, this is anecdotal. It only shows one particular attack is
made slower (or perhaps defeated), and doesn't show that the mitigation is
reliable or generally applicable (to other machines or other variants of the
attack).

Even if this happens to work on some machines, I still do not think one can
sell this as a generally applicable and reliable mitigation. Especially given
that others working in this area seem to have evidence otherwise, e.g. [1] (as
noted by spender in the LWN comments).

Thanks,
Mark.

[1] https://twitter.com/halvarflake/status/792314613568311296

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-31 14:47                     ` Mark Rutland
@ 2016-10-31 21:13                       ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-10-31 21:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 2979 bytes --]

On Mon 2016-10-31 14:47:39, Mark Rutland wrote:
> On Mon, Oct 31, 2016 at 09:27:05AM +0100, Pavel Machek wrote:
> > > On Fri, Oct 28, 2016 at 01:21:36PM +0200, Pavel Machek wrote:
> > > > > Has this been tested on a system vulnerable to rowhammer, and if so, was
> > > > > it reliable in mitigating the issue?
> 
> > > > I do not have a vulnerable machine near me, so no "real" tests, but
> > > > I'm pretty sure the newer version will make the error no longer
> > > > reproducible. [Help welcome ;-)]
> > > 
> > > Even if we hope this works, I think we have to be very careful with that
> > > kind of assertion. Until we have data as to its efficacy, I don't think
> > > we should claim that this is an effective mitigation.
...
> 
> To be quite frank, this is anecdotal. It only shows one particular attack is
> made slower (or perhaps defeated), and doesn't show that the mitigation is
> reliable or generally applicable (to other machines or other variants of the
> attack).

So... I said that I'm pretty sure it will fix the problem in my testing,
then you say that I should be careful with my words, I confirm it was
true, and now you complain that it is anecdotal?

Are you serious?

Of course I know that fixing rowhammer-test on my machine is quite a
low bar to clear. _And that's also why I said I'm pretty sure I'd pass
that bar_.

I'm still asking for help with testing, but all you do is claim that
"we can't be sure".

> Even if this happens to work on some machines, I still do not think one can
> sell this as a generally applicable and reliable mitigation. Especially given
> that others working in this area seem to have evidence otherwise, e.g. [1] (as
> noted by spender in the LWN comments).

Slowing this attack _is_ defeating it. It is enough to slow it 8
times, and it is gone, boom, not there any more.

Now... I have to figure out what to do with movnt. No currently known
attack uses movnt. Still, that one should be solved.

Other than that... this is not magic. The attack is quite well
understood. All you have to do is prevent more than 8 msec worth of
memory accesses. My patch can do that, and it will work,
everywhere... you just won't like the fact that your machine now runs
at 10% of its original performance.

Now, it is possible that researchers will come up with an attack that
only needs 2 msec worth of accesses. So we change the constants.
Performance will be even worse. It is also possible that even more
broken DRAM comes out. Same solution. Plus someone certainly has
memory that flips some bits even without help from funny access
patterns. Too bad. We can't help them.

Would it be less confusing if we redefined the task description from
"prevent rowhammer" to "prevent more than X memory accesses in 64
msec"?

Best regards,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-31 21:13                       ` Pavel Machek
@ 2016-10-31 22:09                         ` Mark Rutland
  -1 siblings, 0 replies; 79+ messages in thread
From: Mark Rutland @ 2016-10-31 22:09 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kees Cook, Peter Zijlstra, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

On Mon, Oct 31, 2016 at 10:13:03PM +0100, Pavel Machek wrote:
> On Mon 2016-10-31 14:47:39, Mark Rutland wrote:
> > On Mon, Oct 31, 2016 at 09:27:05AM +0100, Pavel Machek wrote:
> > > > On Fri, Oct 28, 2016 at 01:21:36PM +0200, Pavel Machek wrote:
> > > > > > Has this been tested on a system vulnerable to rowhammer, and if so, was
> > > > > > it reliable in mitigating the issue?
> > 
> > > > > I do not have a vulnerable machine near me, so no "real" tests, but
> > > > > I'm pretty sure the newer version will make the error no longer
> > > > > reproducible. [Help welcome ;-)]
> > > > 
> > > > Even if we hope this works, I think we have to be very careful with that
> > > > kind of assertion. Until we have data as to its efficacy, I don't think
> > > > we should claim that this is an effective mitigation.
> ...
> > 
> > To be quite frank, this is anecdotal. It only shows one particular attack is
> > made slower (or perhaps defeated), and doesn't show that the mitigation is
> > reliable or generally applicable (to other machines or other variants of the
> > attack).
> 
> So... I said that I'm pretty sure it will fix the problem in my testing,
> then you say that I should be careful with my words, I confirm it was
> true, and now you complain that it is anecdotal?

Clearly I have chosen my words poorly here. I believe that this may help
against some attacks on some machines and workloads, and I believe your results
for your machine.

My main concern was that this appears to be described as a general solution, as
in the Kconfig text:

	Enable rowhammer attack prevention. Will degrade system
	performance under attack so much that attack should not
	be feasible.

... yet there are a number of reasons why this may not be the case given varied
attack mechanisms (e.g. using non-cacheable mappings, movnt, etc), given some
hardware configurations (e.g. "large" SMP machines or where timing is
marginal), given some workloads may incidentally trip often enough to be
severely penalised, and given that performance counter support is sufficiently
varied (across architectures, CPU implementations, and even boards using the
same CPU if one considers things like interrupt routing).

Given that, I think that makes an overly strong, and perhaps misleading,
claim (i.e. people could turn the option on and believe that they are
protected when they are not, leaving them worse off). It isn't really
possible to fail gracefully here, and even if this is suitable for some
hardware, very few people are in a position to determine whether their
hardware falls into that category.

Unfortunately, I do not believe that there is a simple and/or general software
mitigation.

> Would it be less confusing if we redefined the task description from
> "prevent rowhammer" to "prevent more than X memory accesses in 64
> msec"?

Definitely. Quantifying exactly what you're trying to defend against (and
therefore what you are not) would help to address at least one of my concerns.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-31  8:27                   ` Pavel Machek
@ 2016-11-01  6:33                     ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2016-11-01  6:33 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin, kernel-hardening


* Pavel Machek <pavel@ucw.cz> wrote:

> I'm not going to buy broken hardware just for a test.

Can you suggest a method to find heavily rowhammer-affected hardware? Only by 
testing it, or are there some chipset ID ranges or dmidecode info that will 
pinpoint potentially affected machines?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-11-01  6:33                     ` Ingo Molnar
  (?)
@ 2016-11-01  7:20                     ` Daniel Micay
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Micay @ 2016-11-01  7:20 UTC (permalink / raw)
  To: kernel-hardening, Pavel Machek
  Cc: Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

[-- Attachment #1: Type: text/plain, Size: 1294 bytes --]

On Tue, 2016-11-01 at 07:33 +0100, Ingo Molnar wrote:
> * Pavel Machek <pavel@ucw.cz> wrote:
> 
> > I'm not going to buy broken hardware just for a test.
> 
> Can you suggest a method to find heavily rowhammer affected hardware?
> Only by 
> testing it, or are there some chipset IDs ranges or dmidecode info
> that will 
> pinpoint potentially affected machines?
> 
> Thanks,
> 
> 	Ingo

You can read the memory timing values, but you can't know if they're
reasonable for that hardware. Higher-quality memory can have better
timings without being broken. The only relevant information would be the
memory model, combined with an expensive, time-consuming effort to
build a blacklist based on testing. It doesn't seem realistic, unless
it's done in a coarse way based on brand and date information.

I don't know how to get this data on Linux. The CPU-Z tool for Windows
knows how to obtain it but it's based on a proprietary library.

You definitely don't need to buy broken hardware to test a broken
hardware setup though. You just need a custom computer build where
motherboards expose the memory timing configuration. You can make it
more vulnerable by raising the refresh period (tREF). I wanted to play
around with that but haven't gotten around to it.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-11-01  6:33                     ` Ingo Molnar
  (?)
  (?)
@ 2016-11-01  7:53                     ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-11-01  7:53 UTC (permalink / raw)
  To: kernel-hardening, Pavel Machek
  Cc: Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On 01.11.2016 07:33, Ingo Molnar wrote:
> Can you suggest a method to find heavily rowhammer affected hardware? Only by
> testing it, or are there some chipset IDs ranges or dmidecode info that will
> pinpoint potentially affected machines?

I have worked with many different systems, both running rowhammer 
attacks and testing defense mechanisms. So far, every Ivy Bridge i5 
(DDR3) that I have had access to was susceptible to bit flips - you 
will have the highest chance with an Ivy Bridge i5...

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-11-01  6:33                     ` Ingo Molnar
@ 2016-11-01  8:10                       ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-11-01  8:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 1136 bytes --]

Hi!

> * Pavel Machek <pavel@ucw.cz> wrote:
> 
> > I'm not going to buy broken hardware just for a test.
> 
> Can you suggest a method to find heavily rowhammer affected hardware? Only by 
> testing it, or are there some chipset IDs ranges or dmidecode info that will 
> pinpoint potentially affected machines?

Testing can be used: https://github.com/mseaborn/rowhammer-test.git
finds faults on 1 of 2 machines here (but takes half an
hour). Then, if your hardware is one of ivy/sandy/haswell/skylake,
https://github.com/IAIK/rowhammerjs.git can be used for a much faster
attack (many flips a second).

Unfortunately, what I have here is:

cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
stepping	: 10
microcode	: 0xa07

so rowhammerjs/native is not available for this system. The bit mapping
for the memory hash functions would need to be reverse engineered for a
more effective attack.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-11-01  8:10                       ` Pavel Machek
  (?)
@ 2016-11-01  8:13                       ` Daniel Gruss
  -1 siblings, 0 replies; 79+ messages in thread
From: Daniel Gruss @ 2016-11-01  8:13 UTC (permalink / raw)
  To: kernel-hardening, Ingo Molnar
  Cc: Mark Rutland, Kees Cook, Peter Zijlstra,
	Arnaldo Carvalho de Melo, kernel list, Ingo Molnar,
	Alexander Shishkin

On 01.11.2016 09:10, Pavel Machek wrote:
> cpu family     : 6
> model	       	 : 23
> model name	 : Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
> stepping	 : 10
> microcode	 : 0xa07
>
> so rowhammerjs/native is not available for this system. Bit mapping
> for memory hash functions would need to be reverse engineered for more
> effective attack.

By coincidence, we wrote a tool to do that in software: 
https://github.com/IAIK/drama ;)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
  2016-10-28 14:18                   ` Peter Zijlstra
@ 2016-11-02 18:13                     ` Pavel Machek
  -1 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-11-02 18:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 2023 bytes --]

Hi!

> On Fri, Oct 28, 2016 at 03:05:22PM +0100, Mark Rutland wrote:
> > 
> > > > * the precise semantics of performance counter events varies drastically
> > > >   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
> > > >   one particular level of cache, and/or may not be implemented on all
> > > >   cores.
> > > 
> > > If it maps to one particular cache level, we are fine (or maybe will
> > > trigger protection too often). If some cores are not counted, that's bad.
> > 
> > Perhaps, but that depends on a number of implementation details. If "too
> > often" means "all the time", people will turn this off when they could
> > otherwise have been protected (e.g. if we can accurately monitor the
> > last level of cache).
> 
> Right, so one of the things mentioned in the paper is x86 NT stores.
> Those are not cached and I'm not at all sure they're accounted in the
> event we use for cache misses.

Well, I tried this... and movnti is as fast as a plain mov. Clearly
it is being cached here.

I guess we could switch to a different performance counter, such as

+       [PERF_COUNT_HW_BUS_CYCLES]              = 0xc06f, /* Non-halted bus cycles: 0x013c */

if NT stores are indeed a problem. But so far I don't have any
indication they are, so I'd like to have a working example to test
against. (It does not have to produce bitflips; it would be enough to
produce enough memory traffic bypassing the cache.)

Best regards,
								Pavel

/*
 * gcc -O2 rowhammer.c -o rowhammer
 */

char pad[1024];
long long foo;
char pad2[1024];

int main(void)
{
	long long i;

	/* Flush the target line once before the loop. */
	asm volatile(
		"mov $foo, %%edi \n\
		clflush (%%edi)" ::: "%edi", "memory");

	for (i = 0; i < 1000000000; i++) {
#if 1
		/* NT store to the same address on every iteration. */
		asm volatile(
			"mov $foo, %%edi \n\
			movnti %%eax, (%%edi)" ::: "%edi", "memory");
#endif

		/* asm volatile(""); */
	}
	return 0;
}


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]
@ 2016-11-02 18:13                     ` Pavel Machek
  0 siblings, 0 replies; 79+ messages in thread
From: Pavel Machek @ 2016-11-02 18:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Kees Cook, Arnaldo Carvalho de Melo, kernel list,
	Ingo Molnar, Alexander Shishkin, kernel-hardening

[-- Attachment #1: Type: text/plain, Size: 2023 bytes --]

Hi!

> On Fri, Oct 28, 2016 at 03:05:22PM +0100, Mark Rutland wrote:
> > 
> > > > * the precise semantics of performance counter events varies drastically
> > > >   across implementations. PERF_COUNT_HW_CACHE_MISSES, might only map to
> > > >   one particular level of cache, and/or may not be implemented on all
> > > >   cores.
> > > 
> > > If it maps to one particular cache level, we are fine (or maybe will
> > > trigger protection too often). If some cores are not counted, that's bad.
> > 
> > Perhaps, but that depends on a number of implementation details. If "too
> > often" means "all the time", people will turn this off when they could
> > otherwise have been protected (e.g. if we can accurately monitor the
> > last level of cache).
> 
> Right, so one of the things mentioned in the paper is x86 NT stores.
> Those are not cached and I'm not at all sure they're accounted in the
> event we use for cache misses.

Well, I tried this... and the movnti is as fast as plain mov. Clearly
it is being cached here.

I guess we could switch to different performance counter, such as

+       [PERF_COUNT_HW_BUS_CYCLES]              = 0xc06f, /* Non
halted bus cycles: 0x013c */

if NT stores are indeed a problem. But so far I don't have any
indication they are, so I'd like to have an working example to test
against. (It does not have to produce bitflips, it would be enough to
produce enough memory traffic bypassing cache.)

Best regards,
								Pavel

/*
 * gcc -O2 rowhammer.c -o rowhammer
 */

char pad[1024];
long long foo;
char pad2[1024];

void main(void)
{
	long long i;
	asm volatile(
		"mov $foo, %%edi \n\
			clflush (%%edi)" ::: "%edi");
	
	for (i=0; i<1000000000; i++) {
#if 1
		asm volatile(
			"mov $foo, %%edi \n\
			movnti %%eax, (%%edi)" ::: "%edi");
#endif

		// asm volatile( "" );
	}
}



end of thread, other threads:[~2016-11-02 18:13 UTC | newest]

Thread overview: 79+ messages
2016-10-26 20:54 Getting interrupt every million cache misses Pavel Machek
2016-10-27  8:28 ` Peter Zijlstra
2016-10-27  8:46   ` Pavel Machek
2016-10-27  9:15     ` Peter Zijlstra
2016-10-27  9:11   ` Pavel Machek
2016-10-27  9:33     ` Peter Zijlstra
2016-10-27 20:40       ` Kees Cook
2016-10-27 21:27         ` rowhammer protection [was Re: Getting interrupt every million cache misses] Pavel Machek
2016-10-28  7:07           ` Ingo Molnar
2016-10-28  8:50             ` Pavel Machek
2016-10-28  8:59               ` Ingo Molnar
2016-10-28 11:55                 ` Pavel Machek
2016-10-28  9:04               ` Peter Zijlstra
2016-10-28  9:27                 ` Vegard Nossum
2016-10-28  9:35                   ` Ingo Molnar
2016-10-28  9:47                     ` Vegard Nossum
2016-10-28  9:53                     ` Mark Rutland
2016-10-28 11:27                 ` Pavel Machek
2016-10-28  9:51           ` Mark Rutland
2016-10-28 11:21             ` Pavel Machek
2016-10-28 14:05               ` Mark Rutland
2016-10-28 14:18                 ` Peter Zijlstra
2016-10-28 18:30                   ` Pavel Machek
2016-10-28 18:48                     ` Peter Zijlstra
2016-11-02 18:13                   ` Pavel Machek
2016-10-28 17:27                 ` Pavel Machek
2016-10-29 13:06                   ` Daniel Gruss
2016-10-29 19:42                     ` Pavel Machek
2016-10-29 20:05                       ` Daniel Gruss
2016-10-29 20:14                         ` Daniel Gruss
2016-10-29 21:05                         ` Pavel Machek
2016-10-29 21:07                           ` Daniel Gruss
2016-10-29 21:45                             ` Pavel Machek
2016-10-29 21:49                               ` Daniel Gruss
2016-10-29 22:01                                 ` Pavel Machek
2016-10-29 22:02                                   ` Daniel Gruss
2016-10-31  8:27                 ` Pavel Machek
2016-10-31 14:47                   ` Mark Rutland
2016-10-31 21:13                     ` Pavel Machek
2016-10-31 22:09                       ` Mark Rutland
2016-11-01  6:33                   ` Ingo Molnar
2016-11-01  7:20                     ` Daniel Micay
2016-11-01  7:53                     ` Daniel Gruss
2016-11-01  8:10                     ` Pavel Machek
2016-11-01  8:13                       ` Daniel Gruss