* VDSO and dcache aliasing
@ 2018-08-28 16:02 Alexandre Belloni
  2018-08-30 18:01 ` [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases Paul Burton
  0 siblings, 1 reply; 9+ messages in thread
From: Alexandre Belloni @ 2018-08-28 16:02 UTC (permalink / raw)
  To: James Hogan; +Cc: Paul Burton, linux-mips, linux-kernel

Hello James,

A year ago, you wrote this patch:

https://www.linux-mips.org/archives/linux-mips/2017-06/msg00658.html

You called it a hack but it has been in use since then. As you have
certainly realized by now, Ocelot is one of the affected SoCs, so we would
very much like to see this go upstream.

What would be the way forward?

Regards,

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases
  2018-08-28 16:02 VDSO and dcache aliasing Alexandre Belloni
@ 2018-08-30 18:01 ` Paul Burton
  2018-08-31  8:58     ` Rene.Nielsen
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Paul Burton @ 2018-08-30 18:01 UTC (permalink / raw)
  To: Alexandre Belloni, Rene Nielsen, Hauke Mehrtens
  Cc: linux-mips, Paul Burton, James Hogan, stable

When a system suffers from dcache aliasing a user program may observe
stale VDSO data from an aliased cache line. Notably this can break the
expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name
suggests, monotonic.

In order to ensure that users observe updates to the VDSO data page as
intended, align the user mappings of the VDSO data page such that their
cache colouring matches that of the virtual address range which the
kernel will use to update the data page - typically its unmapped address
within kseg0.

This ensures that we don't introduce aliasing cache lines for the VDSO
data page, and therefore that userland will observe updates without
requiring cache invalidation.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Rene Nielsen <rene.nielsen@microsemi.com>
Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.4+
---
Hi Alexandre,

Could you try this out on your Ocelot system? Hopefully it'll solve the
problem just as well as James' patch, without needing the questionable
change to arch_get_unmapped_area_common().

Thanks,
    Paul
---
 arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..5fb617a42335 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -13,6 +13,7 @@
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
+#include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
@@ -20,6 +21,7 @@
 
 #include <asm/abi.h>
 #include <asm/mips-cps.h>
+#include <asm/page.h>
 #include <asm/vdso.h>
 
 /* Kernel-provided data used by the VDSO. */
@@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	vvar_size = gic_size + PAGE_SIZE;
 	size = vvar_size + image->size;
 
+	/*
+	 * Find a region that's large enough for us to perform the
+	 * colour-matching alignment below.
+	 */
+	if (cpu_has_dc_aliases)
+		size += shm_align_mask + 1;
+
 	base = get_unmapped_area(NULL, 0, size, 0, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
 	}
 
+	/*
+	 * If we suffer from dcache aliasing, ensure that the VDSO data page is
+	 * coloured the same as the kernel's mapping of that memory. This
+	 * ensures that when the kernel updates the VDSO data userland will see
+	 * it without requiring cache invalidations.
+	 */
+	if (cpu_has_dc_aliases) {
+		base = __ALIGN_MASK(base, shm_align_mask);
+		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
+	}
+
 	data_addr = base + gic_size;
 	vdso_addr = data_addr + PAGE_SIZE;
 
-- 
2.18.0


* RE: [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases
@ 2018-08-31  8:58     ` Rene.Nielsen
  0 siblings, 0 replies; 9+ messages in thread
From: Rene.Nielsen @ 2018-08-31  8:58 UTC (permalink / raw)
  To: paul.burton, alexandre.belloni, hauke; +Cc: linux-mips, jhogan, stable

[-- Attachment #1: Type: text/plain, Size: 9898 bytes --]

Hi guys,

I have looked a bit more at this issue, found ways to reproduce it, and looked
further at both James Hogan's and Paul Burton's patches.

--------------------------------------oOo--------------------------------------

The Problem
-----------

On our MIPS 24KEc, the dcache is 32 KBytes, 4-way, i.e. 8 KBytes per way. Had
Linux used a page size of 8 KBytes, we wouldn't have this dcache aliasing
problem.

With a 4 Kbytes page size, however, it must be ensured that the color of
user-land pages is the same as the color of kernel-space pages when memory is
shared between the two.

In our case, this means that NOT only bits 11:0 (page-aligned) of the addresses
must be identical, but also bit 12 (making them color-aligned).

In order to expose the problem, we must therefore arrange for the kernel's
mapping of the VDSO data to have a different page color than the user-land
mapping.

When a program loads, the first data that gets allocated is for glibc's loader
(ld-2.27.so in our case), and the next thing that gets allocated is the two
pages needed for VDSO ([vvar] and [vdso]).

Therefore, the page color of the user-space VDSO depends heavily on the size of
the loader's requested data. In my original post
(https://www.linux-mips.org/archives/linux-mips/2017-06/msg00621.html), I wrote
that it started happening when compiling glibc with
'-fasynchronous-unwind-tables'. This may have changed the loader's data size to
go from an even number of pages to an odd number of pages or vice versa, thereby
making the color of the subsequent VDSO user-space mapping likely to be
different from the kernel's.

A change in the Linux kernel may also trigger this by changing the page color
of the address at which 'vdso_data' (declared in arch/mips/kernel/vdso.c)
starts.

For completeness, here's a snippet from the pagemap for some random process:

Section Name    Perm Virt Start Virt End   Virt Size  Phys Size
--------------- ---- ---------- ---------- ---------- ----------
...
                rwxp 0x77d04000 0x77d0e000      40960      36864
[vvar]          r--p 0x77d0e000 0x77d0f000       4096          0
[vdso]          r-xp 0x77d0f000 0x77d10000       4096       4096
/lib/ld-2.27.so r-xp 0x77d10000 0x77d11000       4096       4096
/lib/ld-2.27.so rwxp 0x77d11000 0x77d12000       4096       4096
[stack]         rwxp 0x7ff81000 0x7ffa2000     135168      28672

--------------------------------------oOo--------------------------------------

Modify kernel to provoke the issue
----------------------------------

In order to provoke the problem, we must first figure out whether the color of
'vdso_data' in kernel-space is different from [vvar] in user-space.

The attached patch named 'vdso-chk-1.patch' prints the address of &vdso_data and
the corresponding user-land address, followed by "aligned" if bit 12 is identical
in both and "NOT ALIGNED" if not.

If "aligned" is printed for all started processes, I suggest trying the attached
patch named 'vdso-chk-2.patch'. This will declare a dummy variable in vdso.c
that will cause the linker to place vdso_data at a differently colored page.

--------------------------------------oOo--------------------------------------

Reproduce
---------

Once the kernel reports "NOT ALIGNED", you may want to attempt to provoke the error.
I've attached a program, 'provoke.c', that will print to stderr whenever two
consecutive timestamps are received out of order from the kernel.

To increase the chance of errors occurring, the program must be started
many times in parallel. The following shell command will create 50 simultaneous
instances of it:
    $ for i in $(seq 50); do provoke > /dev/null & done

An example of a snippet of the output when it goes wrong:
    ...
    [   46.926329] tgid = 171, pid = 171, comm =    timeofday: data_addr = 0x77f60000, &vdso_data = 0x80525000, &dummy = 0x80524000 => NOT ALIGNED
    [   46.986344] tgid = 172, pid = 172, comm =    timeofday: data_addr = 0x77126000, &vdso_data = 0x80525000, &dummy = 0x80524000 => NOT ALIGNED
    [   47.070821] tgid = 173, pid = 173, comm =    timeofday: data_addr = 0x7701c000, &vdso_data = 0x80525000, &dummy = 0x80524000 => NOT ALIGNED
    [   47.090460] tgid = 170, pid = 170, comm =    timeofday: data_addr = 0x779c2000, &vdso_data = 0x80525000, &dummy = 0x80524000 => NOT ALIGNED
    [   47.138366] tgid = 174, pid = 174, comm =    timeofday: data_addr = 0x77f60000, &vdso_data = 0x80525000, &dummy = 0x80524000 => NOT ALIGNED
    [   47.166330] tgid = 175, pid = 175, comm =    timeofday: data_addr = 0x77406000, &vdso_data = 0x80525000, &dummy = 0x80524000 => NOT ALIGNED
    tid = 126: Ran 10000 times. error_cnt = 0, success_cnt = 10000
    tid = 130: Ran 10000 times. error_cnt = 0, success_cnt = 10000
    Error: tid = 174: clock_gettime(): Prev = 56043, Cur = 53056, diff = -2987
    Error: tid = 161: clock_gettime(): Prev = 56247, Cur = 53060, diff = -3187
    Error: tid = 168: clock_gettime(): Prev = 56251, Cur = 53064, diff = -3187
    Error: tid = 137: clock_gettime(): Prev = 56255, Cur = 53068, diff = -3187
    Error: tid = 175: clock_gettime(): Prev = 56259, Cur = 53072, diff = -3187
    Error: tid = 129: clock_gettime(): Prev = 56263, Cur = 53076, diff = -3187
    tid = 129: Ran 10000 times. error_cnt = 1, success_cnt = 9999
    Error: tid = 155: clock_gettime(): Prev = 56267, Cur = 53078, diff = -3189
    Error: tid = 165: clock_gettime(): Prev = 56271, Cur = 53080, diff = -3191
    ...

--------------------------------------oOo--------------------------------------

Trying out James Hogan's patch
------------------------------

With the error-producing version of vdso-chk-X.patch applied, apply James'
patch and run the 'provoke' program again.

This works since kernel- and user-space coloring always becomes identical.

--------------------------------------oOo--------------------------------------

Trying out Paul Burton's patch
------------------------------

With the error-producing version of vdso-chk-X.patch applied, apply Paul's patch
and run the 'provoke' program again.

This also works.

Paul's patch allocates twice the amount of needed VM, but I guess that's fine,
as it's also less intrusive (no changes to mmap.c).

Regards,
René Nielsen

-----Original Message-----
From: Paul Burton [mailto:paul.burton@mips.com] 
Sent: 30. august 2018 20:01
To: Alexandre Belloni <alexandre.belloni@bootlin.com>; Rene Nielsen <rene.nielsen@microsemi.com>; Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org; Paul Burton <paul.burton@mips.com>; James Hogan <jhogan@kernel.org>; stable@vger.kernel.org
Subject: [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases


When a system suffers from dcache aliasing a user program may observe stale VDSO data from an aliased cache line. Notably this can break the expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name suggests, monotonic.

In order to ensure that users observe updates to the VDSO data page as intended, align the user mappings of the VDSO data page such that their cache colouring matches that of the virtual address range which the kernel will use to update the data page - typically its unmapped address within kseg0.

This ensures that we don't introduce aliasing cache lines for the VDSO data page, and therefore that userland will observe updates without requiring cache invalidation.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Rene Nielsen <rene.nielsen@microsemi.com>
Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.4+
---
Hi Alexandre,

Could you try this out on your Ocelot system? Hopefully it'll solve the problem just as well as James' patch but doesn't need the questionable change to arch_get_unmapped_area_common().

Thanks,
    Paul
---
 arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..5fb617a42335 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -13,6 +13,7 @@
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
+#include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
@@ -20,6 +21,7 @@

 #include <asm/abi.h>
 #include <asm/mips-cps.h>
+#include <asm/page.h>
 #include <asm/vdso.h>

 /* Kernel-provided data used by the VDSO. */
@@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
        vvar_size = gic_size + PAGE_SIZE;
        size = vvar_size + image->size;

+       /*
+        * Find a region that's large enough for us to perform the
+        * colour-matching alignment below.
+        */
+       if (cpu_has_dc_aliases)
+               size += shm_align_mask + 1;
+
        base = get_unmapped_area(NULL, 0, size, 0, 0);
        if (IS_ERR_VALUE(base)) {
                ret = base;
                goto out;
        }

+       /*
+        * If we suffer from dcache aliasing, ensure that the VDSO data page is
+        * coloured the same as the kernel's mapping of that memory. This
+        * ensures that when the kernel updates the VDSO data userland will see
+        * it without requiring cache invalidations.
+        */
+       if (cpu_has_dc_aliases) {
+               base = __ALIGN_MASK(base, shm_align_mask);
+               base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
+       }
+
        data_addr = base + gic_size;
        vdso_addr = data_addr + PAGE_SIZE;

--
2.18.0


[-- Attachment #2: vdso-chk-1.patch --]
[-- Type: application/octet-stream, Size: 725 bytes --]

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index f9dbfb1..616fef4 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -181,6 +181,17 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	ret = 0;
 
 out:
+	{
+		unsigned long d = data_addr  & 0x00001000LU;
+		unsigned long v = (unsigned long)&vdso_data & 0x00001000LU;
+
+		/* Bit 12 must be identical in user- and kernel-space */
+		printk(KERN_ERR "tgid = %u, pid = %u, comm = %12s: data_addr = "
+			"0x%08lx, &vdso_data = 0x%p => %s\n",
+			current->tgid, current->pid, current->comm, data_addr,
+			&vdso_data, d == v ? "aligned" : "NOT ALIGNED");
+        }
+
 	up_write(&mm->mmap_sem);
 	return ret;
 }

[-- Attachment #3: vdso-chk-2.patch --]
[-- Type: application/octet-stream, Size: 1116 bytes --]

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index f9dbfb1..62952ae 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -24,6 +24,7 @@
 
 /* Kernel-provided data used by the VDSO. */
 static union mips_vdso_data vdso_data __page_aligned_data;
+static union mips_vdso_data dummy     __page_aligned_data;
 
 /*
  * Mapping for the VDSO data/GIC pages. The real pages are mapped manually, as
@@ -181,6 +182,22 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	ret = 0;
 
 out:
+	{
+		unsigned long d = data_addr  & 0x00001000LU;
+		unsigned long v = (unsigned long)&vdso_data & 0x00001000LU;
+
+		/*
+		 * Make sure we use #dummy for something or the linker will
+		 * discard it.
+		 */
+
+		/* Bit 12 must be identical in user- and kernel-space */
+		printk(KERN_ERR "tgid = %u, pid = %u, comm = %12s: data_addr = "
+			"0x%08lx, &vdso_data = 0x%p, &dummy = 0x%p => %s\n",
+			current->tgid, current->pid, current->comm, data_addr,
+			&vdso_data, &dummy, d == v ? "aligned" : "NOT ALIGNED");
+        }
+
 	up_write(&mm->mmap_sem);
 	return ret;
 }

[-- Attachment #4: provoke.c --]
[-- Type: text/plain, Size: 1548 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <signal.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <inttypes.h>

// for i in $(seq 50); do provoke > /dev/null & done

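/* Cleared from the SIGINT handler, hence volatile sig_atomic_t. */
static volatile sig_atomic_t run = 1;

static void ctrl_c_handler(int sig)
{
    (void)sig; /* unused */
    run = 0;
}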

static uint64_t milliseconds(void)
{
    struct timespec time;
    if (clock_gettime(CLOCK_MONOTONIC, &time) == 0) {
        return ((uint64_t)time.tv_sec * 1000ULL) + (time.tv_nsec / 1000000);
    }

    fprintf(stderr, "clock_gettime() failed: %s\n", strerror(errno));
    exit(-1);
}

int main(void)
{
    uint32_t i, error_cnt = 0;
    uint64_t prev_time = 0;
    int      tid = syscall(SYS_gettid);

    signal(SIGINT, ctrl_c_handler);

    for (i = 0; i < 10000 && run; i++) {
        uint64_t cur_time = milliseconds();

        if (cur_time < prev_time) {
            error_cnt++;
            fprintf(stderr, "Error: tid = %d: clock_gettime(): Prev = %" PRIu64
	            ", Cur = %" PRIu64 ", diff = -%" PRIu64 "\n",
		    tid, prev_time, cur_time, prev_time - cur_time);
        } else {
            fprintf(stdout, "Info:  tid = %d: clock_gettime(): Prev = %" PRIu64
	            ", Cur = %" PRIu64 ", diff =  %" PRIu64 "\n",
		    tid, prev_time, cur_time, cur_time - prev_time);
        }

        prev_time = cur_time;
    }

    fprintf(stderr, "tid = %d: Ran %u times. error_cnt = %u, success_cnt = %u\n",
            tid, i, error_cnt, i - error_cnt);

    return error_cnt ? -1 : 0;
}



* Re: [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases
  2018-08-30 18:01 ` [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases Paul Burton
  2018-08-31  8:58     ` Rene.Nielsen
@ 2018-08-31 13:17   ` Alexandre Belloni
  2018-08-31 15:12   ` Hauke Mehrtens
  2 siblings, 0 replies; 9+ messages in thread
From: Alexandre Belloni @ 2018-08-31 13:17 UTC (permalink / raw)
  To: Paul Burton; +Cc: Rene Nielsen, Hauke Mehrtens, linux-mips, James Hogan, stable

On 30/08/2018 11:01:21-0700, Paul Burton wrote:
> When a system suffers from dcache aliasing a user program may observe
> stale VDSO data from an aliased cache line. Notably this can break the
> expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name
> suggests, monotonic.
> 
> In order to ensure that users observe updates to the VDSO data page as
> intended, align the user mappings of the VDSO data page such that their
> cache colouring matches that of the virtual address range which the
> kernel will use to update the data page - typically its unmapped address
> within kseg0.
> 
> This ensures that we don't introduce aliasing cache lines for the VDSO
> data page, and therefore that userland will observe updates without
> requiring cache invalidation.
> 
> Signed-off-by: Paul Burton <paul.burton@mips.com>
> Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
> Reported-by: Rene Nielsen <rene.nielsen@microsemi.com>
> Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
> Cc: James Hogan <jhogan@kernel.org>
> Cc: linux-mips@linux-mips.org
> Cc: stable@vger.kernel.org # v4.4+
> ---
> Hi Alexandre,
> 
> Could you try this out on your Ocelot system? Hopefully it'll solve the
> problem just as well as James' patch but doesn't need the questionable
> change to arch_get_unmapped_area_common().
> 
> Thanks,
>     Paul
> ---
>  arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
> index 019035d7225c..5fb617a42335 100644
> --- a/arch/mips/kernel/vdso.c
> +++ b/arch/mips/kernel/vdso.c
> @@ -13,6 +13,7 @@
>  #include <linux/err.h>
>  #include <linux/init.h>
>  #include <linux/ioport.h>
> +#include <linux/kernel.h>
>  #include <linux/mm.h>
>  #include <linux/sched.h>
>  #include <linux/slab.h>
> @@ -20,6 +21,7 @@
>  
>  #include <asm/abi.h>
>  #include <asm/mips-cps.h>
> +#include <asm/page.h>
>  #include <asm/vdso.h>
>  
>  /* Kernel-provided data used by the VDSO. */
> @@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>  	vvar_size = gic_size + PAGE_SIZE;
>  	size = vvar_size + image->size;
>  
> +	/*
> +	 * Find a region that's large enough for us to perform the
> +	 * colour-matching alignment below.
> +	 */
> +	if (cpu_has_dc_aliases)
> +		size += shm_align_mask + 1;
> +
>  	base = get_unmapped_area(NULL, 0, size, 0, 0);
>  	if (IS_ERR_VALUE(base)) {
>  		ret = base;
>  		goto out;
>  	}
>  
> +	/*
> +	 * If we suffer from dcache aliasing, ensure that the VDSO data page is
> +	 * coloured the same as the kernel's mapping of that memory. This
> +	 * ensures that when the kernel updates the VDSO data userland will see
> +	 * it without requiring cache invalidations.
> +	 */
> +	if (cpu_has_dc_aliases) {
> +		base = __ALIGN_MASK(base, shm_align_mask);
> +		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
> +	}
> +
>  	data_addr = base + gic_size;
>  	vdso_addr = data_addr + PAGE_SIZE;
>  
> -- 
> 2.18.0
> 

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases
  2018-08-30 18:01 ` [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases Paul Burton
  2018-08-31  8:58     ` Rene.Nielsen
  2018-08-31 13:17   ` Alexandre Belloni
@ 2018-08-31 15:12   ` Hauke Mehrtens
  2018-09-03  7:29       ` Rene.Nielsen
  2 siblings, 1 reply; 9+ messages in thread
From: Hauke Mehrtens @ 2018-08-31 15:12 UTC (permalink / raw)
  To: Paul Burton, Alexandre Belloni, Rene Nielsen
  Cc: linux-mips, James Hogan, stable


[-- Attachment #1.1: Type: text/plain, Size: 4507 bytes --]

On 08/30/2018 08:01 PM, Paul Burton wrote:
> When a system suffers from dcache aliasing a user program may observe
> stale VDSO data from an aliased cache line. Notably this can break the
> expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name
> suggests, monotonic.
> 
> In order to ensure that users observe updates to the VDSO data page as
> intended, align the user mappings of the VDSO data page such that their
> cache colouring matches that of the virtual address range which the
> kernel will use to update the data page - typically its unmapped address
> within kseg0.
> 
> This ensures that we don't introduce aliasing cache lines for the VDSO
> data page, and therefore that userland will observe updates without
> requiring cache invalidation.
> 
> Signed-off-by: Paul Burton <paul.burton@mips.com>
> Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
> Reported-by: Rene Nielsen <rene.nielsen@microsemi.com>
> Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
> Cc: James Hogan <jhogan@kernel.org>
> Cc: linux-mips@linux-mips.org
> Cc: stable@vger.kernel.org # v4.4+

Tested-by: Hauke Mehrtens <hauke@hauke-m.de>


Without this patch ping shows these results on kernel 4.19-rc1 on the
Lantiq VR9 SoC to a PC directly connected to the LAN port:

root@OpenWrt:~# ping 192.168.1.195
PING 192.168.1.195 (192.168.1.195): 56 data bytes
64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.689 ms
64 bytes from 192.168.1.195: seq=1 ttl=64 time=236.527 ms
64 bytes from 192.168.1.195: seq=2 ttl=64 time=4294963.829 ms
64 bytes from 192.168.1.195: seq=3 ttl=64 time=4294423.824 ms
64 bytes from 192.168.1.195: seq=4 ttl=64 time=960.527 ms
64 bytes from 192.168.1.195: seq=5 ttl=64 time=472.530 ms
64 bytes from 192.168.1.195: seq=6 ttl=64 time=464.530 ms
64 bytes from 192.168.1.195: seq=7 ttl=64 time=452.530 ms

With this patch it looks like this:

root@OpenWrt:~# ping 192.168.1.195
PING 192.168.1.195 (192.168.1.195): 56 data bytes
64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.638 ms
64 bytes from 192.168.1.195: seq=1 ttl=64 time=0.573 ms
64 bytes from 192.168.1.195: seq=2 ttl=64 time=0.605 ms
64 bytes from 192.168.1.195: seq=3 ttl=64 time=0.524 ms
64 bytes from 192.168.1.195: seq=4 ttl=64 time=0.534 ms
64 bytes from 192.168.1.195: seq=5 ttl=64 time=0.518 ms
64 bytes from 192.168.1.195: seq=6 ttl=64 time=0.485 ms
64 bytes from 192.168.1.195: seq=7 ttl=64 time=0.501 ms
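
The two huge round-trip times in the first run (4294963.829 ms and
4294423.824 ms) sit just below 2^32 microseconds (4294967.296 ms). That is
what one would expect if the round-trip time is computed as an unsigned
32-bit microsecond difference and the clock appeared to step backwards
between send and receive. The sketch below only illustrates that
arithmetic; the helper name and the assumption about how ping computes the
value are hypothetical and were not checked against the ping source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: elapsed time in microseconds, computed in unsigned
 * 32-bit arithmetic. If end_us is earlier than start_us (the clock went
 * backwards), the subtraction wraps modulo 2^32 instead of going negative.
 */
static uint32_t elapsed_us32(uint32_t start_us, uint32_t end_us)
{
    return end_us - start_us;
}
```

Under that assumption, elapsed_us32(10000, 6533) is 4294963829, i.e. the
4294963.829 ms reported for seq=2 corresponds to the clock appearing to go
back by 3467 us, and the seq=3 value corresponds to a step back of
543472 us.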


> ---
> Hi Alexandre,
> 
> Could you try this out on your Ocelot system? Hopefully it'll solve the
> problem just as well as James' patch but doesn't need the questionable
> change to arch_get_unmapped_area_common().
> 
> Thanks,
>     Paul
> ---
>  arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
> index 019035d7225c..5fb617a42335 100644
> --- a/arch/mips/kernel/vdso.c
> +++ b/arch/mips/kernel/vdso.c
> @@ -13,6 +13,7 @@
>  #include <linux/err.h>
>  #include <linux/init.h>
>  #include <linux/ioport.h>
> +#include <linux/kernel.h>
>  #include <linux/mm.h>
>  #include <linux/sched.h>
>  #include <linux/slab.h>
> @@ -20,6 +21,7 @@
>  
>  #include <asm/abi.h>
>  #include <asm/mips-cps.h>
> +#include <asm/page.h>
>  #include <asm/vdso.h>
>  
>  /* Kernel-provided data used by the VDSO. */
> @@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>  	vvar_size = gic_size + PAGE_SIZE;
>  	size = vvar_size + image->size;
>  
> +	/*
> +	 * Find a region that's large enough for us to perform the
> +	 * colour-matching alignment below.
> +	 */
> +	if (cpu_has_dc_aliases)
> +		size += shm_align_mask + 1;
> +
>  	base = get_unmapped_area(NULL, 0, size, 0, 0);
>  	if (IS_ERR_VALUE(base)) {
>  		ret = base;
>  		goto out;
>  	}
>  
> +	/*
> +	 * If we suffer from dcache aliasing, ensure that the VDSO data page is
> +	 * coloured the same as the kernel's mapping of that memory. This
> +	 * ensures that when the kernel updates the VDSO data userland will see
> +	 * it without requiring cache invalidations.
> +	 */
> +	if (cpu_has_dc_aliases) {
> +		base = __ALIGN_MASK(base, shm_align_mask);
> +		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
> +	}
> +
>  	data_addr = base + gic_size;
>  	vdso_addr = data_addr + PAGE_SIZE;
>  
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases
  2018-08-31  8:58     ` Rene.Nielsen
  (?)
@ 2018-08-31 16:47     ` Paul Burton
  -1 siblings, 0 replies; 9+ messages in thread
From: Paul Burton @ 2018-08-31 16:47 UTC (permalink / raw)
  To: Rene.Nielsen; +Cc: alexandre.belloni, hauke, linux-mips, jhogan, stable

Hi Rene,

On Fri, Aug 31, 2018 at 08:58:27AM +0000, Rene.Nielsen@microchip.com wrote:
> With the error-producing version of vdso-chk-X.patch applied, apply
> Paul's patch and run the 'provoke' program again.
> 
> This also works.

Thanks - can I add your Tested-by?

> Paul's patch allocates twice the amount of needed VM, but I guess
> that's fine, as it's also less intrusive (no changes to mmap.c).

Note that get_unmapped_area() only finds an unmapped area - it doesn't
allocate it in the sense that further calls to get_unmapped_area() can
return the exact same memory. The actual use of the virtual address
ranges takes place when we call _install_special_mapping, and we still
use the same sizes there that we were before.

So my patch just causes us to look for a larger unmapped area (which
will typically be easy to find since we haven't even executed the
program yet), it doesn't actually use any more memory. We just find a
large area & then use only the part of it that we need to.

Since James' patch constrained get_unmapped_area() to look for a
suitably aligned piece of memory the result would have been much the
same - see the code in unmapped_area() where it also adds align_mask to
the length of the area we're looking for. It's just that with my patch
the alignment is being done by arch_setup_additional_pages() rather than
get_unmapped_area()/vm_unmapped_area(). That's not unprecedented either
- arch/nds32 already does something similar.
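
As a rough user-space illustration of the colour-matching arithmetic
described above (a sketch only, not the kernel code): align_mask_up()
mirrors what __ALIGN_MASK() does, and colour_matched_base() is a
hypothetical helper name. The mask argument stands in for shm_align_mask
and kernel_data_addr for the address of vdso_data.

```c
#include <stdint.h>

/* Round x up to the next multiple of (mask + 1); mask must be 2^n - 1. */
static uintptr_t align_mask_up(uintptr_t x, uintptr_t mask)
{
    return (x + mask) & ~mask;
}

/*
 * Hypothetical helper: starting from the address returned by the search
 * for an unmapped region, pick a base such that the data page, which
 * lives at base + gic_size, has the same colour bits (address & mask) as
 * the kernel's own address for that data.
 */
static uintptr_t colour_matched_base(uintptr_t region,
                                     uintptr_t kernel_data_addr,
                                     uintptr_t gic_size,
                                     uintptr_t mask)
{
    uintptr_t base = align_mask_up(region, mask);

    /* Offset so that (base + gic_size) & mask == kernel_data_addr & mask. */
    base += (kernel_data_addr - gic_size) & mask;
    return base;
}
```

With a 32 KiB colour mask (0x7fff), a region at 0x77e01000 and kernel data
at 0x80a3c000, the result is 0x77e0c000 when gic_size is 0, and the data
page then shares the kernel address's colour bits (0x4000). These values
are made up for illustration.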

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] MIPS: VDSO: Match data page cache colouring when D$ aliases
@ 2018-09-03  7:29       ` Rene.Nielsen
  0 siblings, 0 replies; 9+ messages in thread
From: Rene.Nielsen @ 2018-09-03  7:29 UTC (permalink / raw)
  To: hauke, paul.burton, alexandre.belloni
  Cc: linux-mips, jhogan, stable, rene.nielsen

On 08/31/2018 05:12 PM, Hauke Mehrtens wrote:
> On 08/30/2018 08:01 PM, Paul Burton wrote:
>> When a system suffers from dcache aliasing a user program may observe 
>> stale VDSO data from an aliased cache line. Notably this can break the 
>> expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name 
>> suggests, monotonic.
>> 
>> In order to ensure that users observe updates to the VDSO data page as 
>> intended, align the user mappings of the VDSO data page such that 
>> their cache colouring matches that of the virtual address range which 
>> the kernel will use to update the data page - typically its unmapped 
>> address within kseg0.
>> 
>> This ensures that we don't introduce aliasing cache lines for the VDSO 
>> data page, and therefore that userland will observe updates without 
>> requiring cache invalidation.
>> 
>> Signed-off-by: Paul Burton <paul.burton@mips.com>
>> Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
>> Reported-by: Rene Nielsen <rene.nielsen@microsemi.com>
>> Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
>> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
>> Cc: James Hogan <jhogan@kernel.org>
>> Cc: linux-mips@linux-mips.org
>> Cc: stable@vger.kernel.org # v4.4+

> Tested-by: Hauke Mehrtens <hauke@hauke-m.de>

Tested-by: Rene Nielsen <rene.nielsen@microchip.com>

> Without this patch ping shows these results on kernel 4.19-rc1 on the Lantiq VR9 SoC to a PC directly connected to the LAN port:

> root@OpenWrt:~# ping 192.168.1.195
> PING 192.168.1.195 (192.168.1.195): 56 data bytes
> 64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.689 ms
> 64 bytes from 192.168.1.195: seq=1 ttl=64 time=236.527 ms
> 64 bytes from 192.168.1.195: seq=2 ttl=64 time=4294963.829 ms
> 64 bytes from 192.168.1.195: seq=3 ttl=64 time=4294423.824 ms
> 64 bytes from 192.168.1.195: seq=4 ttl=64 time=960.527 ms
> 64 bytes from 192.168.1.195: seq=5 ttl=64 time=472.530 ms
> 64 bytes from 192.168.1.195: seq=6 ttl=64 time=464.530 ms
> 64 bytes from 192.168.1.195: seq=7 ttl=64 time=452.530 ms
>
> With this patch it looks like this:
>
> root@OpenWrt:~# ping 192.168.1.195
> PING 192.168.1.195 (192.168.1.195): 56 data bytes
> 64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.638 ms
> 64 bytes from 192.168.1.195: seq=1 ttl=64 time=0.573 ms
> 64 bytes from 192.168.1.195: seq=2 ttl=64 time=0.605 ms
> 64 bytes from 192.168.1.195: seq=3 ttl=64 time=0.524 ms
> 64 bytes from 192.168.1.195: seq=4 ttl=64 time=0.534 ms
> 64 bytes from 192.168.1.195: seq=5 ttl=64 time=0.518 ms
> 64 bytes from 192.168.1.195: seq=6 ttl=64 time=0.485 ms
> 64 bytes from 192.168.1.195: seq=7 ttl=64 time=0.501 ms
>
>
>> ---
>> Hi Alexandre,
>> 
>> Could you try this out on your Ocelot system? Hopefully it'll solve 
>> the problem just as well as James' patch but doesn't need the 
>> questionable change to arch_get_unmapped_area_common().
>> 
>> Thanks,
>>     Paul
>> ---
>>  arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>> 
>> diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
>> index 019035d7225c..5fb617a42335 100644
>> --- a/arch/mips/kernel/vdso.c
>> +++ b/arch/mips/kernel/vdso.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/err.h>
>>  #include <linux/init.h>
>>  #include <linux/ioport.h>
>> +#include <linux/kernel.h>
>>  #include <linux/mm.h>
>>  #include <linux/sched.h>
>>  #include <linux/slab.h>
>> @@ -20,6 +21,7 @@
>>  
>>  #include <asm/abi.h>
>>  #include <asm/mips-cps.h>
>> +#include <asm/page.h>
>>  #include <asm/vdso.h>
>>  
>>  /* Kernel-provided data used by the VDSO. */
>> @@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>>  	vvar_size = gic_size + PAGE_SIZE;
>>  	size = vvar_size + image->size;
>>  
>> +	/*
>> +	 * Find a region that's large enough for us to perform the
>> +	 * colour-matching alignment below.
>> +	 */
>> +	if (cpu_has_dc_aliases)
>> +		size += shm_align_mask + 1;
>> +
>>  	base = get_unmapped_area(NULL, 0, size, 0, 0);
>>  	if (IS_ERR_VALUE(base)) {
>>  		ret = base;
>>  		goto out;
>>  	}
>>  
>> +	/*
>> +	 * If we suffer from dcache aliasing, ensure that the VDSO data page is
>> +	 * coloured the same as the kernel's mapping of that memory. This
>> +	 * ensures that when the kernel updates the VDSO data userland will see
>> +	 * it without requiring cache invalidations.
>> +	 */
>> +	if (cpu_has_dc_aliases) {
>> +		base = __ALIGN_MASK(base, shm_align_mask);
>> +		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
>> +	}
>> +
>>  	data_addr = base + gic_size;
>>  	vdso_addr = data_addr + PAGE_SIZE;
>>  
>> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-09-03 11:49 UTC | newest]
