linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
@ 2017-05-01 11:41 Baoquan He
  2017-05-01 14:15 ` Thomas Garnier
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Baoquan He @ 2017-05-01 11:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Kees Cook, Thomas Garnier, Andrew Morton, Yasuaki Ishimatsu,
	Jinbum Park, Dave Hansen, Kirill A. Shutemov, Yinghai Lu,
	Dan Williams, Dave Young

Jeff Moyer reported that on his system, which has two memory regions,
0~64G and 1T~1T+192G, and is booted with the kernel option
"memmap=192G!1024G", enabling KASLR makes the system hang intermittently
during boot, while adding 'nokaslr' does not.

This is because the for loop in sync_global_pgds advances its address
incorrectly. When a mapping area crosses pgd entries, the next iteration
should start at the first address covered by the next pgd entry, instead
of simply adding PGDIR_SIZE to the current address. The old code works
correctly only if the size of the mapping area is a multiple of
PGDIR_SIZE; otherwise the end of the region can be skipped, so it is
never synchronized from the kernel pgd, init_mm.pgd, to all other
processes.
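
To illustrate the difference between the two stepping rules, here is a
minimal user-space sketch, not kernel code. It assumes x86-64 4-level
paging, so PGDIR_SHIFT is 39 and one pgd entry covers 512G. The helper
names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging (not taken from kernel headers). */
#define PGDIR_SHIFT 39
#define PGDIR_SIZE  (UINT64_C(1) << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Old stepping: add PGDIR_SIZE to the (possibly unaligned) address. */
static unsigned int entries_visited_old(uint64_t start, uint64_t end)
{
	unsigned int n = 0;
	uint64_t address;

	assert(start <= end);
	for (address = start; address <= end; address += PGDIR_SIZE)
		n++;
	return n;
}

/* New stepping: jump to the first address covered by the next pgd entry. */
static unsigned int entries_visited_new(uint64_t start, uint64_t end)
{
	unsigned int n = 0;
	uint64_t address;

	assert(start <= end);
	for (address = start; address <= end;
	     address = (address & PGDIR_MASK) + PGDIR_SIZE)
		n++;
	return n;
}

/* Number of pgd entries that the inclusive range [start, end] spans. */
static unsigned int entries_spanned(uint64_t start, uint64_t end)
{
	assert(start <= end);
	return (unsigned int)((end >> PGDIR_SHIFT) - (start >> PGDIR_SHIFT)) + 1;
}
```

For a 192G area starting at a hypothetical offset of 448G (so it ends
just below 640G), entries_spanned() and entries_visited_new() both
return 2, while entries_visited_old() returns 1: the second pgd entry
is never visited.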

On Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is then 1T aligned, so
the area is mapped inside a single pgd entry. With KASLR enabled, the
area can cross two pgd entries, and the second pgd entry is then not
synced to all other processes. That is why we saw an empty PGD.
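
The pgd arithmetic can be checked with a small user-space sketch. It
assumes 4-level paging (PGDIR_SHIFT of 39, 512 entries per pgd) and a
direct mapping that places physical address P at base + P. To my
knowledge 0xffff880000000000 is the default PAGE_OFFSET of this kernel
generation when KASLR is off. 0xffff9257c0000000 is only a hypothetical
randomized base, picked so that the end of the pmem area lands near the
CR2 value in the back trace. The helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging (not taken from kernel headers). */
#define PGDIR_SHIFT  39
#define PTRS_PER_PGD 512

/* Index of the pgd entry that covers a virtual address. */
static unsigned int pgd_index_of(uint64_t vaddr)
{
	return (unsigned int)((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Returns 1 if the direct mapping of the physical range
 * [phys_start, phys_start + size) touches more than one pgd entry
 * for the given direct-mapping base, and 0 otherwise.
 */
static int crosses_pgd(uint64_t base, uint64_t phys_start, uint64_t size)
{
	uint64_t first, last;

	assert(size > 0);
	first = base + phys_start;
	last = first + size - 1;
	return pgd_index_of(first) != pgd_index_of(last);
}
```

With the non-randomized base, the 192G area at 1T stays inside pgd
entry 274, so crosses_pgd() returns 0. With the hypothetical randomized
base, the area starts in entry 294 and ends in entry 295, which is also
the entry that covers the faulting address 0xffff9387bfff0000.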

Fix it in this patch.

The back trace is pasted below:

[    9.988867] IP: memcpy_erms+0x6/0x10
[    9.988868] PGD 0
[    9.988868]
[    9.988870] Oops: 0000 [#1] SMP
[    9.988871] Modules linked in: isci(E) mgag200(E+) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) igb(E) ahci(E) ttm(E) libsas(E) libahci(E) scsi_transport_sas(E) ptp(E) pps_core(E) nd_pmem(E) dca(E) drm(E) i2c_algo_bit(E) libata(E) crc32c_intel(E) nd_btt(E) i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
[    9.988886] CPU: 0 PID: 442 Comm: systemd-udevd Tainted: G            E   4.11.0-rc5+ #43
[    9.988887] Hardware name: Intel Corporation LH Pass/SVRBD-ROW_P, BIOS SE5C600.86B.02.01.SP06.050920141054 05/09/2014
[    9.988888] task: ffff9267dc2f8000 task.stack: ffffba92c783c000
[    9.988890] RIP: 0010:memcpy_erms+0x6/0x10
[    9.988891] RSP: 0018:ffffba92c783f9b8 EFLAGS: 00010286
[    9.988892] RAX: ffff925f19e27000 RBX: 0000000000000000 RCX: 0000000000001000
[    9.988893] RDX: 0000000000001000 RSI: ffff9387bfff0000 RDI: ffff925f19e27000
[    9.988893] RBP: ffffba92c783fa38 R08: 0000000000000000 R09: 0000000017ffff80
[    9.988894] R10: 0000000000000000 R11: ffff9387bfff0000 R12: ffff925fde811ed8
[    9.988895] R13: 0000002fffff0000 R14: 0000000000001000 R15: ffff925f19e27000
[    9.988896] FS:  00007f1ee18e68c0(0000) GS:ffff925fdec00000(0000) knlGS:0000000000000000
[    9.988896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.988897] CR2: ffff9387bfff0000 CR3: 000000081ba28000 CR4: 00000000001406f0
[    9.988897] Call Trace:
[    9.988902]  ? pmem_do_bvec+0x93/0x290 [nd_pmem]
[    9.988904]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
[    9.988905]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
[    9.988907]  pmem_rw_page+0x3a/0x60 [nd_pmem]
[    9.988909]  bdev_read_page+0x81/0xb0
[    9.988911]  do_mpage_readpage+0x56f/0x770
[    9.988912]  ? I_BDEV+0x20/0x20
[    9.988915]  ? lru_cache_add+0xe/0x10
[    9.988917]  mpage_readpages+0x148/0x1e0
[    9.988917]  ? I_BDEV+0x20/0x20
[    9.988918]  ? I_BDEV+0x20/0x20
[    9.988921]  ? alloc_pages_current+0x88/0x120
[    9.988923]  blkdev_readpages+0x1d/0x20
[    9.988924]  __do_page_cache_readahead+0x1ce/0x2c0
[    9.988926]  force_page_cache_readahead+0xa2/0x100
[    9.988927]  page_cache_sync_readahead+0x3f/0x50
[    9.988930]  generic_file_read_iter+0x60d/0x8c0
[    9.988931]  blkdev_read_iter+0x37/0x40
[    9.988933]  __vfs_read+0xe0/0x150
[    9.988934]  vfs_read+0x8c/0x130
[    9.988936]  SyS_read+0x55/0xc0
[    9.988939]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[    9.988940] RIP: 0033:0x7f1ee0822480
[    9.988941] RSP: 002b:00007ffcf9e741f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[    9.988942] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1ee0822480
[    9.988943] RDX: 0000000000000040 RSI: 0000561b7e1aabc8 RDI: 0000000000000008
[    9.988943] RBP: 0000561b7e1a86a0 R08: 0000000000000005 R09: 0000000000000068
[    9.988944] R10: 00007ffcf9e73f80 R11: 0000000000000246 R12: 0000000000000000
[    9.988945] R13: 0000000000000001 R14: 0000561b7e1a61b0 R15: 0000561b7e1a55e0
[    9.988946] Code: ff 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
[    9.988962] RIP: memcpy_erms+0x6/0x10 RSP: ffffba92c783f9b8
[    9.988962] CR2: ffff9387bfff0000
[    9.989022] ---[ end trace fe34c0fc0fe685ab ]---
[    9.998690] Kernel panic - not syncing: Fatal exception
[   10.004708] Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com> 
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org 
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Garnier <thgarnie@google.com> 
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Cc: Jinbum Park <jinb.park7@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com> 
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Young <dyoung@redhat.com>
---
 arch/x86/mm/init_64.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 15173d3..dbf4f00 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
  */
 void sync_global_pgds(unsigned long start, unsigned long end)
 {
-	unsigned long address;
+	unsigned long address, address_next;
 
-	for (address = start; address <= end; address += PGDIR_SIZE) {
+	for (address = start; address <= end; address = address_next) {
 		const pgd_t *pgd_ref = pgd_offset_k(address);
 		struct page *page;
 
+		address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
+
 		if (pgd_none(*pgd_ref))
 			continue;
 
-- 
2.5.5


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 11:41 [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds Baoquan He
@ 2017-05-01 14:15 ` Thomas Garnier
  2017-05-01 14:40 ` Dan Williams
  2017-05-01 22:37 ` Yinghai Lu
  2 siblings, 0 replies; 10+ messages in thread
From: Thomas Garnier @ 2017-05-01 14:15 UTC (permalink / raw)
  To: Baoquan He
  Cc: LKML, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Kees Cook, Andrew Morton,
	Yasuaki Ishimatsu, Jinbum Park, Dave Hansen, Kirill A. Shutemov,
	Yinghai Lu, Dan Williams, Dave Young

On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
>
> Jeff Moyer reported that on his system, which has two memory regions,
> 0~64G and 1T~1T+192G, and is booted with the kernel option
> "memmap=192G!1024G", enabling KASLR makes the system hang intermittently
> during boot, while adding 'nokaslr' does not.
>
> This is because the for loop in sync_global_pgds advances its address
> incorrectly. When a mapping area crosses pgd entries, the next iteration
> should start at the first address covered by the next pgd entry, instead
> of simply adding PGDIR_SIZE to the current address. The old code works
> correctly only if the size of the mapping area is a multiple of
> PGDIR_SIZE; otherwise the end of the region can be skipped, so it is
> never synchronized from the kernel pgd, init_mm.pgd, to all other
> processes.
>
> On Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
> PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is then 1T aligned, so
> the area is mapped inside a single pgd entry. With KASLR enabled, the
> area can cross two pgd entries, and the second pgd entry is then not
> synced to all other processes. That is why we saw an empty PGD.

Makes a lot of sense. Thanks a lot for investigating this issue!

Acked-by: Thomas Garnier <thgarnie@google.com>

>
> Fix it in this patch.
>
> The back trace is pasted below:
>
> [    9.988867] IP: memcpy_erms+0x6/0x10
> [    9.988868] PGD 0
> [    9.988868]
> [    9.988870] Oops: 0000 [#1] SMP
> [    9.988871] Modules linked in: isci(E) mgag200(E+) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) igb(E) ahci(E) ttm(E) libsas(E) libahci(E) scsi_transport_sas(E) ptp(E) pps_core(E) nd_pmem(E) dca(E) drm(E) i2c_algo_bit(E) libata(E) crc32c_intel(E) nd_btt(E) i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> [    9.988886] CPU: 0 PID: 442 Comm: systemd-udevd Tainted: G            E   4.11.0-rc5+ #43
> [    9.988887] Hardware name: Intel Corporation LH Pass/SVRBD-ROW_P, BIOS SE5C600.86B.02.01.SP06.050920141054 05/09/2014
> [    9.988888] task: ffff9267dc2f8000 task.stack: ffffba92c783c000
> [    9.988890] RIP: 0010:memcpy_erms+0x6/0x10
> [    9.988891] RSP: 0018:ffffba92c783f9b8 EFLAGS: 00010286
> [    9.988892] RAX: ffff925f19e27000 RBX: 0000000000000000 RCX: 0000000000001000
> [    9.988893] RDX: 0000000000001000 RSI: ffff9387bfff0000 RDI: ffff925f19e27000
> [    9.988893] RBP: ffffba92c783fa38 R08: 0000000000000000 R09: 0000000017ffff80
> [    9.988894] R10: 0000000000000000 R11: ffff9387bfff0000 R12: ffff925fde811ed8
> [    9.988895] R13: 0000002fffff0000 R14: 0000000000001000 R15: ffff925f19e27000
> [    9.988896] FS:  00007f1ee18e68c0(0000) GS:ffff925fdec00000(0000) knlGS:0000000000000000
> [    9.988896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    9.988897] CR2: ffff9387bfff0000 CR3: 000000081ba28000 CR4: 00000000001406f0
> [    9.988897] Call Trace:
> [    9.988902]  ? pmem_do_bvec+0x93/0x290 [nd_pmem]
> [    9.988904]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
> [    9.988905]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
> [    9.988907]  pmem_rw_page+0x3a/0x60 [nd_pmem]
> [    9.988909]  bdev_read_page+0x81/0xb0
> [    9.988911]  do_mpage_readpage+0x56f/0x770
> [    9.988912]  ? I_BDEV+0x20/0x20
> [    9.988915]  ? lru_cache_add+0xe/0x10
> [    9.988917]  mpage_readpages+0x148/0x1e0
> [    9.988917]  ? I_BDEV+0x20/0x20
> [    9.988918]  ? I_BDEV+0x20/0x20
> [    9.988921]  ? alloc_pages_current+0x88/0x120
> [    9.988923]  blkdev_readpages+0x1d/0x20
> [    9.988924]  __do_page_cache_readahead+0x1ce/0x2c0
> [    9.988926]  force_page_cache_readahead+0xa2/0x100
> [    9.988927]  page_cache_sync_readahead+0x3f/0x50
> [    9.988930]  generic_file_read_iter+0x60d/0x8c0
> [    9.988931]  blkdev_read_iter+0x37/0x40
> [    9.988933]  __vfs_read+0xe0/0x150
> [    9.988934]  vfs_read+0x8c/0x130
> [    9.988936]  SyS_read+0x55/0xc0
> [    9.988939]  entry_SYSCALL_64_fastpath+0x1a/0xa9
> [    9.988940] RIP: 0033:0x7f1ee0822480
> [    9.988941] RSP: 002b:00007ffcf9e741f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> [    9.988942] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1ee0822480
> [    9.988943] RDX: 0000000000000040 RSI: 0000561b7e1aabc8 RDI: 0000000000000008
> [    9.988943] RBP: 0000561b7e1a86a0 R08: 0000000000000005 R09: 0000000000000068
> [    9.988944] R10: 00007ffcf9e73f80 R11: 0000000000000246 R12: 0000000000000000
> [    9.988945] R13: 0000000000000001 R14: 0000561b7e1a61b0 R15: 0000561b7e1a55e0
> [    9.988946] Code: ff 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
> [    9.988962] RIP: memcpy_erms+0x6/0x10 RSP: ffffba92c783f9b8
> [    9.988962] CR2: ffff9387bfff0000
> [    9.989022] ---[ end trace fe34c0fc0fe685ab ]---
> [    9.998690] Kernel panic - not syncing: Fatal exception
> [   10.004708] Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
> Reported-by: Jeff Moyer <jmoyer@redhat.com>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
> Cc: Jinbum Park <jinb.park7@gmail.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Yinghai Lu <yinghai@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Young <dyoung@redhat.com>
> ---
>  arch/x86/mm/init_64.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 15173d3..dbf4f00 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
>   */
>  void sync_global_pgds(unsigned long start, unsigned long end)
>  {
> -       unsigned long address;
> +       unsigned long address, address_next;
>
> -       for (address = start; address <= end; address += PGDIR_SIZE) {
> +       for (address = start; address <= end; address = address_next) {
>                 const pgd_t *pgd_ref = pgd_offset_k(address);
>                 struct page *page;
>
> +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> +
>                 if (pgd_none(*pgd_ref))
>                         continue;
>
> --
> 2.5.5
>



-- 
Thomas


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 11:41 [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds Baoquan He
  2017-05-01 14:15 ` Thomas Garnier
@ 2017-05-01 14:40 ` Dan Williams
  2017-05-01 14:52   ` Baoquan He
  2017-05-01 22:37 ` Yinghai Lu
  2 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2017-05-01 14:40 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	X86 ML, Kees Cook, Thomas Garnier, Andrew Morton,
	Yasuaki Ishimatsu, Jinbum Park, Dave Hansen, Kirill A. Shutemov,
	Yinghai Lu, Dave Young

On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> Jeff Moyer reported that on his system, which has two memory regions,
> 0~64G and 1T~1T+192G, and is booted with the kernel option
> "memmap=192G!1024G", enabling KASLR makes the system hang intermittently
> during boot, while adding 'nokaslr' does not.
>
> This is because the for loop in sync_global_pgds advances its address
> incorrectly. When a mapping area crosses pgd entries, the next iteration
> should start at the first address covered by the next pgd entry, instead
> of simply adding PGDIR_SIZE to the current address. The old code works
> correctly only if the size of the mapping area is a multiple of
> PGDIR_SIZE; otherwise the end of the region can be skipped, so it is
> never synchronized from the kernel pgd, init_mm.pgd, to all other
> processes.
>
> On Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
> PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is then 1T aligned, so
> the area is mapped inside a single pgd entry. With KASLR enabled, the
> area can cross two pgd entries, and the second pgd entry is then not
> synced to all other processes. That is why we saw an empty PGD.
>
> Fix it in this patch.
>
> The back trace is pasted below:
>
> [    9.988867] IP: memcpy_erms+0x6/0x10
> [    9.988868] PGD 0
> [    9.988868]
> [    9.988870] Oops: 0000 [#1] SMP
> [    9.988871] Modules linked in: isci(E) mgag200(E+) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) igb(E) ahci(E) ttm(E) libsas(E) libahci(E) scsi_transport_sas(E) ptp(E) pps_core(E) nd_pmem(E) dca(E) drm(E) i2c_algo_bit(E) libata(E) crc32c_intel(E) nd_btt(E) i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> [    9.988886] CPU: 0 PID: 442 Comm: systemd-udevd Tainted: G            E   4.11.0-rc5+ #43
> [    9.988887] Hardware name: Intel Corporation LH Pass/SVRBD-ROW_P, BIOS SE5C600.86B.02.01.SP06.050920141054 05/09/2014
> [    9.988888] task: ffff9267dc2f8000 task.stack: ffffba92c783c000
> [    9.988890] RIP: 0010:memcpy_erms+0x6/0x10
> [    9.988891] RSP: 0018:ffffba92c783f9b8 EFLAGS: 00010286
> [    9.988892] RAX: ffff925f19e27000 RBX: 0000000000000000 RCX: 0000000000001000
> [    9.988893] RDX: 0000000000001000 RSI: ffff9387bfff0000 RDI: ffff925f19e27000
> [    9.988893] RBP: ffffba92c783fa38 R08: 0000000000000000 R09: 0000000017ffff80
> [    9.988894] R10: 0000000000000000 R11: ffff9387bfff0000 R12: ffff925fde811ed8
> [    9.988895] R13: 0000002fffff0000 R14: 0000000000001000 R15: ffff925f19e27000
> [    9.988896] FS:  00007f1ee18e68c0(0000) GS:ffff925fdec00000(0000) knlGS:0000000000000000
> [    9.988896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    9.988897] CR2: ffff9387bfff0000 CR3: 000000081ba28000 CR4: 00000000001406f0
> [    9.988897] Call Trace:
> [    9.988902]  ? pmem_do_bvec+0x93/0x290 [nd_pmem]
> [    9.988904]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
> [    9.988905]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
> [    9.988907]  pmem_rw_page+0x3a/0x60 [nd_pmem]
> [    9.988909]  bdev_read_page+0x81/0xb0
> [    9.988911]  do_mpage_readpage+0x56f/0x770
> [    9.988912]  ? I_BDEV+0x20/0x20
> [    9.988915]  ? lru_cache_add+0xe/0x10
> [    9.988917]  mpage_readpages+0x148/0x1e0
> [    9.988917]  ? I_BDEV+0x20/0x20
> [    9.988918]  ? I_BDEV+0x20/0x20
> [    9.988921]  ? alloc_pages_current+0x88/0x120
> [    9.988923]  blkdev_readpages+0x1d/0x20
> [    9.988924]  __do_page_cache_readahead+0x1ce/0x2c0
> [    9.988926]  force_page_cache_readahead+0xa2/0x100
> [    9.988927]  page_cache_sync_readahead+0x3f/0x50
> [    9.988930]  generic_file_read_iter+0x60d/0x8c0
> [    9.988931]  blkdev_read_iter+0x37/0x40
> [    9.988933]  __vfs_read+0xe0/0x150
> [    9.988934]  vfs_read+0x8c/0x130
> [    9.988936]  SyS_read+0x55/0xc0
> [    9.988939]  entry_SYSCALL_64_fastpath+0x1a/0xa9
> [    9.988940] RIP: 0033:0x7f1ee0822480
> [    9.988941] RSP: 002b:00007ffcf9e741f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> [    9.988942] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1ee0822480
> [    9.988943] RDX: 0000000000000040 RSI: 0000561b7e1aabc8 RDI: 0000000000000008
> [    9.988943] RBP: 0000561b7e1a86a0 R08: 0000000000000005 R09: 0000000000000068
> [    9.988944] R10: 00007ffcf9e73f80 R11: 0000000000000246 R12: 0000000000000000
> [    9.988945] R13: 0000000000000001 R14: 0000561b7e1a61b0 R15: 0000561b7e1a55e0
> [    9.988946] Code: ff 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
> [    9.988962] RIP: memcpy_erms+0x6/0x10 RSP: ffffba92c783f9b8
> [    9.988962] CR2: ffff9387bfff0000
> [    9.989022] ---[ end trace fe34c0fc0fe685ab ]---
> [    9.998690] Kernel panic - not syncing: Fatal exception
> [   10.004708] Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
> Reported-by: Jeff Moyer <jmoyer@redhat.com>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
> Cc: Jinbum Park <jinb.park7@gmail.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Yinghai Lu <yinghai@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Young <dyoung@redhat.com>
> ---

Good catch!

>  arch/x86/mm/init_64.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 15173d3..dbf4f00 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
>   */
>  void sync_global_pgds(unsigned long start, unsigned long end)
>  {
> -       unsigned long address;
> +       unsigned long address, address_next;
>
> -       for (address = start; address <= end; address += PGDIR_SIZE) {
> +       for (address = start; address <= end; address = address_next) {
>                 const pgd_t *pgd_ref = pgd_offset_k(address);
>                 struct page *page;
>
> +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> +

Let's change this to put the next address calculation in the for loop
directly and use the ALIGN macro. Something like:

 for (address = start; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))
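
For reference, a user-space sketch of why this should be equivalent to
the open-coded (address & PGDIR_MASK) + PGDIR_SIZE. ALIGN_UP below is a
local stand-in that mirrors the usual round-up-to-a-power-of-two idiom;
it is not copied from kernel headers. PGDIR_SHIFT of 39 (4-level
paging) is assumed and the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging (not taken from kernel headers). */
#define PGDIR_SHIFT 39
#define PGDIR_SIZE  (UINT64_C(1) << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Round x up to the next multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Open-coded form from the original patch. */
static uint64_t next_pgd_masked(uint64_t address)
{
	return (address & PGDIR_MASK) + PGDIR_SIZE;
}

/* Form suggested here: round address + 1 up to a pgd boundary. */
static uint64_t next_pgd_aligned(uint64_t address)
{
	return ALIGN_UP(address + 1, PGDIR_SIZE);
}

/* Both forms must yield the start of the next pgd entry. */
static void check_same(uint64_t address)
{
	assert(next_pgd_masked(address) == next_pgd_aligned(address));
}
```

The "+ 1" matters for an address that already sits on a pgd boundary:
rounding the address itself up would return the same address and the
loop would never advance.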


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 14:40 ` Dan Williams
@ 2017-05-01 14:52   ` Baoquan He
  2017-05-01 15:24     ` Dan Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-05-01 14:52 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	X86 ML, Kees Cook, Thomas Garnier, Andrew Morton,
	Yasuaki Ishimatsu, Jinbum Park, Dave Hansen, Kirill A. Shutemov,
	Yinghai Lu, Dave Young

On 05/01/17 at 07:40am, Dan Williams wrote:
> On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> >  arch/x86/mm/init_64.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index 15173d3..dbf4f00 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
> >   */
> >  void sync_global_pgds(unsigned long start, unsigned long end)
> >  {
> > -       unsigned long address;
> > +       unsigned long address, address_next;
> >
> > -       for (address = start; address <= end; address += PGDIR_SIZE) {
> > +       for (address = start; address <= end; address = address_next) {
> >                 const pgd_t *pgd_ref = pgd_offset_k(address);
> >                 struct page *page;
> >
> > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> > +
> 
> Let's change this to put the next address calculation in the for loop
> directly and use the ALIGN macro. Something like:
> 
>  for (address = start; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))

Hi Dan,

Good idea!

Is the change below OK with you? Moving the initialization out of the
for statement keeps that line shorter than 80 characters.


diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 15173d3..0840311 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
  */
 void sync_global_pgds(unsigned long start, unsigned long end)
 {
-       unsigned long address;
+       unsigned long address = start;
 
-       for (address = start; address <= end; address += PGDIR_SIZE) {
+       for (; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))
{
                const pgd_t *pgd_ref = pgd_offset_k(address);
                struct page *page;
 
+               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
+
                if (pgd_none(*pgd_ref))
                        continue;


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 14:52   ` Baoquan He
@ 2017-05-01 15:24     ` Dan Williams
  2017-05-01 15:31       ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2017-05-01 15:24 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	X86 ML, Kees Cook, Thomas Garnier, Andrew Morton,
	Yasuaki Ishimatsu, Jinbum Park, Dave Hansen, Kirill A. Shutemov,
	Yinghai Lu, Dave Young

On Mon, May 1, 2017 at 7:52 AM, Baoquan He <bhe@redhat.com> wrote:
> On 05/01/17 at 07:40am, Dan Williams wrote:
>> On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
>> >  arch/x86/mm/init_64.c | 6 ++++--
>> >  1 file changed, 4 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> > index 15173d3..dbf4f00 100644
>> > --- a/arch/x86/mm/init_64.c
>> > +++ b/arch/x86/mm/init_64.c
>> > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
>> >   */
>> >  void sync_global_pgds(unsigned long start, unsigned long end)
>> >  {
>> > -       unsigned long address;
>> > +       unsigned long address, address_next;
>> >
>> > -       for (address = start; address <= end; address += PGDIR_SIZE) {
>> > +       for (address = start; address <= end; address = address_next) {
>> >                 const pgd_t *pgd_ref = pgd_offset_k(address);
>> >                 struct page *page;
>> >
>> > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
>> > +
>>
>> Let's change this to put the next address calculation in the for loop
>> directly and use the ALIGN macro. Something like:
>>
>>  for (address = start; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))
>
> Hi Dan,
>
> Good idea!
>
> Is the change below OK with you? Moving the initialization out of the
> for statement keeps that line shorter than 80 characters.
>

I would just wrap the "address = ALIGN(address + 1, PGDIR_SIZE)" if it
doesn't fit.

>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 15173d3..0840311 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
>   */
>  void sync_global_pgds(unsigned long start, unsigned long end)
>  {
> -       unsigned long address;
> +       unsigned long address = start;
>
> -       for (address = start; address <= end; address += PGDIR_SIZE) {
> +       for (; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))
> {
>                 const pgd_t *pgd_ref = pgd_offset_k(address);
>                 struct page *page;
>
> +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> +

This gets deleted of course.


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 15:24     ` Dan Williams
@ 2017-05-01 15:31       ` Baoquan He
  0 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-05-01 15:31 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	X86 ML, Kees Cook, Thomas Garnier, Andrew Morton,
	Yasuaki Ishimatsu, Jinbum Park, Dave Hansen, Kirill A. Shutemov,
	Yinghai Lu, Dave Young

On 05/01/17 at 08:24am, Dan Williams wrote:
> On Mon, May 1, 2017 at 7:52 AM, Baoquan He <bhe@redhat.com> wrote:
> > On 05/01/17 at 07:40am, Dan Williams wrote:
> >> On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> >> >  arch/x86/mm/init_64.c | 6 ++++--
> >> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> >> > index 15173d3..dbf4f00 100644
> >> > --- a/arch/x86/mm/init_64.c
> >> > +++ b/arch/x86/mm/init_64.c
> >> > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
> >> >   */
> >> >  void sync_global_pgds(unsigned long start, unsigned long end)
> >> >  {
> >> > -       unsigned long address;
> >> > +       unsigned long address, address_next;
> >> >
> >> > -       for (address = start; address <= end; address += PGDIR_SIZE) {
> >> > +       for (address = start; address <= end; address = address_next) {
> >> >                 const pgd_t *pgd_ref = pgd_offset_k(address);
> >> >                 struct page *page;
> >> >
> >> > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> >> > +
> >>
> >> Let's change this to put the next address calculation in the for loop
> >> directly and use the ALIGN macro. Something like:
> >>
> >>  for (address = start; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))
> >
> > Hi Dan,
> >
> > Good idea!
> >
> > Is the change below OK with you? Moving the initialization out of the
> > for statement keeps that line shorter than 80 characters.
> >
> 
> I would just wrap the "address = ALIGN(address + 1, PGDIR_SIZE)" if it
> doesn't fit.

OK, that's fine, I will wrap it.

> 
> >
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index 15173d3..0840311 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
> >   */
> >  void sync_global_pgds(unsigned long start, unsigned long end)
> >  {
> > -       unsigned long address;
> > +       unsigned long address = start;
> >
> > -       for (address = start; address <= end; address += PGDIR_SIZE) {
> > +       for (; address <= end; address = ALIGN(address + 1, PGDIR_SIZE))
> > {
> >                 const pgd_t *pgd_ref = pgd_offset_k(address);
> >                 struct page *page;
> >
> > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> > +
> 
> This gets deleted of course.

Sure, I forgot to delete it here. The code I am testing on Jeff's system
already has it removed, otherwise it would not compile.

Will repost once the pgd-crossing case has been hit and the test passes.

Thanks
Baoquan


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 11:41 [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds Baoquan He
  2017-05-01 14:15 ` Thomas Garnier
  2017-05-01 14:40 ` Dan Williams
@ 2017-05-01 22:37 ` Yinghai Lu
  2017-05-02  7:18   ` Baoquan He
  2 siblings, 1 reply; 10+ messages in thread
From: Yinghai Lu @ 2017-05-01 22:37 UTC (permalink / raw)
  To: Baoquan He
  Cc: Linux Kernel Mailing List, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Kees Cook,
	Thomas Garnier, Andrew Morton, Yasuaki Ishimatsu, Jinbum Park,
	Dave Hansen, Kirill A. Shutemov, Dan Williams, Dave Young

On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> Jeff Moyer reported that on his system, which has two memory regions,
> 0~64G and 1T~1T+192G, and is booted with the kernel option
> "memmap=192G!1024G", enabling KASLR makes the system hang intermittently
> during boot, while adding 'nokaslr' does not.
>
> This is because the for loop in sync_global_pgds advances its address
> incorrectly. When a mapping area crosses pgd entries, the next iteration
> should start at the first address covered by the next pgd entry, instead
> of simply adding PGDIR_SIZE to the current address. The old code works
> correctly only if the size of the mapping area is a multiple of
> PGDIR_SIZE; otherwise the end of the region can be skipped, so it is
> never synchronized from the kernel pgd, init_mm.pgd, to all other
> processes.
>
> On Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
> PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is then 1T aligned, so
> the area is mapped inside a single pgd entry. With KASLR enabled, the
> area can cross two pgd entries, and the second pgd entry is then not
> synced to all other processes. That is why we saw an empty PGD.
>
> Fix it in this patch.
>
> The back trace is pasted below:
>
> [    9.988867] IP: memcpy_erms+0x6/0x10
> [    9.988868] PGD 0
> [    9.988868]
> [    9.988870] Oops: 0000 [#1] SMP
> [    9.988871] Modules linked in: isci(E) mgag200(E+) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) igb(E) ahci(E) ttm(E) libsas(E) libahci(E) scsi_transport_sas(E) ptp(E) pps_core(E) nd_pmem(E) dca(E) drm(E) i2c_algo_bit(E) libata(E) crc32c_intel(E) nd_btt(E) i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> [    9.988886] CPU: 0 PID: 442 Comm: systemd-udevd Tainted: G            E   4.11.0-rc5+ #43
> [    9.988887] Hardware name: Intel Corporation LH Pass/SVRBD-ROW_P, BIOS SE5C600.86B.02.01.SP06.050920141054 05/09/2014
> [    9.988888] task: ffff9267dc2f8000 task.stack: ffffba92c783c000
> [    9.988890] RIP: 0010:memcpy_erms+0x6/0x10
> [    9.988891] RSP: 0018:ffffba92c783f9b8 EFLAGS: 00010286
> [    9.988892] RAX: ffff925f19e27000 RBX: 0000000000000000 RCX: 0000000000001000
> [    9.988893] RDX: 0000000000001000 RSI: ffff9387bfff0000 RDI: ffff925f19e27000
> [    9.988893] RBP: ffffba92c783fa38 R08: 0000000000000000 R09: 0000000017ffff80
> [    9.988894] R10: 0000000000000000 R11: ffff9387bfff0000 R12: ffff925fde811ed8
> [    9.988895] R13: 0000002fffff0000 R14: 0000000000001000 R15: ffff925f19e27000
> [    9.988896] FS:  00007f1ee18e68c0(0000) GS:ffff925fdec00000(0000) knlGS:0000000000000000
> [    9.988896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    9.988897] CR2: ffff9387bfff0000 CR3: 000000081ba28000 CR4: 00000000001406f0
> [    9.988897] Call Trace:
> [    9.988902]  ? pmem_do_bvec+0x93/0x290 [nd_pmem]
> [    9.988904]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
> [    9.988905]  ? radix_tree_node_alloc.constprop.20+0x85/0xc0
> [    9.988907]  pmem_rw_page+0x3a/0x60 [nd_pmem]
> [    9.988909]  bdev_read_page+0x81/0xb0
> [    9.988911]  do_mpage_readpage+0x56f/0x770
> [    9.988912]  ? I_BDEV+0x20/0x20
> [    9.988915]  ? lru_cache_add+0xe/0x10
> [    9.988917]  mpage_readpages+0x148/0x1e0
> [    9.988917]  ? I_BDEV+0x20/0x20
> [    9.988918]  ? I_BDEV+0x20/0x20
> [    9.988921]  ? alloc_pages_current+0x88/0x120
> [    9.988923]  blkdev_readpages+0x1d/0x20
> [    9.988924]  __do_page_cache_readahead+0x1ce/0x2c0
> [    9.988926]  force_page_cache_readahead+0xa2/0x100
> [    9.988927]  page_cache_sync_readahead+0x3f/0x50
> [    9.988930]  generic_file_read_iter+0x60d/0x8c0
> [    9.988931]  blkdev_read_iter+0x37/0x40
> [    9.988933]  __vfs_read+0xe0/0x150
> [    9.988934]  vfs_read+0x8c/0x130
> [    9.988936]  SyS_read+0x55/0xc0
> [    9.988939]  entry_SYSCALL_64_fastpath+0x1a/0xa9
> [    9.988940] RIP: 0033:0x7f1ee0822480
> [    9.988941] RSP: 002b:00007ffcf9e741f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> [    9.988942] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1ee0822480
> [    9.988943] RDX: 0000000000000040 RSI: 0000561b7e1aabc8 RDI: 0000000000000008
> [    9.988943] RBP: 0000561b7e1a86a0 R08: 0000000000000005 R09: 0000000000000068
> [    9.988944] R10: 00007ffcf9e73f80 R11: 0000000000000246 R12: 0000000000000000
> [    9.988945] R13: 0000000000000001 R14: 0000561b7e1a61b0 R15: 0000561b7e1a55e0
> [    9.988946] Code: ff 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
> [    9.988962] RIP: memcpy_erms+0x6/0x10 RSP: ffffba92c783f9b8
> [    9.988962] CR2: ffff9387bfff0000
> [    9.989022] ---[ end trace fe34c0fc0fe685ab ]---
> [    9.998690] Kernel panic - not syncing: Fatal exception
> [   10.004708] Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
> Reported-by: Jeff Moyer <jmoyer@redhat.com>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
> Cc: Jinbum Park <jinb.park7@gmail.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Yinghai Lu <yinghai@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Young <dyoung@redhat.com>
> ---
>  arch/x86/mm/init_64.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 15173d3..dbf4f00 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
>   */
>  void sync_global_pgds(unsigned long start, unsigned long end)
>  {
> -       unsigned long address;
> +       unsigned long address, address_next;
>
> -       for (address = start; address <= end; address += PGDIR_SIZE) {
> +       for (address = start; address <= end; address = address_next) {
>                 const pgd_t *pgd_ref = pgd_offset_k(address);
>                 struct page *page;
>
> +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> +
>                 if (pgd_none(*pgd_ref))
>                         continue;
>

This one is better than V2.

It would be better if you could rename address to addr, as Ingo suggested.
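
A minimal userspace sketch of the stepping difference, not kernel code.
It assumes 4-level paging (PGDIR_SHIFT of 39, so PGDIR_SIZE is 512G);
visits_old(), visits_new() and GiB() are hypothetical names used only
for illustration. Each function counts how many pgd entries its loop
form visits.

```c
#include <stdint.h>

/* Assumption: 4-level paging, one pgd entry covers 512G. */
#define PGDIR_SHIFT 39
#define PGDIR_SIZE  (UINT64_C(1) << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Hypothetical helper: x gibibytes as a byte count. */
#define GiB(x) ((uint64_t)(x) << 30)

/* Old stepping: always advance by a full PGDIR_SIZE from 'address'. */
static unsigned int visits_old(uint64_t start, uint64_t end)
{
	unsigned int n = 0;
	uint64_t address;

	for (address = start; address <= end; address += PGDIR_SIZE)
		n++;
	return n;
}

/* Patched stepping: advance to the start of the next pgd entry. */
static unsigned int visits_new(uint64_t start, uint64_t end)
{
	unsigned int n = 0;
	uint64_t address, address_next;

	for (address = start; address <= end; address = address_next) {
		address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
		n++;
	}
	return n;
}
```

For a 192G region that starts 400G into a pgd entry, i.e. [400G, 592G),
visits_old(GiB(400), GiB(592) - 1) returns 1 and
visits_new(GiB(400), GiB(592) - 1) returns 2: the old step jumps from
400G to 912G and never looks at the second pgd entry.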

Acked-by: Yinghai Lu <yinghai@kernel.org>


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-01 22:37 ` Yinghai Lu
@ 2017-05-02  7:18   ` Baoquan He
  2017-05-02  7:24     ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-05-02  7:18 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Linux Kernel Mailing List, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Kees Cook,
	Thomas Garnier, Andrew Morton, Yasuaki Ishimatsu, Jinbum Park,
	Dave Hansen, Kirill A. Shutemov, Dan Williams, Dave Young

On 05/01/17 at 03:37pm, Yinghai Lu wrote:
> On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> >  arch/x86/mm/init_64.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index 15173d3..dbf4f00 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
> >   */
> >  void sync_global_pgds(unsigned long start, unsigned long end)
> >  {
> > -       unsigned long address;
> > +       unsigned long address, address_next;
> >
> > -       for (address = start; address <= end; address += PGDIR_SIZE) {
> > +       for (address = start; address <= end; address = address_next) {
> >                 const pgd_t *pgd_ref = pgd_offset_k(address);
> >                 struct page *page;
> >
> > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> > +
> >                 if (pgd_none(*pgd_ref))
> >                         continue;
> >
> 
> This one is better than V2.
> 
> It would be better if you could rename address to addr, as Ingo suggested.

Thanks for checking and for the suggestion, Yinghai.

Both v1 and v2 are fine with me. As you said, the code in v1 is easier
to understand, while the v2 code is more compact, with fewer lines. No
line in v1 exceeds 80 columns. Maybe Ingo can pick whichever he prefers.

Thanks
Baoquan


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-02  7:18   ` Baoquan He
@ 2017-05-02  7:24     ` Ingo Molnar
  2017-05-02  7:40       ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2017-05-02  7:24 UTC (permalink / raw)
  To: Baoquan He
  Cc: Yinghai Lu, Linux Kernel Mailing List, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Kees Cook,
	Thomas Garnier, Andrew Morton, Yasuaki Ishimatsu, Jinbum Park,
	Dave Hansen, Kirill A. Shutemov, Dan Williams, Dave Young


* Baoquan He <bhe@redhat.com> wrote:

> On 05/01/17 at 03:37pm, Yinghai Lu wrote:
> > On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> > >  arch/x86/mm/init_64.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > > index 15173d3..dbf4f00 100644
> > > --- a/arch/x86/mm/init_64.c
> > > +++ b/arch/x86/mm/init_64.c
> > > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
> > >   */
> > >  void sync_global_pgds(unsigned long start, unsigned long end)
> > >  {
> > > -       unsigned long address;
> > > +       unsigned long address, address_next;
> > >
> > > -       for (address = start; address <= end; address += PGDIR_SIZE) {
> > > +       for (address = start; address <= end; address = address_next) {
> > >                 const pgd_t *pgd_ref = pgd_offset_k(address);
> > >                 struct page *page;
> > >
> > > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> > > +
> > >                 if (pgd_none(*pgd_ref))
> > >                         continue;
> > >
> > 
> > This one is better than V2.
> > 
> > It would be better if you could rename address to addr, as Ingo suggested.
> 
> Thanks for checking and for the suggestion, Yinghai.
> 
> Both v1 and v2 are fine with me. As you said, the code in v1 is easier
> to understand, while the v2 code is more compact, with fewer lines. No
> line in v1 exceeds 80 columns. Maybe Ingo can pick whichever he prefers.

Let's do the variant I suggested - that makes the loop self-contained ('continue' 
would work as-is, etc.) and makes the code all around more readable.
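
As an illustration only: the variant referred to above is not quoted in
this message, so the step expression below is an assumption, not
necessarily the one meant. A loop whose advance lives entirely in the
for header can be sketched in userspace like this. ALIGN_UP(), GiB(),
count_present() and the present[] array are hypothetical stand-ins, and
4-level paging is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4-level paging, one pgd entry covers 512G. */
#define PGDIR_SHIFT 39
#define PGDIR_SIZE  (UINT64_C(1) << PGDIR_SHIFT)

/* Hypothetical helpers: round up to a power-of-two boundary; x GiB in bytes. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))
#define GiB(x) ((uint64_t)(x) << 30)

/*
 * Count the populated pgd slots touched by [start, end]. present[i]
 * stands in for !pgd_none() on slot i. The next address is computed in
 * the loop header, so 'continue' needs no extra bookkeeping.
 */
static unsigned int count_present(uint64_t start, uint64_t end,
				  const bool *present, size_t nslots)
{
	unsigned int synced = 0;
	uint64_t addr;

	for (addr = start; addr <= end; addr = ALIGN_UP(addr + 1, PGDIR_SIZE)) {
		size_t idx = (size_t)(addr >> PGDIR_SHIFT);

		if (idx >= nslots || !present[idx])
			continue;	/* empty slot: header still advances addr */

		synced++;
	}
	return synced;
}
```

With present[] = { false, true, true } and the range [400G, 1100G], the
loop visits slots 0, 1 and 2, skips the empty slot 0 via 'continue', and
returns 2.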

Thanks,

	Ingo


* Re: [PATCH] x86/mm: Fix incorrect for loop count calculation in sync_global_pgds
  2017-05-02  7:24     ` Ingo Molnar
@ 2017-05-02  7:40       ` Baoquan He
  0 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-05-02  7:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Yinghai Lu, Linux Kernel Mailing List, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Kees Cook,
	Thomas Garnier, Andrew Morton, Yasuaki Ishimatsu, Jinbum Park,
	Dave Hansen, Kirill A. Shutemov, Dan Williams, Dave Young

On 05/02/17 at 09:24am, Ingo Molnar wrote:
> 
> * Baoquan He <bhe@redhat.com> wrote:
> 
> > On 05/01/17 at 03:37pm, Yinghai Lu wrote:
> > > On Mon, May 1, 2017 at 4:41 AM, Baoquan He <bhe@redhat.com> wrote:
> > > >  arch/x86/mm/init_64.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > > > index 15173d3..dbf4f00 100644
> > > > --- a/arch/x86/mm/init_64.c
> > > > +++ b/arch/x86/mm/init_64.c
> > > > @@ -94,12 +94,14 @@ __setup("noexec32=", nonx32_setup);
> > > >   */
> > > >  void sync_global_pgds(unsigned long start, unsigned long end)
> > > >  {
> > > > -       unsigned long address;
> > > > +       unsigned long address, address_next;
> > > >
> > > > -       for (address = start; address <= end; address += PGDIR_SIZE) {
> > > > +       for (address = start; address <= end; address = address_next) {
> > > >                 const pgd_t *pgd_ref = pgd_offset_k(address);
> > > >                 struct page *page;
> > > >
> > > > +               address_next = (address & PGDIR_MASK) + PGDIR_SIZE;
> > > > +
> > > >                 if (pgd_none(*pgd_ref))
> > > >                         continue;
> > > >
> > > 
> > > This one is better than V2.
> > > 
> > > It would be better if you could rename address to addr, as Ingo suggested.
> > 
> > Thanks for checking and for the suggestion, Yinghai.
> > 
> > Both v1 and v2 are fine with me. As you said, the code in v1 is easier
> > to understand, while the v2 code is more compact, with fewer lines. No
> > line in v1 exceeds 80 columns. Maybe Ingo can pick whichever he prefers.
> 
> Let's do the variant I suggested - that makes the loop self-contained ('continue' 
> would work as-is, etc.) and makes the code all around more readable.

Sure, I'll change it as you suggested and post the new version once
testing passes.

Thanks a lot!

