Linux-NVDIMM Archive on lore.kernel.org
* [QUESTION] Error on initializing dax by using different struct page size
@ 2019-11-07 15:29 Won-Kyo Choe
  2019-11-07 15:54 ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Won-Kyo Choe @ 2019-11-07 15:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: dan.j.williams

Hi there. I'm using Optane DC memory as volatile memory. Recently,
I found that if sizeof(struct page) is above 64 bytes (e.g. 128 bytes),
`device_dax` cannot be initialized when the system boots. I am aware
that, for some reason, there is a function, `__mm_zero_struct_page`,
which limits the size of struct page to at most 80 bytes. However, for
research purposes, I do not use that constraint, and I'm quite certain
that a different struct page size works in main memory. So I'm
wondering why this does not work in persistent memory and which
patches are related to this problem.

I have attached the system log for clarification. The test was run on
linux-5.3.9 and linux-5.3-rc5.

[   23.493230] WARNING: CPU: 23 PID: 890 at arch/x86/mm/init_64.c:852 add_pages+0x5d/0x70
[   23.493231] Modules linked in: device_dax(+) nd_pmem dax_pmem nd_btt dax_pmem_core skx_edac(+) rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nls_iso8859_1 irqbypass intel_cstate intel_rapl_perf input_leds joydev ioatdma mei_me lpc_ich mei dca ipmi_si ipmi_devintf ipmi_msghandler nfit acpi_power_meter acpi_pad mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul ghash_clmulni_intel syscopyarea sysfillrect aesni_intel sysimgblt fb_sys_fops aes_x86_64 crypto_simd drm i40e nvme cryptd ptp glue_helper nvme_core ahci pps_core libahci wmi
[   23.493271] CPU: 23 PID: 890 Comm: systemd-udevd Not tainted 5.3.9 #1
[   23.493272] Hardware name: Supermicro Super Server/X11DPH-T, BIOS 3.1 05/22/2019
[   23.493275] RIP: 0010:add_pages+0x5d/0x70
[   23.493277] Code: 2e c0 01 76 20 48 89 15 69 2e c0 01 48 89 15 72 2e c0 01 48 c1 e2 0c 48 03 15 8f 5a 36 01 48 89 15 18 09 c0 01 5b 41 5c 5d c3 <0f> 0b eb ba 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
[   23.493278] RSP: 0018:ffff9d75c7d73a18 EFLAGS: 00010282
[   23.493280] RAX: 00000000fffffff4 RBX: 0000000000880000 RCX: 0000000000000000
[   23.493281] RDX: 0000000000000020 RSI: 0000000000000020 RDI: 0000000000000282
[   23.493282] RBP: ffff9d75c7d73a28 R08: ffff91c7c0200000 R09: 0000000000000000
[   23.493283] R10: ffff9d75c7d73830 R11: 0000000000000000 R12: 0000000003f00000
[   23.493283] R13: 0000000000000000 R14: ffff9d75c7d73a78 R15: 000000000000003e
[   23.493285] FS:  00007fb03c99b680(0000) GS:ffff920ebfbc0000(0000) knlGS:0000000000000000
[   23.493286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.493287] CR2: 000000c420be7010 CR3: 000000083fd82002 CR4: 00000000007606e0
[   23.493288] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   23.493289] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   23.493289] PKRU: 55555554
[   23.493290] Call Trace:
[   23.493295]  arch_add_memory+0x41/0x50
[   23.493300]  devm_memremap_pages+0x460/0x600
[   23.493304]  dev_dax_probe+0x6a/0x180 [device_dax]
[   23.493307]  really_probe+0xf5/0x3e0
[   23.493309]  driver_probe_device+0x11b/0x130
[   23.493311]  device_driver_attach+0x58/0x60
[   23.493312]  __driver_attach+0xa3/0x140
[   23.493314]  ? device_driver_attach+0x60/0x60
[   23.493315]  ? device_driver_attach+0x60/0x60
[   23.493320]  bus_for_each_dev+0x74/0xb0
[   23.493322]  ? kmem_cache_alloc_trace+0x1ff/0x210
[   23.493324]  driver_attach+0x1e/0x20
[   23.493325]  bus_add_driver+0x147/0x220
[   23.493327]  ? 0xffffffffc0e36000
[   23.493329]  driver_register+0x60/0x100
[   23.493330]  ? 0xffffffffc0e36000
[   23.493333]  __dax_driver_register+0x6c/0xa0
[   23.493336]  dax_init+0x23/0x1000 [device_dax]
[   23.493343]  do_one_initcall+0x4a/0x1fa
[   23.493347]  ? _cond_resched+0x19/0x40
[   23.493349]  ? kmem_cache_alloc_trace+0x3f/0x210
[   23.493352]  do_init_module+0x5f/0x227
[   23.493360]  load_module+0x244f/0x2c10
[   23.493365]  __do_sys_finit_module+0xfc/0x120
[   23.493367]  ? __do_sys_finit_module+0xfc/0x120
[   23.493370]  __x64_sys_finit_module+0x1a/0x20
[   23.493372]  do_syscall_64+0x5a/0x130
[   23.493377]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   23.493378] RIP: 0033:0x7fb03c4b1839
[   23.493380] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[   23.493381] RSP: 002b:00007ffe4422f458 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   23.493382] RAX: ffffffffffffffda RBX: 000055fcfc92b130 RCX: 00007fb03c4b1839
[   23.493383] RDX: 0000000000000000 RSI: 00007fb03c190145 RDI: 0000000000000007
[   23.493384] RBP: 00007fb03c190145 R08: 0000000000000000 R09: 00007ffe4422f570
[   23.493385] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000000
[   23.493386] R13: 000055fcfc91a830 R14: 0000000000020000 R15: 000055fcfc92b130
[   23.493388] ---[ end trace 0a14fa412f3d5c6d ]---
[   23.542919] device_dax: probe of dax0.0 failed with error -12

...

[   23.564220] [ffffdbb6a7680000-ffffdbb6a77fffff] potential offnode page_structs
[   23.564230] [ffffdbb6a7d80000-ffffdbb6a7dfffff] potential offnode page_structs
[   23.564235] [ffffdbb6a8100000-ffffdbb6a81fffff] potential offnode page_structs
[   23.564240] [ffffdbb6a8480000-ffffdbb6a85fffff] potential offnode page_structs
[   23.598937] device_dax: probe of dax1.0 failed with error -12
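
For reference, the -12 in the two "probe ... failed with error" lines is a
negated errno value, and on Linux errno 12 is ENOMEM. A minimal user-space
sketch for decoding such codes follows; the helper name
describe_probe_error is hypothetical and not part of any kernel API, and
the sketch says nothing about why the allocation failed.

```c
#include <assert.h>   /* static_assert (C11) */
#include <errno.h>    /* ENOMEM */
#include <string.h>   /* strerror */

/* On Linux, ENOMEM has the value 12, matching "failed with error -12". */
static_assert(ENOMEM == 12, "this sketch assumes the Linux errno numbering");

/*
 * Hypothetical helper: turn a negative kernel-style error code, as printed
 * in the log above, into the C library's description of that errno.
 */
const char *describe_probe_error(int err)
{
    return strerror(-err);
}
```

Calling describe_probe_error(-12) returns the same string as
strerror(ENOMEM).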

Thanks,
Won-Kyo Choe
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [QUESTION] Error on initializing dax by using different struct page size
  2019-11-07 15:29 [QUESTION] Error on initializing dax by using different struct page size Won-Kyo Choe
@ 2019-11-07 15:54 ` Dan Williams
  2019-11-07 19:00   ` Won-Kyo Choe
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2019-11-07 15:54 UTC (permalink / raw)
  To: Won-Kyo Choe; +Cc: linux-nvdimm

On Thu, Nov 7, 2019 at 7:30 AM Won-Kyo Choe <wkyo.choe@gmail.com> wrote:
>
> Hi there. I'm using Optane DC memory as volatile memory. Recently,
> I found that if sizeof(struct page) is above 64 bytes (e.g. 128 bytes),
> `device_dax` cannot be initialized when the system boots. I am aware
> that, for some reason, there is a function, `__mm_zero_struct_page`,
> which limits the size of struct page to at most 80 bytes. However, for
> research purposes, I do not use that constraint, and I'm quite certain
> that a different struct page size works in main memory. So I'm
> wondering why this does not work in persistent memory and which
> patches are related to this problem.
>
> I have attached the system log for clarification. The test was run on
> linux-5.3.9 and linux-5.3-rc5.

How did you manage to build the kernel with a 128byte struct page
size? This build assert in drivers/nvdimm/pfn_devs.c

                BUILD_BUG_ON(sizeof(struct page) > MAX_STRUCT_PAGE_SIZE);

...will start to trigger in v5.4 to explicitly prevent this going
forward. See commit e96f0bf2ec92 "libnvdimm/pfn_dev: Add a build check
to make sure we notice when struct page size change" for more details.
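
As a rough illustration of how a build-time size check of this kind
behaves, here is a user-space sketch using C11 static_assert in place of
the kernel's BUILD_BUG_ON. Everything named demo_* or DEMO_* is
hypothetical: struct demo_page is a stand-in and not the kernel's struct
page, and the limit of 64 is an assumption chosen to match the 64-byte
figure mentioned in this thread, not a value taken from the kernel source.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Assumed limit for this sketch only; the kernel defines its own constant. */
#define DEMO_MAX_STRUCT_SIZE 64

/* Stand-in for per-page metadata; NOT the kernel's struct page layout. */
struct demo_page {
    unsigned long flags;
    void *mapping;
    unsigned long index;
    unsigned long private_data;
    int refcount;
    int mapcount;
};

/*
 * The build stops here, with the message below, as soon as someone grows
 * the structure past the limit. Nothing is checked at run time.
 */
static_assert(sizeof(struct demo_page) <= DEMO_MAX_STRUCT_SIZE,
              "struct demo_page exceeds DEMO_MAX_STRUCT_SIZE");

/* Bytes that can still be added before the build check starts failing. */
size_t demo_page_headroom(void)
{
    return DEMO_MAX_STRUCT_SIZE - sizeof(struct demo_page);
}
```

Adding a large field to struct demo_page (for example a 64-byte array)
makes the translation unit fail to compile, which is the effect the
BUILD_BUG_ON above is meant to have.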

In general, 64 bytes per page is already expensive; 128 bytes is a
gigantic struct page.
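
To put rough numbers on that cost, the sketch below computes how much
memory the struct page array itself would need. The 4 KiB page size and
the 512 GiB capacity in the usage note are assumptions for illustration,
not figures from this thread, and memmap_bytes is a hypothetical helper
name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of per-page metadata needed to describe `capacity` bytes of
 * memory, given one metadata structure of `struct_page_size` bytes per
 * page of `page_size` bytes. Any partial trailing page is ignored.
 */
uint64_t memmap_bytes(uint64_t capacity, uint64_t page_size,
                      uint64_t struct_page_size)
{
    assert(page_size > 0);
    return (capacity / page_size) * struct_page_size;
}
```

With 4 KiB pages, 64 bytes per page is about 1.56% of capacity and 128
bytes is about 3.13%. For an assumed 512 GiB of memory,
memmap_bytes(512ULL << 30, 4096, 64) is 8 GiB, and the same call with 128
is 16 GiB.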

* Re: [QUESTION] Error on initializing dax by using different struct page size
  2019-11-07 15:54 ` Dan Williams
@ 2019-11-07 19:00   ` Won-Kyo Choe
  2019-11-07 19:24     ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Won-Kyo Choe @ 2019-11-07 19:00 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On Thu, Nov 07, 2019 at 07:54:21AM -0800, Dan Williams wrote:
> On Thu, Nov 7, 2019 at 7:30 AM Won-Kyo Choe <wkyo.choe@gmail.com> wrote:
> >
> > Hi there. I'm using Optane DC memory as volatile memory. Recently,
> > I found that if sizeof(struct page) is above 64 bytes (e.g. 128 bytes),
> > `device_dax` cannot be initialized when the system boots. I am aware
> > that, for some reason, there is a function, `__mm_zero_struct_page`,
> > which limits the size of struct page to at most 80 bytes. However, for
> > research purposes, I do not use that constraint, and I'm quite certain
> > that a different struct page size works in main memory. So I'm
> > wondering why this does not work in persistent memory and which
> > patches are related to this problem.
> >
> > I have attached the system log for clarification. The test was run on
> > linux-5.3.9 and linux-5.3-rc5.
> 
> How did you manage to build the kernel with a 128byte struct page
> size? This build assert in drivers/nvdimm/pfn_devs.c
> 
>                 BUILD_BUG_ON(sizeof(struct page) > MAX_STRUCT_PAGE_SIZE);
> 
> ...will start to trigger in v5.4 to explicitly prevent this going
> forward. See commit e96f0bf2ec92 "libnvdimm/pfn_dev: Add a build check
> to make sure we notice when struct page size change" for more details.
> 
Thanks for pointing me to the related commit. The kernel that I am using
(5.3.9 / 5.3-rc5) does not have that assert, so I was able to build it by
slightly modifying these lines in include/linux/mm.h

        BUILD_BUG_ON(sizeof(struct page) > 80);
        ...

, which is quite similar to the assert you referred to.

> In general, 64 bytes per page is already expensive; 128 bytes is a
> gigantic struct page.
Yes, I am aware of that issue. I just wanted to add a hot-page tracking
feature by embedding a data structure that collects that information
inside struct page, but the size matters. I should find another way to
get that stat :)

(Sorry, I should've put cc on this mail)

Thanks,

Won-Kyo

* Re: [QUESTION] Error on initializing dax by using different struct page size
  2019-11-07 19:00   ` Won-Kyo Choe
@ 2019-11-07 19:24     ` Dan Williams
  2019-11-08  4:26       ` Won-Kyo Choe
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2019-11-07 19:24 UTC (permalink / raw)
  To: Won-Kyo Choe; +Cc: linux-nvdimm

On Thu, Nov 7, 2019 at 11:00 AM Won-Kyo Choe <wkyo.choe@gmail.com> wrote:
>
> On Thu, Nov 07, 2019 at 07:54:21AM -0800, Dan Williams wrote:
> > On Thu, Nov 7, 2019 at 7:30 AM Won-Kyo Choe <wkyo.choe@gmail.com> wrote:
> > >
> > > Hi there. I'm using Optane DC memory as volatile memory. Recently,
> > > I found that if sizeof(struct page) is above 64 bytes (e.g. 128 bytes),
> > > `device_dax` cannot be initialized when the system boots. I am aware
> > > that, for some reason, there is a function, `__mm_zero_struct_page`,
> > > which limits the size of struct page to at most 80 bytes. However, for
> > > research purposes, I do not use that constraint, and I'm quite certain
> > > that a different struct page size works in main memory. So I'm
> > > wondering why this does not work in persistent memory and which
> > > patches are related to this problem.
> > >
> > > I have attached the system log for clarification. The test was run on
> > > linux-5.3.9 and linux-5.3-rc5.
> >
> > How did you manage to build the kernel with a 128byte struct page
> > size? This build assert in drivers/nvdimm/pfn_devs.c
> >
> >                 BUILD_BUG_ON(sizeof(struct page) > MAX_STRUCT_PAGE_SIZE);
> >
> > ...will start to trigger in v5.4 to explicitly prevent this going
> > forward. See commit e96f0bf2ec92 "libnvdimm/pfn_dev: Add a build check
> > to make sure we notice when struct page size change" for more details.
> >
> Thanks for pointing me to the related commit. The kernel that I am using
> (5.3.9 / 5.3-rc5) does not have that assert, so I was able to build it by
> slightly modifying these lines in include/linux/mm.h
>
>         BUILD_BUG_ON(sizeof(struct page) > 80);
>         ...
>
> , which is quite similar to the assert you referred to.
>
> > In general, 64 bytes per page is already expensive; 128 bytes is a
> > gigantic struct page.
> Yes, I am aware of that issue. I just wanted to add a hot-page tracking
> feature by embedding a data structure that collects that information
> inside struct page, but the size matters. I should find another way to
> get that stat :)

For hot page tracking you may want to look at some of the discussion
around Memory Hierarchy support:

https://lore.kernel.org/linux-mm/c3d6de4d-f7c3-b505-2e64-8ee5f70b2118@intel.com/

* Re: [QUESTION] Error on initializing dax by using different struct page size
  2019-11-07 19:24     ` Dan Williams
@ 2019-11-08  4:26       ` Won-Kyo Choe
  0 siblings, 0 replies; 5+ messages in thread
From: Won-Kyo Choe @ 2019-11-08  4:26 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

Thanks for the information. This is actually what I've been looking for!

Regards,
Won-Kyo