From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: "Elliott, Robert (Persistent Memory)" <elliott@hpe.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"aarcange@redhat.com" <aarcange@redhat.com>,
	"aaron.lu@intel.com" <aaron.lu@intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"bsd@redhat.com" <bsd@redhat.com>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"jgg@mellanox.com" <jgg@mellanox.com>,
	"jwadams@google.com" <jwadams@google.com>,
	"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"mike.kravetz@oracle.com" <mike.kravetz@oracle.com>,
	"Pavel.Tatashin@microsoft.com" <Pavel.Tatashin@microsoft.com>,
	"prasad.singamsetty@oracle.com" <prasad.singamsetty@oracle.com>,
	"rdunlap@infradead.org" <rdunlap@infradead.org>,
	"steven.sistare@oracle.com" <steven.sistare@oracle.com>,
	"tim.c.chen@intel.com" <tim.c.chen@intel.com>,
	"tj@kernel.org" <tj@kernel.org>,
	"vbabka@suse.cz" <vbabka@suse.cz>
Subject: Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node
Date: Mon, 12 Nov 2018 08:54:12 -0800
Message-ID: <20181112165412.vizeiv6oimsuxkbk@ca-dmjordan1.us.oracle.com>
In-Reply-To: <AT5PR8401MB1169798EBEF1EE5EBA3ABFFFABC70@AT5PR8401MB1169.NAMPRD84.PROD.OUTLOOK.COM>

On Sat, Nov 10, 2018 at 03:48:14AM +0000, Elliott, Robert (Persistent Memory) wrote:
> > -----Original Message-----
> > From: linux-kernel-owner@vger.kernel.org <linux-kernel-
> > owner@vger.kernel.org> On Behalf Of Daniel Jordan
> > Sent: Monday, November 05, 2018 10:56 AM
> > Subject: [RFC PATCH v4 11/13] mm: parallelize deferred struct page
> > initialization within each node
> >
> > ... The kernel doesn't
> > know the memory bandwidth of a given system to get the most efficient
> > number of threads, so there's some guesswork involved.
>
> The ACPI HMAT (Heterogeneous Memory Attribute Table) is designed to report
> that kind of information, and could facilitate automatic tuning.
>
> There was discussion last year about kernel support for it:
> https://lore.kernel.org/lkml/20171214021019.13579-1-ross.zwisler@linux.intel.com/

Thanks for bringing this up. I'm traveling but will take a closer look when I
get back.

> > In testing, a reasonable value turned out to be about a quarter of the
> > CPUs on the node.
> ...
> > +	/*
> > +	 * We'd like to know the memory bandwidth of the chip to
> > calculate the
> > +	 * most efficient number of threads to start, but we can't.
> > +	 * In testing, a good value for a variety of systems was a
> > quarter of the CPUs on the node.
> > +	 */
> > +	nr_node_cpus = DIV_ROUND_UP(cpumask_weight(cpumask), 4);
>
>
> You might want to base that calculation on and limit the threads to
> physical cores, not hyperthreaded cores.

Why? Hyperthreads can be beneficial when waiting on memory. That said, I don't
have data that shows that in this case.
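
For readers comparing the two sizing rules discussed above, here is a minimal userspace sketch, not code from the patch. It reimplements the rounding-up division that DIV_ROUND_UP performs and contrasts the patch's heuristic (a quarter of the node's logical CPUs) with the suggested variant that counts physical cores first. The function names and the smt_per_core parameter are hypothetical, and the sketch assumes every core on the node has the same number of hardware threads.

```c
/* Rounding-up integer division, as the kernel's DIV_ROUND_UP macro does. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Heuristic from the quoted patch: start a quarter of the node's
 * logical CPUs, rounded up. The function name is hypothetical.
 */
unsigned int threads_from_logical_cpus(unsigned int nr_logical_cpus)
{
	return DIV_ROUND_UP(nr_logical_cpus, 4);
}

/*
 * Variant suggested in the review: size the pool from physical cores.
 * smt_per_core (hardware threads per core) is an assumed input and is
 * taken to be uniform across the node; 0 is treated as 1.
 */
unsigned int threads_from_physical_cores(unsigned int nr_logical_cpus,
					 unsigned int smt_per_core)
{
	unsigned int nr_cores;

	if (smt_per_core == 0)
		smt_per_core = 1;

	nr_cores = DIV_ROUND_UP(nr_logical_cpus, smt_per_core);
	return DIV_ROUND_UP(nr_cores, 4);
}
```

On a node with 48 logical CPUs and two hardware threads per core, the first rule gives 12 threads and the second gives 6. Which is faster is the open question in this exchange; the sketch only shows how far apart the two counts are.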