[PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization
@ 2024-02-13 11:13 Gang Li
  2024-02-13 11:13 ` [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP Gang Li
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Gang Li @ 2024-02-13 11:13 UTC (permalink / raw)
  To: Andrew Morton, Steffen Klassert, Daniel Jordan, Jane Chu,
	Paul E . McKenney, Muchun Song, Gang Li, Randy Dunlap,
	kernel test robot, Gang Li
  Cc: linux-kernel, linux-mm

This series includes two improvements: fixing the PADATA Kconfig warning
and a potential bug in gather_bootmem_prealloc_parallel. Please refer to
the specific commit message for details.

For Andrew:
If you want me to include these two fixes into the previous series[1], I
would be happy to send v6. Otherwise, you can directly apply these two
patches.

[1]. https://lore.kernel.org/lkml/20240126152411.1238072-1-gang.li@linux.dev/

Gang Li (2):
  padata: downgrade padata_do_multithreaded to serial execution for
    non-SMP
  hugetlb: process multiple lists in gather_bootmem_prealloc_parallel

 fs/Kconfig             |  2 +-
 include/linux/padata.h | 13 +++++++++----
 mm/hugetlb.c           | 15 +++++++++++----
 3 files changed, 21 insertions(+), 9 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP
  2024-02-13 11:13 [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Gang Li
@ 2024-02-13 11:13 ` Gang Li
  2024-02-13 14:52   ` Muchun Song
  2024-02-13 11:13 ` [PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel Gang Li
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Gang Li @ 2024-02-13 11:13 UTC (permalink / raw)
  To: Andrew Morton, Steffen Klassert, Daniel Jordan, Jane Chu,
	Paul E . McKenney, Muchun Song, Gang Li, Randy Dunlap,
	kernel test robot, Gang Li
  Cc: linux-kernel, linux-mm

Randy Dunlap and kernel test robot reported a warning:

```
WARNING: unmet direct dependencies detected for PADATA
  Depends on [n]: SMP [=n]
  Selected by [y]:
  - HUGETLBFS [=y] && (X86 [=y] || SPARC64 || ARCH_SUPPORTS_HUGETLBFS [=n] || BROKEN [=n]) && (SYSFS [=y] || SYSCTL [=n])
```

hugetlb parallelization depends on PADATA, and PADATA depends on SMP.

PADATA consists of two distinct functionality: One part is
padata_do_multithreaded which disregards order and simply divides
tasks into several groups for parallel execution. Hugetlb
init parallelization depends on padata_do_multithreaded.

The other part is composed of a set of APIs that, while handling data in
an out-of-order parallel manner, can eventually return the data with
ordered sequence. Currently Only `crypto/pcrypt.c` use them.

All users of PADATA of non-SMP case currently only use
padata_do_multithreaded. It is easy to implement a serial one in
include/linux/padata.h. And it is not necessary to implement another
functionality unless the only user of crypto/pcrypt.c does not depend on
SMP in the future.

Fixes: a2cefb08be66 ("hugetlb: have CONFIG_HUGETLBFS select CONFIG_PADATA")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/lkml/ec5dc528-2c3c-4444-9e88-d2c48395b433@infradead.org/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402020454.6EPkP1hi-lkp@intel.com/
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
---
 fs/Kconfig             |  2 +-
 include/linux/padata.h | 13 +++++++++----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 4a51331f172e5..7963939592d70 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -261,7 +261,7 @@ menuconfig HUGETLBFS
 	depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
 	depends on (SYSFS || SYSCTL)
 	select MEMFD_CREATE
-	select PADATA
+	select PADATA if SMP
 	help
 	  hugetlbfs is a filesystem backing for HugeTLB pages, based on
 	  ramfs. For architectures that support it, say Y here and read
diff --git a/include/linux/padata.h b/include/linux/padata.h
index 8f418711351bc..7b84eb7d73e7f 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -180,10 +180,6 @@ struct padata_instance {
 
 #ifdef CONFIG_PADATA
 extern void __init padata_init(void);
-#else
-static inline void __init padata_init(void) {}
-#endif
-
 extern struct padata_instance *padata_alloc(const char *name);
 extern void padata_free(struct padata_instance *pinst);
 extern struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);
@@ -194,4 +190,13 @@ extern void padata_do_serial(struct padata_priv *padata);
 extern void __init padata_do_multithreaded(struct padata_mt_job *job);
 extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
 			      cpumask_var_t cpumask);
+#else
+static inline void __init padata_init(void) {}
+static inline void __init padata_do_multithreaded(struct padata_mt_job *job)
+{
+	if (job->size)
+		job->thread_fn(job->start, job->start + job->size, job->fn_arg);
+}
+#endif
+
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel
  2024-02-13 11:13 [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Gang Li
  2024-02-13 11:13 ` [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP Gang Li
@ 2024-02-13 11:13 ` Gang Li
  2024-02-13 14:54   ` Muchun Song
  2024-02-13 14:16 ` [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Paul E. McKenney
  2024-02-19  3:03 ` Gang Li
  3 siblings, 1 reply; 11+ messages in thread
From: Gang Li @ 2024-02-13 11:13 UTC (permalink / raw)
  To: Andrew Morton, Steffen Klassert, Daniel Jordan, Jane Chu,
	Paul E . McKenney, Muchun Song, Gang Li, Randy Dunlap,
	kernel test robot, Gang Li
  Cc: linux-kernel, linux-mm

gather_bootmem_prealloc_node currently only process one list in
huge_boot_pages array. So gather_bootmem_prealloc expects
padata_do_multithreaded to run num_node_state(N_MEMORY) instances of
gather_bootmem_prealloc_node to process all lists in huge_boot_pages.

This works well in current padata_do_multithreaded implementation.
It guarantees that size/min_chunk <= thread num <= max_threads.

```
/* Ensure at least one thread when size < min_chunk. */
nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
nworks = min(nworks, job->max_threads);

ps.nworks      = padata_work_alloc_mt(nworks, &ps, &works);
```

However, the comment of padata_do_multithreaded API only promises a
maximum value for the number of threads and does not specify a
minimum value. Which may pass multiple nodes to
gather_bootmem_prealloc_node and only one node will be processed.

To avoid potential errors, introduce gather_bootmem_prealloc_parallel
to handle the case where the number of threads does not meet the
requirement of max_threads.

Fixes: 0306f03dcbd7 ("hugetlb: parallelize 1G hugetlb initialization")
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
---
 mm/hugetlb.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 25069ca6ec248..2799a7ea098c1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3414,10 +3414,8 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
  */
-static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned long end, void *arg)
-
+static void __init gather_bootmem_prealloc_node(unsigned long nid)
 {
-	int nid = start;
 	LIST_HEAD(folio_list);
 	struct huge_bootmem_page *m;
 	struct hstate *h = NULL, *prev_h = NULL;
@@ -3455,10 +3453,19 @@ static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned lo
 	prep_and_add_bootmem_folios(h, &folio_list);
 }
 
+static void __init gather_bootmem_prealloc_parallel(unsigned long start,
+						    unsigned long end, void *arg)
+{
+	int nid;
+
+	for (nid = start; nid < end; nid++)
+		gather_bootmem_prealloc_node(nid);
+}
+
 static void __init gather_bootmem_prealloc(void)
 {
 	struct padata_mt_job job = {
-		.thread_fn	= gather_bootmem_prealloc_node,
+		.thread_fn	= gather_bootmem_prealloc_parallel,
 		.fn_arg		= NULL,
 		.start		= 0,
 		.size		= num_node_state(N_MEMORY),
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization
  2024-02-13 11:13 [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Gang Li
  2024-02-13 11:13 ` [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP Gang Li
  2024-02-13 11:13 ` [PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel Gang Li
@ 2024-02-13 14:16 ` Paul E. McKenney
  2024-02-19  3:03 ` Gang Li
  3 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2024-02-13 14:16 UTC (permalink / raw)
  To: Gang Li
  Cc: Andrew Morton, Steffen Klassert, Daniel Jordan, Jane Chu,
	Muchun Song, Randy Dunlap, kernel test robot, Gang Li,
	linux-kernel, linux-mm

On Tue, Feb 13, 2024 at 07:13:45PM +0800, Gang Li wrote:
> This series includes two improvements: fixing the PADATA Kconfig warning
> and a potential bug in gather_bootmem_prealloc_parallel. Please refer to
> the specific commit message for details.
> 
> For Andrew:
> If you want me to include these two fixes into the previous series[1], I
> would be happy to send v6. Otherwise, you can directly apply these two
> patches.
> 
> [1]. https://lore.kernel.org/lkml/20240126152411.1238072-1-gang.li@linux.dev/

Thank you!

Tested-by: Paul E. McKenney <paulmck@kernel.org>

> Gang Li (2):
>   padata: downgrade padata_do_multithreaded to serial execution for
>     non-SMP
>   hugetlb: process multiple lists in gather_bootmem_prealloc_parallel
> 
>  fs/Kconfig             |  2 +-
>  include/linux/padata.h | 13 +++++++++----
>  mm/hugetlb.c           | 15 +++++++++++----
>  3 files changed, 21 insertions(+), 9 deletions(-)
> 
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP
  2024-02-13 11:13 ` [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP Gang Li
@ 2024-02-13 14:52   ` Muchun Song
  2024-02-13 15:15     ` Gang Li
  0 siblings, 1 reply; 11+ messages in thread
From: Muchun Song @ 2024-02-13 14:52 UTC (permalink / raw)
  To: Gang Li, Andrew Morton, Steffen Klassert, Daniel Jordan,
	Jane Chu, Paul E . McKenney, Randy Dunlap, kernel test robot,
	Gang Li
  Cc: linux-kernel, linux-mm



On 2024/2/13 19:13, Gang Li wrote:
> Randy Dunlap and kernel test robot reported a warning:
>
> ```
> WARNING: unmet direct dependencies detected for PADATA
>    Depends on [n]: SMP [=n]
>    Selected by [y]:
>    - HUGETLBFS [=y] && (X86 [=y] || SPARC64 || ARCH_SUPPORTS_HUGETLBFS [=n] || BROKEN [=n]) && (SYSFS [=y] || SYSCTL [=n])
> ```
>
> hugetlb parallelization depends on PADATA, and PADATA depends on SMP.
>
> PADATA consists of two distinct functionality: One part is
> padata_do_multithreaded which disregards order and simply divides
> tasks into several groups for parallel execution. Hugetlb
> init parallelization depends on padata_do_multithreaded.
>
> The other part is composed of a set of APIs that, while handling data in
> an out-of-order parallel manner, can eventually return the data with
> ordered sequence. Currently Only `crypto/pcrypt.c` use them.
>
> All users of PADATA of non-SMP case currently only use
> padata_do_multithreaded. It is easy to implement a serial one in
> include/linux/padata.h. And it is not necessary to implement another
> functionality unless the only user of crypto/pcrypt.c does not depend on
> SMP in the future.
>
> Fixes: a2cefb08be66 ("hugetlb: have CONFIG_HUGETLBFS select CONFIG_PADATA")
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Closes: https://lore.kernel.org/lkml/ec5dc528-2c3c-4444-9e88-d2c48395b433@infradead.org/
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202402020454.6EPkP1hi-lkp@intel.com/
> Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
> ---
>   fs/Kconfig             |  2 +-
>   include/linux/padata.h | 13 +++++++++----
>   2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 4a51331f172e5..7963939592d70 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -261,7 +261,7 @@ menuconfig HUGETLBFS
>   	depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
>   	depends on (SYSFS || SYSCTL)
>   	select MEMFD_CREATE
> -	select PADATA
> +	select PADATA if SMP

I'd like to drop this dependence since HugeTLB does not depend
on PADATA anymore. If some users take care about the kernel
image size, it also can disable PADATA individually.

>   	help
>   	  hugetlbfs is a filesystem backing for HugeTLB pages, based on
>   	  ramfs. For architectures that support it, say Y here and read
> diff --git a/include/linux/padata.h b/include/linux/padata.h
> index 8f418711351bc..7b84eb7d73e7f 100644
> --- a/include/linux/padata.h
> +++ b/include/linux/padata.h
> @@ -180,10 +180,6 @@ struct padata_instance {
>   
>   #ifdef CONFIG_PADATA
>   extern void __init padata_init(void);
> -#else
> -static inline void __init padata_init(void) {}
> -#endif
> -
>   extern struct padata_instance *padata_alloc(const char *name);
>   extern void padata_free(struct padata_instance *pinst);
>   extern struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);
> @@ -194,4 +190,13 @@ extern void padata_do_serial(struct padata_priv *padata);
>   extern void __init padata_do_multithreaded(struct padata_mt_job *job);
>   extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
>   			      cpumask_var_t cpumask);
> +#else
> +static inline void __init padata_init(void) {}
> +static inline void __init padata_do_multithreaded(struct padata_mt_job *job)
> +{
> +	if (job->size)

I think we could drop this check, at least now there is no users will
pass a zero of ->size to this function, and even if someone does in the
future, I think it is really a corner case, it is unnecessary to optimize
it and ->thread_fn is supporsed to handle case of zero size if it dose
pass a zero size.

Thanks.

> +		job->thread_fn(job->start, job->start + job->size, job->fn_arg);
> +}
> +#endif
> +
>   #endif


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel
  2024-02-13 11:13 ` [PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel Gang Li
@ 2024-02-13 14:54   ` Muchun Song
  0 siblings, 0 replies; 11+ messages in thread
From: Muchun Song @ 2024-02-13 14:54 UTC (permalink / raw)
  To: Gang Li
  Cc: Andrew Morton, Steffen Klassert, Daniel Jordan, Jane Chu,
	Paul E . McKenney, Randy Dunlap, kernel test robot, Gang Li,
	linux-kernel, linux-mm



> On Feb 13, 2024, at 19:13, Gang Li <gang.li@linux.dev> wrote:
> 
> gather_bootmem_prealloc_node currently only process one list in
> huge_boot_pages array. So gather_bootmem_prealloc expects
> padata_do_multithreaded to run num_node_state(N_MEMORY) instances of
> gather_bootmem_prealloc_node to process all lists in huge_boot_pages.
> 
> This works well in current padata_do_multithreaded implementation.
> It guarantees that size/min_chunk <= thread num <= max_threads.
> 
> ```
> /* Ensure at least one thread when size < min_chunk. */
> nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
> nworks = min(nworks, job->max_threads);
> 
> ps.nworks      = padata_work_alloc_mt(nworks, &ps, &works);
> ```
> 
> However, the comment of padata_do_multithreaded API only promises a
> maximum value for the number of threads and does not specify a
> minimum value. Which may pass multiple nodes to
> gather_bootmem_prealloc_node and only one node will be processed.
> 
> To avoid potential errors, introduce gather_bootmem_prealloc_parallel
> to handle the case where the number of threads does not meet the
> requirement of max_threads.
> 
> Fixes: 0306f03dcbd7 ("hugetlb: parallelize 1G hugetlb initialization")
> Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>

Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP
  2024-02-13 14:52   ` Muchun Song
@ 2024-02-13 15:15     ` Gang Li
  2024-02-15 10:49       ` Muchun Song
  0 siblings, 1 reply; 11+ messages in thread
From: Gang Li @ 2024-02-13 15:15 UTC (permalink / raw)
  To: Muchun Song, Andrew Morton, Steffen Klassert, Daniel Jordan,
	Jane Chu, Paul E . McKenney, Randy Dunlap, kernel test robot,
	Gang Li
  Cc: linux-kernel, linux-mm



On 2024/2/13 22:52, Muchun Song wrote:
> On 2024/2/13 19:13, Gang Li wrote:
>> Randy Dunlap and kernel test robot reported a warning:
>>
>> ```
>> WARNING: unmet direct dependencies detected for PADATA
>>    Depends on [n]: SMP [=n]
>>    Selected by [y]:
>>    - HUGETLBFS [=y] && (X86 [=y] || SPARC64 || ARCH_SUPPORTS_HUGETLBFS 
>> [=n] || BROKEN [=n]) && (SYSFS [=y] || SYSCTL [=n])
>> ```
>>
>> hugetlb parallelization depends on PADATA, and PADATA depends on SMP.
>>
>> PADATA consists of two distinct functionality: One part is
>> padata_do_multithreaded which disregards order and simply divides
>> tasks into several groups for parallel execution. Hugetlb
>> init parallelization depends on padata_do_multithreaded.
>>
>> The other part is composed of a set of APIs that, while handling data in
>> an out-of-order parallel manner, can eventually return the data with
>> ordered sequence. Currently Only `crypto/pcrypt.c` use them.
>>
>> All users of PADATA of non-SMP case currently only use
>> padata_do_multithreaded. It is easy to implement a serial one in
>> include/linux/padata.h. And it is not necessary to implement another
>> functionality unless the only user of crypto/pcrypt.c does not depend on
>> SMP in the future.
>>
>> Fixes: a2cefb08be66 ("hugetlb: have CONFIG_HUGETLBFS select 
>> CONFIG_PADATA")
>> Reported-by: Randy Dunlap <rdunlap@infradead.org>
>> Closes: 
>> https://lore.kernel.org/lkml/ec5dc528-2c3c-4444-9e88-d2c48395b433@infradead.org/
>> Reported-by: kernel test robot <lkp@intel.com>
>> Closes: 
>> https://lore.kernel.org/oe-kbuild-all/202402020454.6EPkP1hi-lkp@intel.com/
>> Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
>> ---
>>   fs/Kconfig             |  2 +-
>>   include/linux/padata.h | 13 +++++++++----
>>   2 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/Kconfig b/fs/Kconfig
>> index 4a51331f172e5..7963939592d70 100644
>> --- a/fs/Kconfig
>> +++ b/fs/Kconfig
>> @@ -261,7 +261,7 @@ menuconfig HUGETLBFS
>>       depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
>>       depends on (SYSFS || SYSCTL)
>>       select MEMFD_CREATE
>> -    select PADATA
>> +    select PADATA if SMP
> 
> I'd like to drop this dependence since HugeTLB does not depend
> on PADATA anymore. If some users take care about the kernel
> image size, it also can disable PADATA individually.
> 

Only CRYPTO_PCRYPT, HUGETLBFS and DEFERRED_STRUCT_PAGE_INIT select
PADATA. If drop this dependence, hugetlb init parallelization may not
work at all.

Maybe we can set PADATA enabled on default?

>>       help
>>         hugetlbfs is a filesystem backing for HugeTLB pages, based on
>>         ramfs. For architectures that support it, say Y here and read
>> diff --git a/include/linux/padata.h b/include/linux/padata.h
>> index 8f418711351bc..7b84eb7d73e7f 100644
>> --- a/include/linux/padata.h
>> +++ b/include/linux/padata.h
>> @@ -180,10 +180,6 @@ struct padata_instance {
>>   #ifdef CONFIG_PADATA
>>   extern void __init padata_init(void);
>> -#else
>> -static inline void __init padata_init(void) {}
>> -#endif
>> -
>>   extern struct padata_instance *padata_alloc(const char *name);
>>   extern void padata_free(struct padata_instance *pinst);
>>   extern struct padata_shell *padata_alloc_shell(struct 
>> padata_instance *pinst);
>> @@ -194,4 +190,13 @@ extern void padata_do_serial(struct padata_priv 
>> *padata);
>>   extern void __init padata_do_multithreaded(struct padata_mt_job *job);
>>   extern int padata_set_cpumask(struct padata_instance *pinst, int 
>> cpumask_type,
>>                     cpumask_var_t cpumask);
>> +#else
>> +static inline void __init padata_init(void) {}
>> +static inline void __init padata_do_multithreaded(struct 
>> padata_mt_job *job)
>> +{
>> +    if (job->size)
> 
> I think we could drop this check, at least now there is no users will
> pass a zero of ->size to this function, and even if someone does in the
> future, I think it is really a corner case, it is unnecessary to optimize
> it and ->thread_fn is supporsed to handle case of zero size if it dose
> pass a zero size.
> 
> Thanks.
> 
>> +        job->thread_fn(job->start, job->start + job->size, job->fn_arg);
>> +}
>> +#endif
>> +
>>   #endif
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP
  2024-02-13 15:15     ` Gang Li
@ 2024-02-15 10:49       ` Muchun Song
  0 siblings, 0 replies; 11+ messages in thread
From: Muchun Song @ 2024-02-15 10:49 UTC (permalink / raw)
  To: Gang Li
  Cc: Andrew Morton, Steffen Klassert, Daniel Jordan, Jane Chu,
	Paul E . McKenney, Randy Dunlap, kernel test robot, Gang Li,
	linux-kernel, linux-mm



> On Feb 13, 2024, at 23:15, Gang Li <gang.li@linux.dev> wrote:
> 
> 
> 
>> On 2024/2/13 22:52, Muchun Song wrote:
>>> On 2024/2/13 19:13, Gang Li wrote:
>>> Randy Dunlap and kernel test robot reported a warning:
>>> 
>>> ```
>>> WARNING: unmet direct dependencies detected for PADATA
>>>    Depends on [n]: SMP [=n]
>>>    Selected by [y]:
>>>    - HUGETLBFS [=y] && (X86 [=y] || SPARC64 || ARCH_SUPPORTS_HUGETLBFS [=n] || BROKEN [=n]) && (SYSFS [=y] || SYSCTL [=n])
>>> ```
>>> 
>>> hugetlb parallelization depends on PADATA, and PADATA depends on SMP.
>>> 
>>> PADATA consists of two distinct functionality: One part is
>>> padata_do_multithreaded which disregards order and simply divides
>>> tasks into several groups for parallel execution. Hugetlb
>>> init parallelization depends on padata_do_multithreaded.
>>> 
>>> The other part is composed of a set of APIs that, while handling data in
>>> an out-of-order parallel manner, can eventually return the data with
>>> ordered sequence. Currently Only `crypto/pcrypt.c` use them.
>>> 
>>> All users of PADATA of non-SMP case currently only use
>>> padata_do_multithreaded. It is easy to implement a serial one in
>>> include/linux/padata.h. And it is not necessary to implement another
>>> functionality unless the only user of crypto/pcrypt.c does not depend on
>>> SMP in the future.
>>> 
>>> Fixes: a2cefb08be66 ("hugetlb: have CONFIG_HUGETLBFS select CONFIG_PADATA")
>>> Reported-by: Randy Dunlap <rdunlap@infradead.org>
>>> Closes: https://lore.kernel.org/lkml/ec5dc528-2c3c-4444-9e88-d2c48395b433@infradead.org/
>>> Reported-by: kernel test robot <lkp@intel.com>
>>> Closes: https://lore.kernel.org/oe-kbuild-all/202402020454.6EPkP1hi-lkp@intel.com/
>>> Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
>>> ---
>>>   fs/Kconfig             |  2 +-
>>>   include/linux/padata.h | 13 +++++++++----
>>>   2 files changed, 10 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/fs/Kconfig b/fs/Kconfig
>>> index 4a51331f172e5..7963939592d70 100644
>>> --- a/fs/Kconfig
>>> +++ b/fs/Kconfig
>>> @@ -261,7 +261,7 @@ menuconfig HUGETLBFS
>>>       depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
>>>       depends on (SYSFS || SYSCTL)
>>>       select MEMFD_CREATE
>>> -    select PADATA
>>> +    select PADATA if SMP
>> I'd like to drop this dependence since HugeTLB does not depend
>> on PADATA anymore. If some users take care about the kernel
>> image size, it also can disable PADATA individually.
> 
> Only CRYPTO_PCRYPT, HUGETLBFS and DEFERRED_STRUCT_PAGE_INIT select
> PADATA. If drop this dependence, hugetlb init parallelization may not
> work at all.

Oh, right. In this case, maybe current choice is better.

> 
> Maybe we can set PADATA enabled on default?
> 
>>>       help
>>>         hugetlbfs is a filesystem backing for HugeTLB pages, based on
>>>         ramfs. For architectures that support it, say Y here and read
>>> diff --git a/include/linux/padata.h b/include/linux/padata.h
>>> index 8f418711351bc..7b84eb7d73e7f 100644
>>> --- a/include/linux/padata.h
>>> +++ b/include/linux/padata.h
>>> @@ -180,10 +180,6 @@ struct padata_instance {
>>>   #ifdef CONFIG_PADATA
>>>   extern void __init padata_init(void);
>>> -#else
>>> -static inline void __init padata_init(void) {}
>>> -#endif
>>> -
>>>   extern struct padata_instance *padata_alloc(const char *name);
>>>   extern void padata_free(struct padata_instance *pinst);
>>>   extern struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);
>>> @@ -194,4 +190,13 @@ extern void padata_do_serial(struct padata_priv *padata);
>>>   extern void __init padata_do_multithreaded(struct padata_mt_job *job);
>>>   extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
>>>                     cpumask_var_t cpumask);
>>> +#else
>>> +static inline void __init padata_init(void) {}
>>> +static inline void __init padata_do_multithreaded(struct padata_mt_job *job)
>>> +{
>>> +    if (job->size)
>> I think we could drop this check, at least now there is no users will
>> pass a zero of ->size to this function, and even if someone does in the
>> future, I think it is really a corner case, it is unnecessary to optimize
>> it and ->thread_fn is supporsed to handle case of zero size if it dose
>> pass a zero size.
>> Thanks.
>>> +        job->thread_fn(job->start, job->start + job->size, job->fn_arg);
>>> +}
>>> +#endif
>>> +
>>>   #endif

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization
  2024-02-13 11:13 [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Gang Li
                   ` (2 preceding siblings ...)
  2024-02-13 14:16 ` [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Paul E. McKenney
@ 2024-02-19  3:03 ` Gang Li
  2024-02-20  1:52   ` Andrew Morton
  3 siblings, 1 reply; 11+ messages in thread
From: Gang Li @ 2024-02-19  3:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm

On 2024/2/13 19:13, Gang Li wrote:
> This series includes two improvements: fixing the PADATA Kconfig warning
> and a potential bug in gather_bootmem_prealloc_parallel. Please refer to
> the specific commit message for details.
> 
> For Andrew:
> If you want me to include these two fixes into the previous series[1], I
> would be happy to send v6. Otherwise, you can directly apply these two
> patches.
> 
> [1]. https://lore.kernel.org/lkml/20240126152411.1238072-1-gang.li@linux.dev/

Hi Andrew,
A gentle ping here :).

Do you want to apply these two patches, or would you like me to
include them into the original patch and send out v6?

> 
> Gang Li (2):
>    padata: downgrade padata_do_multithreaded to serial execution for
>      non-SMP
>    hugetlb: process multiple lists in gather_bootmem_prealloc_parallel
> 
>   fs/Kconfig             |  2 +-
>   include/linux/padata.h | 13 +++++++++----
>   mm/hugetlb.c           | 15 +++++++++++----
>   3 files changed, 21 insertions(+), 9 deletions(-)
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization
  2024-02-19  3:03 ` Gang Li
@ 2024-02-20  1:52   ` Andrew Morton
  2024-02-20  6:17     ` Gang Li
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2024-02-20  1:52 UTC (permalink / raw)
  To: Gang Li; +Cc: linux-kernel, linux-mm

On Mon, 19 Feb 2024 11:03:29 +0800 Gang Li <gang.li@linux.dev> wrote:

> On 2024/2/13 19:13, Gang Li wrote:
> > This series includes two improvements: fixing the PADATA Kconfig warning
> > and a potential bug in gather_bootmem_prealloc_parallel. Please refer to
> > the specific commit message for details.
> > 
> > For Andrew:
> > If you want me to include these two fixes into the previous series[1], I
> > would be happy to send v6. Otherwise, you can directly apply these two
> > patches.
> > 
> > [1]. https://lore.kernel.org/lkml/20240126152411.1238072-1-gang.li@linux.dev/
> 
> Hi Andrew,
> A gentle ping here :).
> 
> Do you want to apply these two patches, or would you like me to
> include them into the original patch and send out v6?

The patchset is now rather a mess and I'm not confidently identifying
which issues remain open and which are addressed.

So yes, a full redo and resend would be preferred, please.

Links which I collected are:

https://lkml.kernel.org/r/202402020454.6EPkP1hi-lkp@intel.com
https://lkml.kernel.org/r/20240204072525.1986626-1-gang.li@linux.dev
https://lkml.kernel.org/r/6A148F29-68B2-4365-872C-E6AB599C55F6@linux.dev
https://lkml.kernel.org/r/j7xb7m5cy374ngbdm23rvryq6vy6jxtewtu3abjeidhho4bly7@t3aawaeybxlk

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization
  2024-02-20  1:52   ` Andrew Morton
@ 2024-02-20  6:17     ` Gang Li
  0 siblings, 0 replies; 11+ messages in thread
From: Gang Li @ 2024-02-20  6:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm

On 2024/2/20 09:52, Andrew Morton wrote:
> On Mon, 19 Feb 2024 11:03:29 +0800 Gang Li <gang.li@linux.dev> wrote:
> 
>> On 2024/2/13 19:13, Gang Li wrote:
>>> This series includes two improvements: fixing the PADATA Kconfig warning
>>> and a potential bug in gather_bootmem_prealloc_parallel. Please refer to
>>> the specific commit message for details.
>>>
>>> For Andrew:
>>> If you want me to include these two fixes into the previous series[1], I
>>> would be happy to send v6. Otherwise, you can directly apply these two
>>> patches.
>>>
>>> [1]. https://lore.kernel.org/lkml/20240126152411.1238072-1-gang.li@linux.dev/
>>
>> Hi Andrew,
>> A gentle ping here :).
>>
>> Do you want to apply these two patches, or would you like me to
>> include them into the original patch and send out v6?
> 
> The patchset is now rather a mess and I'm not confidently identifying
> which issues remain open and which are addressed.
> 
> So yes, a full redo and resend would be preferred, please.
> 
> Links which I collected are:
> 
> https://lkml.kernel.org/r/202402020454.6EPkP1hi-lkp@intel.com
> https://lkml.kernel.org/r/20240204072525.1986626-1-gang.li@linux.dev
> https://lkml.kernel.org/r/6A148F29-68B2-4365-872C-E6AB599C55F6@linux.dev
> https://lkml.kernel.org/r/j7xb7m5cy374ngbdm23rvryq6vy6jxtewtu3abjeidhho4bly7@t3aawaeybxlk

In summary, there is two issues based on v5[1] and all are fixed by my
patch.

Scattered discussions led to some confusion, I'll make it clear in the
next version.

Thanks.

[1]. 
https://lore.kernel.org/lkml/20240126152411.1238072-1-gang.li@linux.dev/

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-02-20  6:17 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-13 11:13 [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Gang Li
2024-02-13 11:13 ` [PATCH v1 1/2] padata: downgrade padata_do_multithreaded to serial execution for non-SMP Gang Li
2024-02-13 14:52   ` Muchun Song
2024-02-13 15:15     ` Gang Li
2024-02-15 10:49       ` Muchun Song
2024-02-13 11:13 ` [PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel Gang Li
2024-02-13 14:54   ` Muchun Song
2024-02-13 14:16 ` [PATCH v1 0/2] hugetlb: two small improvements of hugetlb init parallelization Paul E. McKenney
2024-02-19  3:03 ` Gang Li
2024-02-20  1:52   ` Andrew Morton
2024-02-20  6:17     ` Gang Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).