All of lore.kernel.org
 help / color / mirror / Atom feed
* + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
@ 2012-07-17 22:44 akpm
  2012-07-18  1:22 ` Minchan Kim
  0 siblings, 1 reply; 14+ messages in thread
From: akpm @ 2012-07-17 22:44 UTC (permalink / raw)
  To: mm-commits
  Cc: akpm, aaditya.kumar.30, kamezawa.hiroyu, mgorman, mhocko, minchan


The patch titled
     Subject: memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
has been added to the -mm tree.  Its filename is
     memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix

fix CONFIG_MEMORY_ISOLATION=n build:

mm/page_alloc.c: In function 'free_area_init_core':
mm/page_alloc.c:4430: error: 'struct zone' has no member named 'nr_pageblock_isolate'

Is this really necessary?  Does the zone start out all-zeroes?  If not, can we
make it do so?

Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 ++
 1 file changed, 2 insertions(+)

diff -puN mm/page_alloc.c~memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix mm/page_alloc.c
--- a/mm/page_alloc.c~memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
+++ a/mm/page_alloc.c
@@ -4424,7 +4424,9 @@ static void __paginginit free_area_init_
 		lruvec_init(&zone->lruvec, zone);
 		zap_zone_vm_stats(zone);
 		zone->flags = 0;
+#ifdef CONFIG_MEMORY_ISOLATION
 		zone->nr_pageblock_isolate = 0;
+#endif
 		if (!size)
 			continue;
 
_
Subject: Subject: memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix

Patches currently in -mm which might be from akpm@linux-foundation.org are

linux-next.patch
linux-next-git-rejects.patch
i-need-old-gcc.patch
arch-alpha-kernel-systblss-remove-debug-check.patch
arch-x86-platform-iris-irisc-register-a-platform-device-and-a-platform-driver.patch
arch-x86-kernel-cpu-perf_event_intel_uncoreh-make-uncore_pmu_hrtimer_interval-64-bit.patch
sysfs-fail-dentry-revalidation-after-namespace-change-fix.patch
thermal-add-generic-cpufreq-cooling-implementation.patch
thermal-exynos5-add-exynos5-thermal-sensor-driver-support.patch
thermal-exynos-register-the-tmu-sensor-with-the-kernel-thermal-layer.patch
coredump-warn-about-unsafe-suid_dumpable-core_pattern-combo.patch
mm.patch
mm-make-vb_alloc-more-foolproof-fix.patch
mm-hugetlb-add-new-hugetlb-cgroup-fix.patch
mm-hugetlb-add-new-hugetlb-cgroup-fix-fix.patch
hugetlb-cgroup-add-hugetlb-cgroup-control-files-fix.patch
hugetlb-cgroup-add-hugetlb-cgroup-control-files-fix-fix.patch
mm-memblockc-memblock_double_array-cosmetic-cleanups.patch
memcg-make-mem_cgroup_force_empty_list-return-bool-fix.patch
mm-fadvise-dont-return-einval-when-filesystem-cannot-implement-fadvise-checkpatch-fixes.patch
memcg-rename-config-variables.patch
memcg-rename-config-variables-fix-fix.patch
mm-have-order-0-compaction-start-off-where-it-left-checkpatch-fixes.patch
mm-have-order-0-compaction-start-off-where-it-left-v3-typo.patch
memory-hotplug-fix-kswapd-looping-forever-problem-fix.patch
memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch
mm-memcg-fix-compaction-migration-failing-due-to-memcg-limits-checkpatch-fixes.patch
memcg-prevent-oom-with-too-many-dirty-pages.patch
avr32-mm-faultc-port-oom-changes-to-do_page_fault-fix.patch
nmi-watchdog-fix-for-lockup-detector-breakage-on-resume.patch
kernel-sysc-avoid-argv_freenull.patch
kmsg-dev-kmsg-properly-return-possible-copy_from_user-failure.patch
printk-add-generic-functions-to-find-kern_level-headers-fix.patch
btrfs-use-printk_get_level-and-printk_skip_level-add-__printf-fix-fallout-fix.patch
btrfs-use-printk_get_level-and-printk_skip_level-add-__printf-fix-fallout-checkpatch-fixes.patch
lib-vsprintfc-remind-people-to-update-documentation-printk-formatstxt-when-adding-printk-formats.patch
string-introduce-memweight-fix.patch
drivers-rtc-rtc-ab8500c-use-uie-emulation-checkpatch-fixes.patch
drivers-rtc-rtc-r9701c-check-that-r9701_set_datetime-succeeded.patch
kernel-kmodc-document-call_usermodehelper_fns-a-bit.patch
kmod-avoid-deadlock-from-recursive-kmod-call.patch
fork-use-vma_pages-to-simplify-the-code-fix.patch
revert-sched-fix-fork-error-path-to-not-crash.patch
ipc-use-kconfig-options-for-__arch_want_ipc_parse_version.patch
fs-cachefiles-add-support-for-large-files-in-filesystem-caching-fix.patch
include-linux-aioh-cpp-c-conversions.patch
fault-injection-add-selftests-for-cpu-and-memory-hotplug.patch
journal_add_journal_head-debug.patch
mutex-subsystem-synchro-test-module-fix.patch
slab-leaks3-default-y.patch
put_bh-debug.patch


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-17 22:44 + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree akpm
@ 2012-07-18  1:22 ` Minchan Kim
  2012-07-18 21:38   ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2012-07-18  1:22 UTC (permalink / raw)
  To: Ralf Baechle, minchan, mm-commits, akpm, aaditya.kumar.30,
	kamezawa.hiroyu, linux-mm

Hi Andrew,

> 
> 
> The patch titled
>      Subject: memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
> has been added to the -mm tree.  Its filename is
>      memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
> 
> fix CONFIG_MEMORY_ISOLATION=n build:
> 
> mm/page_alloc.c: In function 'free_area_init_core':
> mm/page_alloc.c:4430: error: 'struct zone' has no member named
> 'nr_pageblock_isolate'

Sorry about that.
I sent a patch yesterday.

> 
> Is this really necessary?  Does the zone start out all-zeroes?  If not, can we
> make it do so?

Good point.
It can remove zap_zone_vm_stats and zone->flags = 0, too.
More important thing is that we could remove adding code to initialize
zero whenever we add new field to zone. So I look at the code.

In summary, IMHO, all is already initialie zero out but we need double
check in mips.

== hotplug ==

1) ia64 arch_alloc_nodedata uses kzalloc
2) generic_alloc_nodedata uses kzalloc

So it seems to be no problem.

== contig_page_data ==
1) bootmem 

struct pglist_data __refdata contig_page_data = {
         .bdata = &bootmem_node_data[0]
};
it initializes all data of zone to zero implicitly.

2) nobootmem

struct pglist_data __refdata contig_page_data;
EXPORT_SYMBOL(contig_page_data);

it initializes all data of zone to zero implicitly.

So both case isn't no problem.

== node_data ==

1) x86

In setup_node_data, it does memset.

2) ia64

fill_pernode does memset pernode

3) powerpc

careful_zallocation does memset so it initializes all data to zero.

4) parisc

struct node_map_data node_data[MAX_NUMNODES] __read_mostly;

So it initializes all data of zone to zero implicitly.

5) m32r 

pg_data_t m32r_node_data[MAX_NUMNODES];

So it initializes all data of zone to zero implicitly.

6) sh

setup_bootmem_node does memset.
So it initializes all data of zone to zero implicitly.

7) sparc

allocate_node_data does memset.
So it initializes all data of zone to zero implicitly.

8) alpha

pg_data_t node_data[MAX_NUMNODES];
EXPORT_SYMBOL(node_data);

So it initializes all data of zone to zero implicitly.

9) tile

struct pglist_data node_data[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL(node_data);

So it initializes all data of zone to zero implicitly.

10) mips

node_mem_init 

it seems to use slot_freepfn for __node_data but doesn't initialize zero
but not sure. we needs double checks.

11) m68k

pg_data_t pg_data_map[MAX_NUMNODES];
EXPORT_SYMBOL(pg_data_map);

So it initializes all data of zone to zero implicitly.

> 
> Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  mm/page_alloc.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff -puN mm/page_alloc.c~memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
> mm/page_alloc.c
> --- a/mm/page_alloc.c~memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
> +++ a/mm/page_alloc.c
> @@ -4424,7 +4424,9 @@ static void __paginginit free_area_init_
>                 lruvec_init(&zone->lruvec, zone);
>                 zap_zone_vm_stats(zone);
>                 zone->flags = 0;
> +#ifdef CONFIG_MEMORY_ISOLATION
>                 zone->nr_pageblock_isolate = 0;
> +#endif
>                 if (!size)
>                         continue;
> 
> _
> Subject: Subject: memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix
> 
> Patches currently in -mm which might be from akpm@linux-foundation.org are
> 
> linux-next.patch
> linux-next-git-rejects.patch
> i-need-old-gcc.patch
> arch-alpha-kernel-systblss-remove-debug-check.patch
> arch-x86-platform-iris-irisc-register-a-platform-device-and-a-platform-driver.patch
> arch-x86-kernel-cpu-perf_event_intel_uncoreh-make-uncore_pmu_hrtimer_interval-64-bit.patch
> sysfs-fail-dentry-revalidation-after-namespace-change-fix.patch
> thermal-add-generic-cpufreq-cooling-implementation.patch
> thermal-exynos5-add-exynos5-thermal-sensor-driver-support.patch
> thermal-exynos-register-the-tmu-sensor-with-the-kernel-thermal-layer.patch
> coredump-warn-about-unsafe-suid_dumpable-core_pattern-combo.patch
> mm.patch
> mm-make-vb_alloc-more-foolproof-fix.patch
> mm-hugetlb-add-new-hugetlb-cgroup-fix.patch
> mm-hugetlb-add-new-hugetlb-cgroup-fix-fix.patch
> hugetlb-cgroup-add-hugetlb-cgroup-control-files-fix.patch
> hugetlb-cgroup-add-hugetlb-cgroup-control-files-fix-fix.patch
> mm-memblockc-memblock_double_array-cosmetic-cleanups.patch
> memcg-make-mem_cgroup_force_empty_list-return-bool-fix.patch
> mm-fadvise-dont-return-einval-when-filesystem-cannot-implement-fadvise-checkpatch-fixes.patch
> memcg-rename-config-variables.patch
> memcg-rename-config-variables-fix-fix.patch
> mm-have-order-0-compaction-start-off-where-it-left-checkpatch-fixes.patch
> mm-have-order-0-compaction-start-off-where-it-left-v3-typo.patch
> memory-hotplug-fix-kswapd-looping-forever-problem-fix.patch
> memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch
> mm-memcg-fix-compaction-migration-failing-due-to-memcg-limits-checkpatch-fixes.patch
> memcg-prevent-oom-with-too-many-dirty-pages.patch
> avr32-mm-faultc-port-oom-changes-to-do_page_fault-fix.patch
> nmi-watchdog-fix-for-lockup-detector-breakage-on-resume.patch
> kernel-sysc-avoid-argv_freenull.patch
> kmsg-dev-kmsg-properly-return-possible-copy_from_user-failure.patch
> printk-add-generic-functions-to-find-kern_level-headers-fix.patch
> btrfs-use-printk_get_level-and-printk_skip_level-add-__printf-fix-fallout-fix.patch
> btrfs-use-printk_get_level-and-printk_skip_level-add-__printf-fix-fallout-checkpatch-fixes.patch
> lib-vsprintfc-remind-people-to-update-documentation-printk-formatstxt-when-adding-printk-formats.patch
> string-introduce-memweight-fix.patch
> drivers-rtc-rtc-ab8500c-use-uie-emulation-checkpatch-fixes.patch
> drivers-rtc-rtc-r9701c-check-that-r9701_set_datetime-succeeded.patch
> kernel-kmodc-document-call_usermodehelper_fns-a-bit.patch
> kmod-avoid-deadlock-from-recursive-kmod-call.patch
> fork-use-vma_pages-to-simplify-the-code-fix.patch
> revert-sched-fix-fork-error-path-to-not-crash.patch
> ipc-use-kconfig-options-for-__arch_want_ipc_parse_version.patch
> fs-cachefiles-add-support-for-large-files-in-filesystem-caching-fix.patch
> include-linux-aioh-cpp-c-conversions.patch
> fault-injection-add-selftests-for-cpu-and-memory-hotplug.patch
> journal_add_journal_head-debug.patch
> mutex-subsystem-synchro-test-module-fix.patch
> slab-leaks3-default-y.patch
> put_bh-debug.patch
> 
> --
> To unsubscribe from this list: send the line "unsubscribe mm-commits" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> -- 
> Kind regards,
> Minchan Kim

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-18  1:22 ` Minchan Kim
@ 2012-07-18 21:38   ` Andrew Morton
  2012-07-19  0:10     ` Minchan Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2012-07-18 21:38 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu, linux-mm

On Wed, 18 Jul 2012 10:22:00 +0900
Minchan Kim <minchan@kernel.org> wrote:

> > 
> > Is this really necessary?  Does the zone start out all-zeroes?  If not, can we
> > make it do so?
> 
> Good point.
> It can remove zap_zone_vm_stats and zone->flags = 0, too.
> More important thing is that we could remove adding code to initialize
> zero whenever we add new field to zone. So I look at the code.
> 
> In summary, IMHO, all is already initialie zero out but we need double
> check in mips.
> 

Well, this is hardly a performance-critical path.  So rather than
groveling around ensuring that each and every architectures does the
right thing, would it not be better to put a single memset() into core
MM if there is an appropriate place?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-18 21:38   ` Andrew Morton
@ 2012-07-19  0:10     ` Minchan Kim
  2012-07-19  0:21       ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2012-07-19  0:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu, linux-mm,
	Johannes Weiner, Tejun Heo

On Wed, Jul 18, 2012 at 02:38:10PM -0700, Andrew Morton wrote:
> On Wed, 18 Jul 2012 10:22:00 +0900
> Minchan Kim <minchan@kernel.org> wrote:
> 
> > > 
> > > Is this really necessary?  Does the zone start out all-zeroes?  If not, can we
> > > make it do so?
> > 
> > Good point.
> > It can remove zap_zone_vm_stats and zone->flags = 0, too.
> > More important thing is that we could remove adding code to initialize
> > zero whenever we add new field to zone. So I look at the code.
> > 
> > In summary, IMHO, all is already initialie zero out but we need double
> > check in mips.
> > 
> 
> Well, this is hardly a performance-critical path.  So rather than
> groveling around ensuring that each and every architectures does the
> right thing, would it not be better to put a single memset() into core
> MM if there is an appropriate place?

I think most good place is free_area_init_node but at a glance,
bootmem_data is set up eariler than free_area_init_node so shouldn't we
keep that pointer still?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 32985dd..1e7ca80 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4366,9 +4366,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
        int ret;
 
        pgdat_resize_init(pgdat);
-       pgdat->nr_zones = 0;
        init_waitqueue_head(&pgdat->kswapd_wait);
-       pgdat->kswapd_max_order = 0;
        pgdat_page_cgroup_init(pgdat);
 
        for (j = 0; j < MAX_NR_ZONES; j++) {
@@ -4429,11 +4427,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 
                zone_pcp_init(zone);
                lruvec_init(&zone->lruvec, zone);
-               zap_zone_vm_stats(zone);
-               zone->flags = 0;
-#ifdef CONFIG_MEMORY_ISOLATION
-               zone->nr_pageblock_isolate = 0;
-#endif
                if (!size)
                        continue;
 
@@ -4495,7 +4488,15 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
                unsigned long node_start_pfn, unsigned long *zholes_size)
 {
        pg_data_t *pgdat = NODE_DATA(nid);
-
+       /* We guarantees pg_data_t starts out all-zeroes except bdata */
+#ifndef CONFIG_NO_BOOTMEM
+       struct bootmem_data *bdata;
+       bdata = pgdat->bdata;
+#endif
+       memset(pgdat, 0, sizeof(pg_data_t));
+#ifndef CONFIG_NO_BOOTMEM
+       pgdat->bdata = bdata;
+#endif
        pgdat->node_id = nid;
        pgdat->node_start_pfn = node_start_pfn;
        calculate_node_totalpages(pgdat, zones_size, zholes_size);

> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-19  0:10     ` Minchan Kim
@ 2012-07-19  0:21       ` Tejun Heo
  2012-07-19  0:48         ` Minchan Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2012-07-19  0:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

(cc'ing Yinghai, hi!)

Hello,

On Thu, Jul 19, 2012 at 09:10:02AM +0900, Minchan Kim wrote:
> On Wed, Jul 18, 2012 at 02:38:10PM -0700, Andrew Morton wrote:
> > On Wed, 18 Jul 2012 10:22:00 +0900
> > Minchan Kim <minchan@kernel.org> wrote:
> > 
> > > > 
> > > > Is this really necessary?  Does the zone start out all-zeroes?  If not, can we
> > > > make it do so?
> > > 
> > > Good point.
> > > It can remove zap_zone_vm_stats and zone->flags = 0, too.
> > > More important thing is that we could remove adding code to initialize
> > > zero whenever we add new field to zone. So I look at the code.
> > > 
> > > In summary, IMHO, all is already initialie zero out but we need double
> > > check in mips.
> > > 
> > 
> > Well, this is hardly a performance-critical path.  So rather than
> > groveling around ensuring that each and every architectures does the
> > right thing, would it not be better to put a single memset() into core
> > MM if there is an appropriate place?
> 
> I think most good place is free_area_init_node but at a glance,
> bootmem_data is set up eariler than free_area_init_node so shouldn't we
> keep that pointer still?

I don't think zapping node_data that late is a good idea.  It's used
from very early in the boot and its usage during early boot is fairly
platform dependent.  Dunno whether there's a good solution for this.
Maybe trigger warning if some fields which have to be zero aren't?

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-19  0:21       ` Tejun Heo
@ 2012-07-19  0:48         ` Minchan Kim
  2012-07-19 16:57           ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2012-07-19  0:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

Hi Tejun,

On Wed, Jul 18, 2012 at 05:21:02PM -0700, Tejun Heo wrote:
> (cc'ing Yinghai, hi!)
> 
> Hello,
> 
> On Thu, Jul 19, 2012 at 09:10:02AM +0900, Minchan Kim wrote:
> > On Wed, Jul 18, 2012 at 02:38:10PM -0700, Andrew Morton wrote:
> > > On Wed, 18 Jul 2012 10:22:00 +0900
> > > Minchan Kim <minchan@kernel.org> wrote:
> > > 
> > > > > 
> > > > > Is this really necessary?  Does the zone start out all-zeroes?  If not, can we
> > > > > make it do so?
> > > > 
> > > > Good point.
> > > > It can remove zap_zone_vm_stats and zone->flags = 0, too.
> > > > More important thing is that we could remove adding code to initialize
> > > > zero whenever we add new field to zone. So I look at the code.
> > > > 
> > > > In summary, IMHO, all is already initialie zero out but we need double
> > > > check in mips.
> > > > 
> > > 
> > > Well, this is hardly a performance-critical path.  So rather than
> > > groveling around ensuring that each and every architectures does the
> > > right thing, would it not be better to put a single memset() into core
> > > MM if there is an appropriate place?
> > 
> > I think most good place is free_area_init_node but at a glance,
> > bootmem_data is set up eariler than free_area_init_node so shouldn't we
> > keep that pointer still?
> 
> I don't think zapping node_data that late is a good idea.  It's used

As I look over all arch code[1], they seem to zero out when pg_data_t is allocated.
So we might not need this but Andrew want to make sure it in core MM code.
[1] http://marc.info/?l=linux-mm&m=134257473810711&w=3

> from very early in the boot and its usage during early boot is fairly
> platform dependent.  Dunno whether there's a good solution for this.

You mean some arches might want to use pglist_data's any fields
freely during bootup? So free_area_init_node should initialize fields
for core MM part explicitly?
If so, do they want to use fields of pglist_dat->node_zones[ ] freely?
Maybe not. If they doesn't use them, we could zero out zone structure
at least, not pg_data_t and it could be helpful, too.

> Maybe trigger warning if some fields which have to be zero aren't?

It's not good because this causes adding new WARNING in that part
whenever we add new field in pgdat. It nullify this patch's goal.

> 
> Thanks.
> 
> -- 
> tejun
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-19  0:48         ` Minchan Kim
@ 2012-07-19 16:57           ` Tejun Heo
  2012-07-19 23:50             ` Minchan Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2012-07-19 16:57 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

Hello,

On Thu, Jul 19, 2012 at 09:48:45AM +0900, Minchan Kim wrote:
> > Maybe trigger warning if some fields which have to be zero aren't?
> 
> It's not good because this causes adding new WARNING in that part
> whenever we add new field in pgdat. It nullify this patch's goal.

Maybe just do that on some fields?  The goal is catching unlikely case
where archs leave the struct with garbage data.  I don't think full
coverage is an absolute requirement.  Or reorganize the fields such
that fields unused by boot code is collected at the top so that it can
be memset after certain offset?

But, really, given how the structure is used, I think we're better off
just making sure all archs clear them and maybe have a sanity check or
two just in case.  It's not like breakage on that front is gonna be
subtle.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-19 16:57           ` Tejun Heo
@ 2012-07-19 23:50             ` Minchan Kim
  2012-07-20 17:15               ` Tejun Heo
  2012-07-20 21:22               ` Andrew Morton
  0 siblings, 2 replies; 14+ messages in thread
From: Minchan Kim @ 2012-07-19 23:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

On Thu, Jul 19, 2012 at 09:57:50AM -0700, Tejun Heo wrote:
> Hello,
> 
> On Thu, Jul 19, 2012 at 09:48:45AM +0900, Minchan Kim wrote:
> > > Maybe trigger warning if some fields which have to be zero aren't?
> > 
> > It's not good because this causes adding new WARNING in that part
> > whenever we add new field in pgdat. It nullify this patch's goal.
> 
> Maybe just do that on some fields?  The goal is catching unlikely case
> where archs leave the struct with garbage data.  I don't think full
> coverage is an absolute requirement.  Or reorganize the fields such

IIUC your previous reply, archs can use any fields during boot.
If so, we need full coverage for catching it.

> that fields unused by boot code is collected at the top so that it can
> be memset after certain offset?

If the fields touched by boot are limited, it's good idea.
Let me ask a question.
What fields are used by boot code before calling free_area_init_node
(excpet struct bootmem_data *bdata)?

> 
> But, really, given how the structure is used, I think we're better off
> just making sure all archs clear them and maybe have a sanity check or
> two just in case.  It's not like breakage on that front is gonna be
> subtle.

Of course, it seems all archs seems to zero-out already as I mentioned
(Not sure, MIPS) but Andrew doesn't want it. Andrew?

> 
> Thanks.
> 
> -- 
> tejun
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-19 23:50             ` Minchan Kim
@ 2012-07-20 17:15               ` Tejun Heo
  2012-07-20 21:22               ` Andrew Morton
  1 sibling, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2012-07-20 17:15 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

Hello,

On Fri, Jul 20, 2012 at 08:50:57AM +0900, Minchan Kim wrote:
> > But, really, given how the structure is used, I think we're better off
> > just making sure all archs clear them and maybe have a sanity check or
> > two just in case.  It's not like breakage on that front is gonna be
> > subtle.
> 
> Of course, it seems all archs seems to zero-out already as I mentioned
> (Not sure, MIPS) but Andrew doesn't want it. Andrew?

So, to be more direct.  Either 1. remove the spurious initializations
(and hunt down archs which don't zero them if there's any) or 2. leave
it alone.  It's one of the data structures which are allocated and
used way before any generic code kicks in.  I mean, even how it's
deferenced is arch-dependent - it's wrapped in NODE_DATA macro for a
reason.

I would vote for #1 as it's simply brain-damaged to not zero any
global data structure and partial initialization of a data structure
already in use is silly and dangerous.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-19 23:50             ` Minchan Kim
  2012-07-20 17:15               ` Tejun Heo
@ 2012-07-20 21:22               ` Andrew Morton
  2012-07-20 21:36                 ` Tejun Heo
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2012-07-20 21:22 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Tejun Heo, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

On Fri, 20 Jul 2012 08:50:57 +0900
Minchan Kim <minchan@kernel.org> wrote:

> > 
> > But, really, given how the structure is used, I think we're better off
> > just making sure all archs clear them and maybe have a sanity check or
> > two just in case.  It's not like breakage on that front is gonna be
> > subtle.
> 
> Of course, it seems all archs seems to zero-out already as I mentioned
> (Not sure, MIPS) but Andrew doesn't want it. Andrew?

My point is that having to ensure that each arch zeroes out this
structure is difficult/costly/unreliable/fragile.  It would be better
if we can reliably clear it at some well-known place in core MM.

That might mean that the memory gets cleared twice on some
architectures, but I doubt if that matters - it's a once-off thing.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-20 21:22               ` Andrew Morton
@ 2012-07-20 21:36                 ` Tejun Heo
  2012-07-23  4:58                   ` Minchan Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2012-07-20 21:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

Hello, Andrew.

On Fri, Jul 20, 2012 at 02:22:13PM -0700, Andrew Morton wrote:
> My point is that having to ensure that each arch zeroes out this
> structure is difficult/costly/unreliable/fragile.  It would be better
> if we can reliably clear it at some well-known place in core MM.
> 
> That might mean that the memory gets cleared twice on some
> architectures, but I doubt if that matters - it's a once-off thing.

Clearing twice isn't the problem here.  The problem is the risk of
zapping fields which are already in use.  That would be way more
unexpected and difficult to track down than garbage value in whatever
field.

It might not be ideal but I think nudging all archs to clear all
static global structures they allocate is the better way here.  It's
at least better than having to worry about this type of partial
re-initialization.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-20 21:36                 ` Tejun Heo
@ 2012-07-23  4:58                   ` Minchan Kim
  2012-07-23 15:42                     ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2012-07-23  4:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

On Fri, Jul 20, 2012 at 02:36:41PM -0700, Tejun Heo wrote:
> Hello, Andrew.
> 
> On Fri, Jul 20, 2012 at 02:22:13PM -0700, Andrew Morton wrote:
> > My point is that having to ensure that each arch zeroes out this
> > structure is difficult/costly/unreliable/fragile.  It would be better
> > if we can reliably clear it at some well-known place in core MM.
> > 
> > That might mean that the memory gets cleared twice on some
> > architectures, but I doubt if that matters - it's a once-off thing.
> 
> Clearing twice isn't the problem here.  The problem is the risk of
> zapping fields which are already in use.  That would be way more
> unexpected and difficult to track down than garbage value in whatever
> field.

I would like to know what fields you are concerning because most of field
in pg_data_t are generic except bdata so they would be initialized
by free_area_init_node. So IMHO, reset pg_data_t except bdata would be
no problem and clean approach. If some arch needs some fields in pg_data_t
, we have to declare new variable struct arch_data in pg_data_t and
generic functions doesn't need to touch them.
Of course, we can skip reset of that structure, too.

Please let me know it if I am missing something.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-23  4:58                   ` Minchan Kim
@ 2012-07-23 15:42                     ` Tejun Heo
  2012-07-24  1:11                       ` Minchan Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2012-07-23 15:42 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

Hello, Minchan.

On Mon, Jul 23, 2012 at 01:58:55PM +0900, Minchan Kim wrote:
> I would like to know what fields you are concerning because most of field

The above question itself is a problem.  It's subtle to hell.  Some
fields of this data structure is used during early boot but at some
point all are reset to zero, so we have to be careful about how those
fields are used before and after.

This might seem clear now but things like this are likely to make
people later working on the code go WTF.  Let's say for whatever
reason ->bdata needs to be accessed after free_area_init - e.g.
arch_add_memory() needs some info from bdata, what then?

What if we end up having to add a new property field which is
determined by platform code but used by generic code.  I would add a
field to pgdat, init it from numa.c and then later use it in generic
code.  If the field gets zeroed inbetween, I would get pretty annoyed.

I really don't think this subject is worth the amount of discussion we
had in this thread.  Just make the archs clear the data structure on
creation.  Anything else is silly.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree
  2012-07-23 15:42                     ` Tejun Heo
@ 2012-07-24  1:11                       ` Minchan Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Minchan Kim @ 2012-07-24  1:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Ralf Baechle, aaditya.kumar.30, kamezawa.hiroyu,
	linux-mm, Johannes Weiner, Yinghai Lu

On Mon, Jul 23, 2012 at 08:42:47AM -0700, Tejun Heo wrote:
> Hello, Minchan.
> 
> On Mon, Jul 23, 2012 at 01:58:55PM +0900, Minchan Kim wrote:
> > I would like to know what fields you are concerning because most of field
> 
> The above question itself is a problem.  It's subtle to hell.  Some
> fields of this data structure is used during early boot but at some
> point all are reset to zero, so we have to be careful about how those
> fields are used before and after.
> 
> This might seem clear now but things like this are likely to make
> people later working on the code go WTF.  Let's say for whatever
> reason ->bdata needs to be accessed after free_area_init - e.g.
> arch_add_memory() needs some info from bdata, what then?
> 
> What if we end up having to add a new property field which is
> determined by platform code but used by generic code.  I would add a
> field to pgdat, init it from numa.c and then later use it in generic
> code.  If the field gets zeroed inbetween, I would get pretty annoyed.
> 
> I really don't think this subject is worth the amount of discussion we
> had in this thread.  Just make the archs clear the data structure on
> creation.  Anything else is silly.

I sent patchset and will wait of akpm's opinion.
Thanks for the comment, Tejun.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-07-24  1:11 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-17 22:44 + memory-hotplug-fix-kswapd-looping-forever-problem-fix-fix.patch added to -mm tree akpm
2012-07-18  1:22 ` Minchan Kim
2012-07-18 21:38   ` Andrew Morton
2012-07-19  0:10     ` Minchan Kim
2012-07-19  0:21       ` Tejun Heo
2012-07-19  0:48         ` Minchan Kim
2012-07-19 16:57           ` Tejun Heo
2012-07-19 23:50             ` Minchan Kim
2012-07-20 17:15               ` Tejun Heo
2012-07-20 21:22               ` Andrew Morton
2012-07-20 21:36                 ` Tejun Heo
2012-07-23  4:58                   ` Minchan Kim
2012-07-23 15:42                     ` Tejun Heo
2012-07-24  1:11                       ` Minchan Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.