linux-kernel.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Tang Chen <tangchen@cn.fujitsu.com>
Cc: yinghai@kernel.org, tglx@linutronix.de, mingo@elte.hu,
	hpa@zytor.com, akpm@linux-foundation.org, trenn@suse.de,
	jiang.liu@huawei.com, wency@cn.fujitsu.com, laijs@cn.fujitsu.com,
	isimatu.yasuaki@jp.fujitsu.com, mgorman@suse.de,
	minchan@kernel.org, mina86@mina86.com, gong.chen@linux.intel.com,
	vasilis.liaskovitis@profitbricks.com, lwoodman@redhat.com,
	riel@redhat.com, jweiner@redhat.com, prarit@redhat.com,
	x86@kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [Part1 PATCH v5 00/22] x86, ACPI, numa: Parse numa info earlier
Date: Wed, 19 Jun 2013 23:17:19 -0700
Message-ID: <20130620061719.GA16114@mtj.dyndns.org>
In-Reply-To: <51C298B2.9060900@cn.fujitsu.com>

Hello, Tang.

On Thu, Jun 20, 2013 at 01:52:50PM +0800, Tang Chen wrote:
> 1. It is difficult to tell which memory allocation is temporary and
>    which one is permanent when memblock is allocating memory. So, we
>    can only wait till boot is complete, and see which remains.
>    But, we have the second difficulty.
> 
> 2. In memblock.reserve[], we cannot tell why we allocated this memory
>    just from the array item, right?  So it is difficult to do the
>    relocation. If in the future, we have to allocate permanent memory
>    for other new purposes, we have to do the relocation again and again.
>    (Not sure if I understand the point correctly. I think there isn't
>     a generic way to relocate memory used for different purposes.)

I was suggesting two separate things.

* The memblock allocator can relocate itself, so there's no point in
  avoiding setting the NUMA node while parsing and registering NUMA
  topology.  Just parse and register NUMA info and later tell it to
  relocate itself out of the hot-pluggable node.  A number of patches
  in the series are doing this dance - carefully reordering NUMA
  probing.  No need to do that.  It's a really fragile thing to do.

* Once you get the above out of the way, I don't think there are a lot
  of permanent allocations in the way before NUMA is initialized.
  Re-order the remaining ones if that's cleaner to do.  If that gets
  overly messy / fragile, copying them around or freeing and reloading
  afterwards could be an option too.  There isn't much point in being
  super-efficient about the ACPI override table.  Being cleaner and
  more robust is far more important.
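
(Illustration only, not kernel code.)  The "relocate out of the
hot-pluggable node" idea from the first point can be sketched as a
userspace toy model.  Every name here (toy_node_range, toy_in_hotplug,
toy_relocate) is hypothetical and does not exist in memblock, and the
sketch assumes the target range has no other allocations in it, which
a real implementation would have to check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one NUMA memory range. */
struct toy_node_range {
	uint64_t start;
	uint64_t end;		/* exclusive */
	bool hotpluggable;
};

/* True if [base, base + size) overlaps any hot-pluggable range. */
static bool toy_in_hotplug(const struct toy_node_range *nr, size_t n,
			   uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++)
		if (nr[i].hotpluggable &&
		    base < nr[i].end && base + size > nr[i].start)
			return true;
	return false;
}

/*
 * Pick a home for a size-byte allocator array currently at cur.
 * Returns cur if it is already outside hot-pluggable memory, the
 * start of the first non-hot-pluggable range that is large enough
 * otherwise, or UINT64_MAX if nothing fits.
 */
static uint64_t toy_relocate(const struct toy_node_range *nr, size_t n,
			     uint64_t cur, uint64_t size)
{
	if (!toy_in_hotplug(nr, n, cur, size))
		return cur;
	for (size_t i = 0; i < n; i++)
		if (!nr[i].hotpluggable && nr[i].end - nr[i].start >= size)
			return nr[i].start;
	return UINT64_MAX;
}
```

The point of the sketch is only the ordering: NUMA info is registered
first, and the move happens afterwards as a separate step.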

As for distinguishing temporary / permanent, it shouldn't be difficult
to make memblock track all allocations before NUMA info becomes online
and then verify that those areas are free by the time boot is
complete.  Just mark the reserved areas allocated before NUMA info is
fully available.
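
(Illustration only, not kernel code.)  A minimal sketch of that
tracking scheme, again as a userspace toy model.  The flag
TOY_RSV_EARLY and all toy_* names are assumptions made up for this
example; the real memblock structures and API differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: reserved before NUMA info was available. */
#define TOY_RSV_EARLY	0x1u
#define TOY_MAX_REGIONS	16

struct toy_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
	bool in_use;
};

struct toy_reserved {
	struct toy_region regions[TOY_MAX_REGIONS];
	bool numa_ready;	/* set once NUMA topology is registered */
};

/* Record a reservation; returns the slot index, or -1 if full. */
static int toy_reserve(struct toy_reserved *r, uint64_t base, uint64_t size)
{
	for (int i = 0; i < TOY_MAX_REGIONS; i++) {
		if (r->regions[i].in_use)
			continue;
		r->regions[i].base = base;
		r->regions[i].size = size;
		/* Anything reserved before NUMA is ready gets marked. */
		r->regions[i].flags = r->numa_ready ? 0 : TOY_RSV_EARLY;
		r->regions[i].in_use = true;
		return i;
	}
	return -1;
}

/* Release a reservation by slot index; bad indices are ignored. */
static void toy_free(struct toy_reserved *r, int idx)
{
	if (idx >= 0 && idx < TOY_MAX_REGIONS)
		r->regions[idx].in_use = false;
}

/* End-of-boot check: how many early reservations are still held? */
static int toy_count_early_leftovers(const struct toy_reserved *r)
{
	int n = 0;

	for (int i = 0; i < TOY_MAX_REGIONS; i++)
		if (r->regions[i].in_use &&
		    (r->regions[i].flags & TOY_RSV_EARLY))
			n++;
	return n;
}
```

A non-zero count at the end of boot would mean that an allocation
assumed to be temporary turned out to be permanent.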

> If you also had a look at the Part2 patches, you will see that I
> introduced a flags member into memblock to specify different types
> of memory, which will help to recognize hotpluggable memory. My
> thinking is that ensure memblock will not allocate hotpluggable
> memory. I think this is the most safe and easy way to satisfy hotplug
> requirement.

And you can use exactly the same mechanism to track memory areas which
were allocated before NUMA info was fully available, right?

> So you don't agree to serialize the operations at boot time.

No, I'm not disagreeing that some ordering is necessary.  My point is
that the series seems to be going way too far in that direction.
Sure, some reordering is necessary but it doesn't have to be this
fragile.  Careful reordering isn't the only way to achieve it.

> About this patch-set from Yinghai, actually he is doing a job that I
> failed to do. And he also included a lot of other things in the
> patch-set, such as extend max number of overridable acpi tables, local
> node pagetable, and so on.

Doing multiple things to achieve a goal in a patchset might not be
optimal but is usually okay if properly explained.  What's not okay is
failing to explain the overall goal, approach and design in the head
message, along with poor patch descriptions and code documentation.

This part of the code is almost inherently fragile and difficult to
debug, and a patchset like this would degrade maintainability.  I
really don't want to spend hours trying to decipher the overall
approach by navigating a maze of poorly documented patches, only to
find out that some of the basic approaches are not very agreeable.  We
could have had this exact discussion much earlier if the head message
had properly described what was going on, and the review process would
have been much more pleasant for all involved parties.

I don't think it matters whose patches go in how as long as they are
attributed correctly.  The end result - what goes in the git tree as
log and code changes - matters, and it needs to be a whole lot better.

Thanks.

-- 
tejun

Thread overview: 87+ messages
2013-06-13 13:02 [Part1 PATCH v5 00/22] x86, ACPI, numa: Parse numa info earlier Tang Chen
2013-06-13 13:02 ` [Part1 PATCH v5 01/22] x86: Change get_ramdisk_{image|size}() to global Tang Chen
2013-06-14 21:30   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:02 ` [Part1 PATCH v5 02/22] x86, microcode: Use common get_ramdisk_{image|size}() Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] x86, microcode: Use common get_ramdisk_{image|size}( ) tip-bot for Yinghai Lu
2013-06-13 13:02 ` [Part1 PATCH v5 03/22] x86, ACPI, mm: Kill max_low_pfn_mapped Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-17 21:04   ` [Part1 PATCH v5 03/22] " Tejun Heo
2013-06-17 21:13     ` Yinghai Lu
2013-06-17 23:08       ` Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 04/22] x86, ACPI: Search buffer above 4GB in a second try for acpi initrd table override Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-17 21:06   ` [Part1 PATCH v5 04/22] " Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 05/22] x86, ACPI: Increase acpi initrd override tables number limit Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:02 ` [Part1 PATCH v5 06/22] x86, ACPI: Split acpi_initrd_override() into find/copy two steps Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] x86, ACPI: Split acpi_initrd_override() into find/ copy " tip-bot for Yinghai Lu
2013-06-13 13:02 ` [Part1 PATCH v5 07/22] x86, ACPI: Store override acpi tables phys addr in cpio files info array Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-17 23:38   ` [Part1 PATCH v5 07/22] " Tejun Heo
2013-06-17 23:40     ` Yinghai Lu
2013-06-17 23:52   ` Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 08/22] x86, ACPI: Make acpi_initrd_override_find work with 32bit flat mode Tang Chen
2013-06-14 21:31   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  0:07   ` [Part1 PATCH v5 08/22] " Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 09/22] x86, ACPI: Find acpi tables in initrd early from head_32.S/head64.c Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  0:33   ` [Part1 PATCH v5 09/22] " Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 10/22] x86, mm, numa: Move two functions calling on successful path later Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  0:53   ` [Part1 PATCH v5 10/22] " Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 11/22] x86, mm, numa: Call numa_meminfo_cover_memory() checking early Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  1:05   ` [Part1 PATCH v5 11/22] " Tejun Heo
2013-06-13 13:02 ` [Part1 PATCH v5 12/22] x86, mm, numa: Move node_map_pfn_alignment() to x86 Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  1:08   ` [Part1 PATCH v5 12/22] " Tejun Heo
2013-06-13 13:03 ` [Part1 PATCH v5 13/22] x86, mm, numa: Use numa_meminfo to check node_map_pfn alignment Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  1:40   ` [Part1 PATCH v5 13/22] " Tejun Heo
2013-06-13 13:03 ` [Part1 PATCH v5 14/22] x86, mm, numa: Set memblock nid later Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  1:45   ` [Part1 PATCH v5 14/22] " Tejun Heo
2013-06-13 13:03 ` [Part1 PATCH v5 15/22] x86, mm, numa: Move node_possible_map setting later Tang Chen
2013-06-14 21:32   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 16/22] x86, mm, numa: Move numa emulation handling down Tang Chen
2013-06-14 21:33   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  1:58   ` [Part1 PATCH v5 16/22] " Tejun Heo
2013-06-18  6:22     ` Yinghai Lu
2013-06-18  7:13       ` Yinghai Lu
2013-06-19 21:25       ` Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 17/22] x86, ACPI, numa, ia64: split SLIT handling out Tang Chen
2013-06-14 21:33   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 18/22] x86, mm, numa: Add early_initmem_init() stub Tang Chen
2013-06-14 21:33   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 19/22] x86, mm: Parse numa info earlier Tang Chen
2013-06-14 21:33   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 20/22] x86, mm: Add comments for step_size shift Tang Chen
2013-06-14 21:33   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 21/22] x86, mm: Make init_mem_mapping be able to be called several times Tang Chen
2013-06-13 18:35   ` Konrad Rzeszutek Wilk
2013-06-13 22:47     ` Yinghai Lu
2013-06-14  5:08       ` Tang Chen
2013-06-14 21:33   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-13 13:03 ` [Part1 PATCH v5 22/22] x86, mm, numa: Put pagetable on local node ram for 64bit Tang Chen
2013-06-14 21:34   ` [tip:x86/mm] " tip-bot for Yinghai Lu
2013-06-18  2:03 ` [Part1 PATCH v5 00/22] x86, ACPI, numa: Parse numa info earlier Tejun Heo
2013-06-18  5:47   ` Tang Chen
2013-06-18 17:21     ` Tejun Heo
2013-06-20  5:52       ` Tang Chen
2013-06-20  6:17         ` Tejun Heo [this message]
2013-06-21  9:19           ` Tang Chen
2013-06-21 18:25             ` Tejun Heo
2013-06-24  3:51               ` Tang Chen
2013-06-24  7:26                 ` Tang Chen
2013-06-24 19:59                   ` Tejun Heo
2013-06-18 17:10 ` Vasilis Liaskovitis
2013-06-18 20:19   ` Yinghai Lu
2013-06-19 10:05     ` Vasilis Liaskovitis
2013-06-20 18:42       ` Yinghai Lu
2013-06-24  9:40   ` Gu Zheng
2013-06-21  5:19 ` H. Peter Anvin
2013-06-21  6:06   ` Tang Chen
2013-06-21  6:10     ` H. Peter Anvin
2013-06-21  6:20       ` Tang Chen
2013-06-21  6:26         ` Tejun Heo
2013-06-21 20:18   ` Yinghai Lu