From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Laura Abbott <labbott@redhat.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Roman Gushchin <guro@fb.com>, Minchan Kim <minchan@kernel.org>,
	Rik van Riel <riel@surriel.com>,
	Christian Koenig <christian.koenig@amd.com>,
	Huang Rui <ray.huang@amd.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Pavel Machek <pavel@ucw.cz>,
	kernel-team@lge.com, Christoph Hellwig <hch@infradead.org>,
	Kexec Mailing List <kexec@lists.infradead.org>
Subject: Re: [PATCH v2 03/10] kexec: separate PageHighMem() and PageHighMemZone() use case
Date: Wed, 6 May 2020 14:23:27 +0900
Message-ID: <20200506052327.GA25974@js1304-desktop>
In-Reply-To: <87ftcfpzjn.fsf@x220.int.ebiederm.org>

On Mon, May 04, 2020 at 09:03:56AM -0500, Eric W. Biederman wrote:
> 
> I have added in the kexec mailing list.
> 
> Looking at the patch we are discussing it appears that the kexec code
> could be doing much better in highmem situations today but is not.

Sounds great!

> 
> 
> Joonsoo Kim <js1304@gmail.com> writes:
> 
> > On Fri, May 1, 2020 at 11:06 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>
> >> js1304@gmail.com writes:
> >>
> >> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >> >
> >> > Until now, PageHighMem() has been used for two different cases. One is to
> >> > check whether there is a direct mapping for this page. The other is to check
> >> > the zone of this page, that is, whether it is in a highmem type zone or not.
> >> >
> >> > Now, we have separate functions, PageHighMem() and PageHighMemZone(), one
> >> > for each case. Use the appropriate one.
> >> >
> >> > Note that there are some rules to determine the proper macro.
> >> >
> >> > 1. If PageHighMem() is called for checking if the direct mapping exists
> >> > or not, use PageHighMem().
> >> > 2. If PageHighMem() is used to predict the previous gfp_flags for
> >> > this page, use PageHighMemZone(). The zone of the page is related to
> >> > the gfp_flags.
> >> > 3. If the purpose of calling PageHighMem() is to count highmem pages and
> >> > to interact with the system by using this count, use PageHighMemZone().
> >> > This counter is usually used to calculate the available memory for a
> >> > kernel allocation, and pages in the highmem zone are not available
> >> > for a kernel allocation.
> >> > 4. Otherwise, use PageHighMemZone(). It's safe since its implementation
> >> > is just a copy of the previous PageHighMem() implementation and won't
> >> > be changed.
> >> >
> >> > I apply rule #2 in this patch.
> >>
> >> Hmm.
> >>
> >> What happened to the notion of deprecating and reducing the usage of
> >> highmem?  I know that we have some embedded architectures where it is
> >> still important but this feels like it flies in the face of that.
> >
> > AFAIK, deprecating highmem requires some more time and, before then,
> > we need to support it.
> 
> But it at least makes sense to look at what we are doing with highmem
> and ask if it makes sense.
> 
> >> This part of kexec would be much more maintainable if it had a proper
> >> mm layer helper that tested to see if the page matched the passed in
> >> gfp flags.  That way the mm layer could keep changing and doing weird
> >> gyrations and this code would not care.
> >
> > Good idea! I will do it.
> >
> >>
> >> What would be really helpful is if there was a straightforward way to
> >> allocate memory whose physical address fits in the native word size.
> >>
> >>
> >> All I know for certain about this patch is that it takes a piece of code
> >> that looked like it made sense, and transforms it into something I can
> >> not easily verify, and can not maintain.
> >
> > Although I have decided to make a helper as you described above, I don't
> > understand why you think that the new code isn't maintainable. It is just
> > the same thing with a different name. Could you elaborate on why you
> > think so?
> 
> Because the current code is already wrong.  It does not handle
> the general case of what it claims to handle.  When the only distinction
> that needs to be drawn is highmem or not highmem that is likely fine.
> But now you are making it possible to draw more distinctions.  At which
> point I have no idea which distinction needs to be drawn.
> 
> 
> The code and the logic are about 20 years old.  When it was written I
> don't recall taking numa seriously and the kernel only had 3 zones
> as I recall (DMA aka the now deprecated GFP_DMA, NORMAL, and HIGH).
> 
> The code attempts to work around limitations of those old zones and play
> nice in a highmem world by allocating HIGH memory and not using
> it if the memory was above 4G ( on 32bit ).
> 
> Looking now, the kernel has GFP_DMA32 so on 32bit with highmem we should
> probably be using that when allocating memory.
> 

From a quick investigation, unfortunately, ZONE_DMA32 isn't available on
x86 32bit now, so using GFP_DMA32 to allocate memory below 4G would not
work. Enabling ZONE_DMA32 on x86 32bit would not be simple, so, IMHO, it
would be better to leave the code as it is.
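
To make the constraint concrete, here is a minimal user-space sketch of the
"allocate from anywhere, then reject anything above 4G" approach that the
existing code relies on. It is not kernel code: MODEL_PAGE_SHIFT,
pfn_below_4g() and pick_usable_pfn() are hypothetical names used only for
illustration, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86 32-bit. */
#define MODEL_PAGE_SHIFT 12

/* True if the whole page at this page frame number lies below 4 GiB. */
static int pfn_below_4g(uint64_t pfn)
{
	return pfn < (UINT64_C(1) << (32 - MODEL_PAGE_SHIFT));
}

/*
 * Hypothetical stand-in for "allocate, then filter": return the index
 * of the first candidate frame that is usable, or -1 if none is.
 */
static int pick_usable_pfn(const uint64_t *candidates, size_t n)
{
	assert(candidates != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (pfn_below_4g(candidates[i]))
			return (int)i;
	}
	return -1;
}
```

With GFP_DMA32 the allocator itself would guarantee the limit; without
ZONE_DMA32 the caller has to filter after the fact, as modeled here.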

> 
> 
> Further in dealing with this memory management situation we only
> have two situations in which we call kimage_alloc_page.
> 
> For an indirect page which must have a valid page_address(page).
> We could probably relax that if we cared to.
> 
> For a general kexec page to store the next kernel in until we switch.
> The general pages can be in high memory.
> 
> In a highmem world all of those pages should be below 32bit.
> 
> 
> 
> Given that we fundamentally have two situations my sense is that we
> should just refactor the code so that we never have to deal with:
> 
> 
> 			/* The old page I have found cannot be a
> 			 * destination page, so return it if it's
> 			 * gfp_flags honor the ones passed in.
> 			 */
> 			if (!(gfp_mask & __GFP_HIGHMEM) &&
> 			    PageHighMem(old_page)) {
> 				kimage_free_pages(old_page);
> 				continue;
> 			}
> 
> Either we teach kimage_add_entry how to work with high memory pages
> (still 32bit accessible) or we teach kimage_alloc_page to notice it is
> an indirect page allocation and to always skip trying to reuse the page
> it found in that case.
> 
> That way the code does not need to know about forever changing mm internals.

Nice! I have already seen your patch and found that the two lines above
related to HIGHMEM are removed. Thanks for your help.
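
For readers following along, the difference between the check quoted above
and the second alternative Eric describes can be modeled in plain C. This is
a user-space sketch only: struct model_page, MODEL_GFP_HIGHMEM,
enum alloc_purpose and both functions are hypothetical names, and the sketch
does not reproduce the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __GFP_HIGHMEM; the value is arbitrary. */
#define MODEL_GFP_HIGHMEM 0x02u

struct model_page {
	bool highmem;	/* models PageHighMem(page): no direct mapping */
};

enum alloc_purpose {
	ALLOC_INDIRECT,	/* needs a valid page_address() */
	ALLOC_GENERAL,	/* may live in high memory */
};

/*
 * Models the quoted check: reuse is refused when the caller did not
 * ask for highmem but the old page is a highmem page.
 */
static bool reuse_ok_by_gfp(unsigned int gfp_mask,
			    const struct model_page *old)
{
	assert(old != NULL);
	if (!(gfp_mask & MODEL_GFP_HIGHMEM) && old->highmem)
		return false;
	return true;
}

/*
 * Models the alternative: never reuse a found page for an indirect
 * page allocation, so no zone or highmem test is needed at all.
 */
static bool reuse_ok_by_purpose(enum alloc_purpose purpose)
{
	return purpose != ALLOC_INDIRECT;
}
```

The second form is stricter, since it also gives up reusing low pages for
indirect allocations, but it no longer depends on how the mm layer defines
highmem.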

Thanks.


