linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH v1 2/2] mm: factor out next_present_section_nr()
Date: Tue, 14 Jan 2020 11:49:19 +0100	[thread overview]
Message-ID: <4de17591-e2c4-daff-e4b2-d03dd8792d0f@redhat.com> (raw)
In-Reply-To: <20200114104119.pybggnb4b2mq45wr@box>

On 14.01.20 11:41, Kirill A. Shutemov wrote:
> On Tue, Jan 14, 2020 at 12:02:00AM +0100, David Hildenbrand wrote:
>>
>>
>>> Am 13.01.2020 um 23:57 schrieb David Hildenbrand <dhildenb@redhat.com>:
>>>
>>> 
>>>
>>>>> Am 13.01.2020 um 23:41 schrieb Kirill A. Shutemov <kirill@shutemov.name>:
>>>>>
>>>>> On Mon, Jan 13, 2020 at 03:40:35PM +0100, David Hildenbrand wrote:
>>>>> Let's move it to the header and use the shorter variant from
>>>>> mm/page_alloc.c (the original one will also check
>>>>> "__highest_present_section_nr + 1", which is not necessary). While at it,
>>>>> make the section_nr in next_pfn() const.
>>>>>
>>>>> In next_pfn(), we now return section_nr_to_pfn(-1) instead of -1 once
>>>>> we exceed __highest_present_section_nr, which doesn't make a difference in
>>>>> the caller as it is big enough (>= all sane end_pfn).
>>>>>
>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>> ---
>>>>> include/linux/mmzone.h | 10 ++++++++++
>>>>> mm/page_alloc.c        | 11 ++---------
>>>>> mm/sparse.c            | 10 ----------
>>>>> 3 files changed, 12 insertions(+), 19 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>>> index c2bc309d1634..462f6873905a 100644
>>>>> --- a/include/linux/mmzone.h
>>>>> +++ b/include/linux/mmzone.h
>>>>> @@ -1379,6 +1379,16 @@ static inline int pfn_present(unsigned long pfn)
>>>>>   return present_section(__nr_to_section(pfn_to_section_nr(pfn)));
>>>>> }
>>>>>
>>>>> +static inline unsigned long next_present_section_nr(unsigned long section_nr)
>>>>> +{
>>>>> +    while (++section_nr <= __highest_present_section_nr) {
>>>>> +        if (present_section_nr(section_nr))
>>>>> +            return section_nr;
>>>>> +    }
>>>>> +
>>>>> +    return -1;
>>>>> +}
>>>>> +
>>>>> /*
>>>>> * These are _only_ used during initialisation, therefore they
>>>>> * can use __initdata ...  They could have names to indicate
>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>> index a92791512077..26e8044e9848 100644
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -5852,18 +5852,11 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
>>>>> /* Skip PFNs that belong to non-present sections */
>>>>> static inline __meminit unsigned long next_pfn(unsigned long pfn)
>>>>> {
>>>>> -    unsigned long section_nr;
>>>>> +    const unsigned long section_nr = pfn_to_section_nr(++pfn);
>>>>>
>>>>> -    section_nr = pfn_to_section_nr(++pfn);
>>>>>   if (present_section_nr(section_nr))
>>>>>       return pfn;
>>>>> -
>>>>> -    while (++section_nr <= __highest_present_section_nr) {
>>>>> -        if (present_section_nr(section_nr))
>>>>> -            return section_nr_to_pfn(section_nr);
>>>>> -    }
>>>>> -
>>>>> -    return -1;
>>>>> +    return section_nr_to_pfn(next_present_section_nr(section_nr));
>>>>
>>>> This changes behaviour in the corner case: if next_present_section_nr()
>>>> returns -1, we call section_nr_to_pfn() for it. It's unlikely would give
>>>> any valid pfn, but I can't say for sure for all archs. I guess the worst
>>>> case scenrio would be endless loop over the same secitons/pfns.
>>>>
>>>> Have you considered the case?
>>>
>>> Yes, see the patch description. We return -1 << PFN_SECTION_SHIFT, so a number close to the end of the address space (0xfff...000). (Will double check tomorrow if any 32bit arch could be problematic here)
>>
>> ... but thinking again, 0xfff... is certainly an invalid PFN, so this should work just fine.
>>
>> (biggest possible pfn is -1 >> PFN_SHIFT)
>>
>> But it‘s late in Germany, will double check tomorrow :)
> 
> If the end_pfn happens the be more than -1UL << PFN_SECTION_SHIFT we are
> screwed: the pfn is invalid, next_present_section_nr() returns -1, the
> next iterartion is on the same pfn and we have endless loop.
> 
> The question is whether we can prove end_pfn is always less than
> -1UL << PFN_SECTION_SHIFT in any configuration of any arch.
> 
> It is not obvious for me.

memmap_init_zone() is called for a physical memory region: pfn + size
(nr_pages)

The highest possible PFN you can have is "-1(unsigned long) >>
PFN_SHIFT". So even if you would want to add the very last section, the
PFN would still be smaller than -1UL << PFN_SECTION_SHIFT.

-- 
Thanks,

David / dhildenb


  reply	other threads:[~2020-01-14 10:49 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-13 14:40 [PATCH v1 0/2] mm/page_alloc: memmap_init_zone() cleanups David Hildenbrand
2020-01-13 14:40 ` [PATCH v1 1/2] mm/page_alloc: fix and rework pfn handling in memmap_init_zone() David Hildenbrand
2020-02-03 21:35   ` Alexander Duyck
2020-02-03 21:44     ` David Hildenbrand
2020-02-03 23:17       ` Alexander Duyck
2020-02-04  8:40         ` David Hildenbrand
2020-01-13 14:40 ` [PATCH v1 2/2] mm: factor out next_present_section_nr() David Hildenbrand
2020-01-13 22:41   ` Kirill A. Shutemov
2020-01-13 22:57     ` David Hildenbrand
2020-01-13 23:02       ` David Hildenbrand
2020-01-14 10:41         ` Kirill A. Shutemov
2020-01-14 10:49           ` David Hildenbrand [this message]
2020-01-14 15:52             ` Kirill A. Shutemov
2020-01-14 16:50               ` David Hildenbrand
2020-01-14 16:52                 ` David Hildenbrand
2020-01-31  4:30 ` [PATCH v1 0/2] mm/page_alloc: memmap_init_zone() cleanups Andrew Morton
2020-02-03 14:49   ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4de17591-e2c4-daff-e4b2-d03dd8792d0f@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=osalvador@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).