All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] mm,page_alloc: Use {get,put}_online_mems() to get stable zone's values
Date: Tue, 8 Jun 2021 17:00:09 +0200	[thread overview]
Message-ID: <0eadea9c-5af0-d7e6-071e-898b04294dd3@redhat.com> (raw)
In-Reply-To: <20210607102318.GA12683@linux>

On 07.06.21 12:23, Oscar Salvador wrote:
> On Mon, Jun 07, 2021 at 10:49:01AM +0200, David Hildenbrand wrote:
>> I'd like to point out that I think the seqlock is not in place to
>> synchronize with actual growing/shrinking but to get consistent zone ranges
>> -- like using atomics, but we have two inter-dependent values here.
> 
> I guess so, at least that's what it should do.
> But the way it is placed right now is misleading.
> 
> If we really want to get consistent zone ranges, we should start using
> zone's seqlock where it matters and that is pretty much all those
> places that use zone_spans_pfn().

Right, or even only zone_end_pfn() to get a consistent value.

> Otherwise there is no way you can be sure the pfn you're checking is
> within the limits. Moreover, as Michal pointed out early, if we really
> want to go down that road the locking should be made in the caller
> evolving the operation, otheriwse things might change once the lock
> is dropped and you're working with a wrong assumption.
> 
> I can see arguments for both riping it out and doing it right (but none for
> the way it is right now).
> For riping it out, one could say that those races might not be fatal,
> as usually the pfn you're working with (the one you want to check falls
> within a certain range) you know is valid, so the worst can happen is
> you get false positives/negatives and that might or might not be detected
> further down. How bad are false positive/negatives I guess it depends on the
> situation, but we already do that right now.
> The zone_spans_pfn() from page_outside_zone_boundaries() is the only one using
> locking right now, so well, if we survided this long without locks in other places
> using zone_spans_pfn() makes one wonder if it is that bad.
> 
> On the other hand, one could argue that for correctness sake, we should be holding
> zone's seqlock whenever checking for zone_spans_pfn() to avoid any inconsistency.
> 
> 

IMHO, as we know the race exists and we have a tool to handle it in 
place, we should maybe fix the obvious cases if possible.

Code that uses zone->zone_start_pfn directly is unlikely to be broken on 
most architectures. We will usually read/write via single instruction 
and won't get inconsistencies, for example, when shrinking or growing 
the zone. We most probably don't want to use an atomic for that right now.

Code that uses zone->spanned_pages to detect the zone end, however, is 
more likely to be broken. I don't think we have any relevant around 
anymore. Everything was converted to zone_end_pfn().

I feel like we should just make zone_end_pfn() take the seqlock in read. 
Then, we at least get a consistent value, for example, while growing a zone.

Just imagine the following case when we grow a section to the front when 
onlining memory:

	zone->zone_start_pfn -= new_pages;
	zone->spanned_pages += new_pages;

Note that compilers/CPUs might reshuffle as they like. If someone (e.g., 
zone_spans_pfn()) races with that code, it might get new 
zone->zone_start_pfn but old zone->spanned_pages. zone_end_pfn() will 
report a "too small zone" and trigger false negatives in zone_spans_pfn().

-- 
Thanks,

David / dhildenb


  parent reply	other threads:[~2021-06-08 15:00 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-02  9:14 [PATCH v2 0/3] Memory hotplug locking cleanup Oscar Salvador
2021-06-02  9:14 ` [PATCH v2 1/3] mm,page_alloc: Use {get,put}_online_mems() to get stable zone's values Oscar Salvador
2021-06-02 18:37   ` David Hildenbrand
2021-06-02 19:45     ` Oscar Salvador
2021-06-03  8:38       ` Oscar Salvador
2021-06-03 12:45         ` Michal Hocko
2021-06-04  7:41           ` Oscar Salvador
2021-06-07  7:52             ` Oscar Salvador
2021-06-07  8:49               ` David Hildenbrand
2021-06-07 10:23                 ` Oscar Salvador
2021-06-08 10:42                   ` Oscar Salvador
2021-06-08 15:00                   ` David Hildenbrand [this message]
2021-06-09  9:42                     ` David Hildenbrand
2021-06-07  8:42             ` Michal Hocko
2021-06-03  2:32   ` [mm,page_alloc] [confidence: ] acb5758bf4: BUG:sleeping_function_called_from_invalid_context_at_include/linux/percpu-rwsem.h kernel test robot
2021-06-03  2:32     ` [mm, page_alloc] " kernel test robot
2021-06-02  9:14 ` [PATCH v2 2/3] mm,memory_hotplug: Drop unneeded locking Oscar Salvador
2021-06-03 12:52   ` Michal Hocko
2021-06-02  9:14 ` [PATCH v2 3/3] mm,memory_hotplug: Remove unneeded declarations Oscar Salvador
2021-06-02 18:38   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0eadea9c-5af0-d7e6-071e-898b04294dd3@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.