From: Michal Hocko <mhocko@suse.com>
To: Feng Tang <feng.tang@intel.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Ben Widawsky <ben.widawsky@intel.com>,
	linux-kernel@vger.kernel.org,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	ying.huang@intel.com
Subject: Re: [v3 PATCH 2/3] mm/mempolicy: don't handle MPOL_LOCAL like a fake MPOL_PREFERRED policy
Date: Tue, 1 Jun 2021 10:44:39 +0200
Message-ID: <YLXzd95duZ3va7Te@dhcp22.suse.cz>
In-Reply-To: <1622469956-82897-3-git-send-email-feng.tang@intel.com>

On Mon 31-05-21 22:05:55, Feng Tang wrote:
> MPOL_LOCAL policy has been set up as a real policy, but it is still
> handled like a fake MPOL_PREFERRED policy with one internal
> MPOL_F_LOCAL flag bit set, and there are many places that have to
> distinguish the real 'prefer' policy from the 'local' one, which is
> quite confusing.
> 
> In current code, there are 4 cases in which MPOL_LOCAL is used:
> 1. user specifies 'local' policy
> 2. user specifies 'prefer' policy, but with empty nodemask
> 3. system 'default' policy is used
> 4. 'prefer' policy + valid 'preferred' node with MPOL_F_STATIC_NODES
>    flag set, and when it is 'rebound' to a nodemask which doesn't
>    contain the 'preferred' node, it will behave as 'local' policy
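
For reference, the four cases above can be written down as a small
userspace model. This is only a sketch: every name in it (sketch_mode,
sketch_request, sketch_acts_local) is hypothetical, nodemasks are reduced
to one bit per node in an unsigned long, and none of it is kernel code.
It only encodes the pre-patch behavior as the changelog describes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the policy modes; not the kernel's enum. */
enum sketch_mode { SKETCH_DEFAULT, SKETCH_PREFERRED, SKETCH_LOCAL, SKETCH_BIND };

struct sketch_request {
	enum sketch_mode mode;
	unsigned long nodemask;	/* one bit per node */
	bool static_nodes;	/* stands in for MPOL_F_STATIC_NODES */
};

/*
 * Returns true when the request ends up allocating node-locally in one
 * of the four listed cases. 'allowed' is the nodemask after a rebind.
 */
bool sketch_acts_local(const struct sketch_request *req, unsigned long allowed)
{
	assert(req != NULL);

	switch (req->mode) {
	case SKETCH_LOCAL:	/* case 1: user asked for 'local' */
	case SKETCH_DEFAULT:	/* case 3: system default policy */
		return true;
	case SKETCH_PREFERRED:
		if (req->nodemask == 0)	/* case 2: 'prefer' + empty mask */
			return true;
		/* case 4: static nodes, preferred node rebound away */
		return req->static_nodes && (req->nodemask & allowed) == 0;
	default:
		return false;
	}
}
```

Case 4 is the only one that depends on a later cpuset change, which is
what makes it the subtle one.
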
> 
> So make 'local' a real policy instead of a fake 'prefer' one, and
> kill MPOL_F_LOCAL bit, which can greatly reduce the confusion for
> code reading.
> 
> For case 4, the logic of mpol_rebind_preferred() is confusing, as
> Michal Hocko pointed out:
> 
>  "
>  I do believe that rebinding preferred policy is just bogus and
>  it should be dropped altogether on the ground that a preference
>  is a mere hint from userspace where to start the allocation.
>  Unless I am missing something cpusets will be always authoritative
>  for the final placement. The preferred node just acts as a starting
>  point and it should be really preserved when cpusets changes.
>  Otherwise we have very subtle behavior corner cases.
>  "
> So dump all the tricky transformation between 'prefer' and 'local',
> and just record the new nodemask of rebinding.
> 
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Feng Tang <feng.tang@intel.com>

I like this very much! It simplifies tricky code and removes a very
dubious behavior. I would still like to hear from others whether any
userspace might depend on this obscure behavior, though. One never
knows...

Some more notes/questions below:

[...]
> @@ -239,25 +240,19 @@ static int mpol_set_nodemask(struct mempolicy *pol,
>  		  cpuset_current_mems_allowed, node_states[N_MEMORY]);
>  
>  	VM_BUG_ON(!nodes);
> -	if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
> -		nodes = NULL;	/* explicit local allocation */
> -	else {
> -		if (pol->flags & MPOL_F_RELATIVE_NODES)
> -			mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
> -		else
> -			nodes_and(nsc->mask2, *nodes, nsc->mask1);
>  
> -		if (mpol_store_user_nodemask(pol))
> -			pol->w.user_nodemask = *nodes;
> -		else
> -			pol->w.cpuset_mems_allowed =
> -						cpuset_current_mems_allowed;
> -	}
> +	if (pol->flags & MPOL_F_RELATIVE_NODES)
> +		mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
> +	else
> +		nodes_and(nsc->mask2, *nodes, nsc->mask1);

Maybe I've just got lost here, but why don't you need to check for the
local policy anymore? mpol_new will take care of the MPOL_PREFERRED &&
nodes_empty special case, but why do we want/need all of this nodemask
handling for a local policy at all?
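
To make the question concrete, here is a minimal userspace model of the
kind of early return I would expect, assuming a local policy carries no
nodemask to contextualize. The names model_mode, model_policy and
model_set_nodemask are made up for this sketch, and nodemasks are plain
bitmasks; it is neither the patch nor the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified policy modes for illustration only. */
enum model_mode { MODEL_PREFERRED, MODEL_BIND, MODEL_LOCAL };

struct model_policy {
	enum model_mode mode;
	unsigned long nodes;	/* effective nodemask, one bit per node */
};

/*
 * Intersect the user-supplied nodes with the allowed ones, but skip
 * the whole thing for a local policy.
 */
int model_set_nodemask(struct model_policy *pol, unsigned long user_nodes,
		       unsigned long allowed)
{
	assert(pol != NULL);

	/* Assumption: nothing to remap or store for a local policy. */
	if (pol->mode == MODEL_LOCAL)
		return 0;

	pol->nodes = user_nodes & allowed;	/* plays the role of nodes_and() */
	return pol->nodes ? 0 : -EINVAL;
}
```

With a shape like this, the relative/static nodemask handling below the
check never has to think about the local case.
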

>  
> -	if (nodes)
> -		ret = mpol_ops[pol->mode].create(pol, &nsc->mask2);
> +	if (mpol_store_user_nodemask(pol))
> +		pol->w.user_nodemask = *nodes;
>  	else
> -		ret = mpol_ops[pol->mode].create(pol, NULL);
> +		pol->w.cpuset_mems_allowed =
> +					cpuset_current_mems_allowed;

Please use a single line here; the split assignment is just harder to
read. You will cross the line limit, but readability should be preferred
in this case.

[...]

I haven't spotted anything else.
-- 
Michal Hocko
SUSE Labs
