linux-kernel.vger.kernel.org archive mirror
From: Yang Shi <yang.shi@linux.alibaba.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>,
	adobriyan@gmail.com, willy@infradead.org, mguzik@redhat.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [v3 PATCH] mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct
Date: Thu, 12 Apr 2018 09:20:24 -0700
Message-ID: <49c17035-1b8c-5fa3-9944-33467589d1f1@linux.alibaba.com>
In-Reply-To: <20180412121801.GE23400@dhcp22.suse.cz>



On 4/12/18 5:18 AM, Michal Hocko wrote:
> On Tue 10-04-18 11:28:13, Yang Shi wrote:
>>
>> On 4/10/18 9:21 AM, Yang Shi wrote:
>>>
>>> On 4/10/18 5:28 AM, Cyrill Gorcunov wrote:
>>>> On Tue, Apr 10, 2018 at 01:10:01PM +0200, Michal Hocko wrote:
>>>>>> Because do_brk does vma manipulations, for this reason it's
>>>>>> running under down_write_killable(&mm->mmap_sem). Or you
>>>>>> mean something else?
>>>>> Yes, all we need the new lock for is to get a consistent view on brk
>>>>> values. I am simply asking whether there is something fundamentally
>>>>> wrong by doing the update inside the new lock while keeping the
>>>>> original mmap_sem locking in the brk path. That would allow us to
>>>>> drop the mmap_sem lock in the proc path when looking at brk values.
>>>> Michal, gimme some time. I guess we might do so, but I need some
>>>> spare time to take a more precise look at the code, hopefully this
>>>> evening. Also, I have a suspicion that we've wrecked check_data_rlimit
>>>> with this new lock in prctl. I need to verify it again.
>>> I see your points. We might be able to move the drop of mmap_sem
>>> before setting mm->brk in sys_brk, since mmap_sem should be used to
>>> protect vma manipulation only, and then protect the value update with
>>> the new arg_lock. Then we can eliminate the mmap_sem usage in the prctl
>>> path, and it also avoids wrecking check_data_rlimit.
>>>
>>> At first glance, it looks feasible to me. I will look into it more
>>> deeply later.
>> A further look told me this might *not* be feasible.
>>
>> It looks like the new lock will not break check_data_rlimit, since in my
>> patch both start_brk and brk are protected by mmap_sem. The code flow
>> might look like below:
>>
>> CPU A                       CPU B
>> --------                    --------
>> prctl                       sys_brk
>>                             down_write
>> check_data_rlimit           check_data_rlimit (need mm->start_brk)
>>                             set brk
>> down_write                  up_write
>> set start_brk
>> set brk
>> up_write
>>
>>
>> If CPU A gets the mmap_sem first, it will set start_brk and brk, then CPU B
>> will check with the new start_brk. And prctl doesn't care if sys_brk runs
>> before it, since it gets the new start_brk and brk from its parameters.
>>
>> If we protect start_brk and brk with the new lock, sys_brk might get the old
>> start_brk, then sys_brk might break the rlimit check silently, is that right?
>>
>> So, it looks like using the new lock in prctl while keeping mmap_sem in the
>> brk path has a race condition.
> OK, I admittedly didn't give it too much thought. Maybe we can do
> something clever to remove the race, but can we start at least by
> reducing the write lock to a read lock on the prctl side and using the
> dedicated spinlock for updating values? That should close the above race
> AFAICS and the read lock would be much more friendly to other VM operations.

Yes, it sounds feasible. We just need to care about the case where prctl
runs before sys_brk. So, you mean:

down_read
spin_lock
update all the values
spin_unlock
up_read


>

Thread overview: 16+ messages
2018-04-09 21:52 [v3 PATCH] mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct Yang Shi
2018-04-10  8:48 ` Cyrill Gorcunov
2018-04-10  9:09 ` Michal Hocko
2018-04-10  9:40   ` Cyrill Gorcunov
2018-04-10 10:42     ` Michal Hocko
2018-04-10 11:02       ` Cyrill Gorcunov
2018-04-10 11:10         ` Michal Hocko
2018-04-10 12:28           ` Cyrill Gorcunov
2018-04-10 16:21             ` Yang Shi
2018-04-10 18:28               ` Yang Shi
2018-04-10 19:17                 ` Cyrill Gorcunov
2018-04-10 19:33                   ` Yang Shi
2018-04-10 20:06                     ` Cyrill Gorcunov
2018-04-12 12:18                 ` Michal Hocko
2018-04-12 16:20                   ` Yang Shi [this message]
2018-04-13  6:56                     ` Michal Hocko
