Date: Sun, 31 Jul 2016 11:44:38 +0200
From: Michal Hocko
Subject: Re: [PATCH 09/10] vhost, mm: make sure that oom_reaper doesn't reap memory read by vhost
Message-ID: <20160731094438.GA24353@dhcp22.suse.cz>
References: <1469734954-31247-1-git-send-email-mhocko@kernel.org>
 <1469734954-31247-10-git-send-email-mhocko@kernel.org>
 <20160728233359-mutt-send-email-mst@kernel.org>
 <20160729060422.GA5504@dhcp22.suse.cz>
 <20160729161039-mutt-send-email-mst@kernel.org>
 <20160729133529.GE8031@dhcp22.suse.cz>
 <20160729205620-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160729205620-mutt-send-email-mst@kernel.org>
Sender: owner-linux-mm@kvack.org
To: "Michael S. Tsirkin"
Cc: linux-mm@kvack.org, Andrew Morton, Tetsuo Handa, Oleg Nesterov,
 David Rientjes, Vladimir Davydov, "Paul E. McKenney"

On Fri 29-07-16 20:57:44, Michael S. Tsirkin wrote:
> On Fri, Jul 29, 2016 at 03:35:29PM +0200, Michal Hocko wrote:
> > On Fri 29-07-16 16:14:10, Michael S. Tsirkin wrote:
> > > On Fri, Jul 29, 2016 at 08:04:22AM +0200, Michal Hocko wrote:
> > > > On Thu 28-07-16 23:41:53, Michael S. Tsirkin wrote:
> > > > > On Thu, Jul 28, 2016 at 09:42:33PM +0200, Michal Hocko wrote:
> > [...]
> > > > > > and the reader would hit a page fault
> > > > > > + * if it stumbled over a reaped memory.
> > > > >
> > > > > This last point I don't get. flag read could bypass data read
> > > > > if that happens data read could happen after unmap
> > > > > yes it might get a PF but you handle that, correct?
> > > >
> > > > The point I've tried to make is that if the reader really page faults
> > > > then get_user will imply the full barrier already. If get_user didn't
> > > > page fault then the state of the flag is not really important because
> > > > the reaper shouldn't have touched it. Does it make more sense now or
> > > > I've missed your question?
> > >
> > > Can task flag read happen before the get_user pagefault?
> >
> > Do you mean?
> >
> > get_user_mm()
> >     temp = false <- test_bit(MMF_UNSTABLE, &mm->flags)
> >     ret = __get_user(x, ptr)
> >     #PF
> >     if (!ret && temp) # misses the flag
> >
> > The code is basically doing
> >
> >     if (!__get_user() && test_bit(MMF_UNSTABLE, &mm->flags))
> >
> > so test_bit part of the conditional cannot be evaluated before
> > __get_user() part is done. Compiler cannot reorder two depending
> > subconditions AFAIK.
>
> But maybe the CPU can.

Are you sure? How does that differ from
	if (ptr && ptr->something)
construct?

Let's CC Paul. Just to describe the situation. We have the following
situation:

#define __get_user_mm(mm, x, ptr)				\
({								\
	int ___gu_err = __get_user(x, ptr);			\
	if (!___gu_err && test_bit(MMF_UNSTABLE, &mm->flags))	\
		___gu_err = -EFAULT;				\
	___gu_err;						\
})

and the oom reaper doing:

	set_bit(MMF_UNSTABLE, &mm->flags);

	for (vma = mm->mmap ; vma; vma = vma->vm_next) {
		unmap_page_range

I assume that write memory barrier between set_bit and unmap_page_range
is not really needed because unmapping should already imply the memory
barrier.
A read memory barrier between __get_user and test_bit shouldn't be
really needed because we can tolerate a stale value if __get_user didn't
#PF because we haven't unmapped that address obviously. If we unmapped
it then __get_user would #PF and that should imply a full memory barrier
as well. Now the question is whether a CPU can speculate and read the
flag before we issue the #PF.
-- 
Michal Hocko
SUSE Labs
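
[Illustrative sketch, not part of the original message.] The invariant
under discussion is that a reader which observes reaped memory must also
observe MMF_UNSTABLE. A minimal userspace model of it in C11 might look
like the block below. Everything named model_* is hypothetical and is not
kernel code. The model assumes that a read of reaped memory succeeds and
returns zero-filled data, which is why the flag check is needed at all.
It uses sequentially consistent atomics throughout, so it deliberately
sidesteps the open question above (whether the ordering still holds
without explicit barriers) and only shows what the protocol is meant to
guarantee.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an mm: one flag and one word of "user memory". */
struct model_mm {
	atomic_bool unstable;	/* models MMF_UNSTABLE */
	atomic_int page;	/* models the user memory being read */
};

static void model_init(struct model_mm *mm, int value)
{
	atomic_init(&mm->unstable, false);
	atomic_init(&mm->page, value);
}

/*
 * Models __get_user(): assumed to succeed even after reaping, in which
 * case it returns zero-filled data.
 */
static int model_get_user(struct model_mm *mm, int *x)
{
	*x = atomic_load(&mm->page);
	return 0;
}

/* Models __get_user_mm(): data read first, flag check second. */
static int model_get_user_mm(struct model_mm *mm, int *x)
{
	int err = model_get_user(mm, x);

	/* seq_cst load: cannot be satisfied ahead of the data read above. */
	if (!err && atomic_load(&mm->unstable))
		err = -EFAULT;
	return err;
}

/* Models the reaper: publish the flag, then tear down the memory. */
static void model_reap(struct model_mm *mm)
{
	atomic_store(&mm->unstable, true);
	atomic_store(&mm->page, 0);	/* stands in for unmap_page_range */
}

/* Single-threaded walk through both outcomes. */
static void model_selftest(void)
{
	struct model_mm mm;
	int x;

	model_init(&mm, 42);
	assert(model_get_user_mm(&mm, &x) == 0 && x == 42);
	model_reap(&mm);
	assert(model_get_user_mm(&mm, &x) == -EFAULT);
}
```

With these orderings, a reader that sees the zeroed page has also seen
the flag store that preceded it. Whether the kernel code gets the same
guarantee from the barriers implied by the #PF and by unmapping is the
question being put to Paul.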