Date: Thu, 3 Dec 2020 12:59:37 -0400
From: Jason Gunthorpe
To: Pavel Tatashin
Cc: LKML, linux-mm, Andrew Morton, Vlastimil Babka, Michal Hocko,
    David Hildenbrand, Oscar Salvador, Dan Williams, Sasha Levin,
    Tyler Hicks, Joonsoo Kim, mike.kravetz@oracle.com, Steven Rostedt,
    Ingo Molnar, Peter Zijlstra, Mel Gorman, Matthew Wilcox,
    David Rientjes, John Hubbard
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
Message-ID: <20201203165937.GU5487@ziepe.ca>
References: <20201202052330.474592-1-pasha.tatashin@soleen.com>
    <20201202052330.474592-7-pasha.tatashin@soleen.com>
    <20201202163507.GL5487@ziepe.ca>
    <20201203010809.GQ5487@ziepe.ca>
    <20201203141729.GS5487@ziepe.ca>

On Thu, Dec 03, 2020 at 11:40:15AM -0500, Pavel Tatashin wrote:
> > Looking at this code some more.. How is it even correct?
> >
> >         if (!isolate_lru_page(head)) {
> >                 list_add_tail(&head->lru, &cma_page_list);
> >
> > Here we are only running under the read side of the mmap sem so multiple
> > GUPs can be calling that sequence in parallel. I don't see any
> > obvious exclusion that will prevent corruption of head->lru. The first
> > GUP thread to do isolate_lru_page() will ClearPageLRU() and the second
> > GUP thread will be a NOP for isolate_lru_page().
> >
> > They will both race list_add_tail and other list ops. That is not OK.
>
> Good question. I studied it, and I do not see how this is OK. Worse,
> this race is also exposable as a syscall instead of via driver: two
> move_pages() run simultaneously. Perhaps in other places?
>
> move_pages()
>   kernel_move_pages()
>     mmget()
>     do_pages_move()
>       add_page_for_migration()
>          mmap_read_lock(mm);
>          list_add_tail(&head->lru, pagelist); <- Not protected

When this was CMA only it might have been rarer to trigger, but this
move stuff sounds like it makes the race much more broadly reachable,
e.g. on typical servers with RDMA exposed, etc.

Seems like it needs fixing as part of this too :\

Doing this a page at a time inside the gup loop could address both
concerns, though I'm unsure about batching performance here.

Jason
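
To make the exclusion question above concrete, here is a minimal
userspace sketch, not kernel code. It assumes a model in which an atomic
test-and-clear of an "on LRU" flag acts as an ownership token, so that
only the caller that actually clears the flag may link the page into its
private list. Every name in it (model_page, model_isolate, caller,
model_collect, run_model) is hypothetical and made up for illustration.
Whether isolate_lru_page() gives callers this guarantee is exactly what
is being asked in this thread; the sketch only shows what the guarantee
would have to look like. The two callers run back to back to keep the
result deterministic; the atomic exchange is what would keep the model
safe if they ran in parallel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: one flag and one list link. */
struct model_page {
	atomic_bool on_lru;		/* models PageLRU */
	struct model_page *next;	/* models page->lru, reduced to one link */
};

/* Hypothetical stand-in for one GUP or move_pages() caller. */
struct caller {
	struct model_page *private_list;	/* models the local page list */
	int linked;				/* times this caller linked a page */
};

/*
 * Atomically clear on_lru and report whether this call cleared it.
 * For a page that starts on the LRU, exactly one caller sees true.
 */
static bool model_isolate(struct model_page *page)
{
	return atomic_exchange(&page->on_lru, false);
}

/* Link the page into the caller's private list only if isolation won. */
static void model_collect(struct caller *c, struct model_page *page)
{
	if (!model_isolate(page))
		return;		/* someone else owns page->next now */

	page->next = c->private_list;
	c->private_list = page;
	c->linked++;
}

/* Two callers target the same page; returns the total number of links. */
static int run_model(void)
{
	struct model_page page = { .next = NULL };
	struct caller a = { 0 }, b = { 0 };

	atomic_store(&page.on_lru, true);
	model_collect(&a, &page);
	model_collect(&b, &page);

	/* The loser must not have touched its list at all. */
	assert(b.private_list == NULL);
	return a.linked + b.linked;
}
```

Calling run_model() from a small main() should report a single link
operation. If the isolate step were a plain NOP for the second caller
instead of a failure, both callers would write page->next, which is the
corruption described above.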