Date: Thu, 3 Dec 2020 12:59:37 -0400
From: Jason Gunthorpe
To: Pavel Tatashin
Cc: LKML, linux-mm, Andrew Morton, Vlastimil Babka, Michal Hocko, David Hildenbrand, Oscar Salvador, Dan Williams, Sasha Levin, Tyler Hicks, Joonsoo Kim, mike.kravetz@oracle.com, Steven Rostedt, Ingo Molnar, Peter Zijlstra, Mel Gorman, Matthew Wilcox, David Rientjes, John Hubbard
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
Message-ID: <20201203165937.GU5487@ziepe.ca>
References: <20201202052330.474592-1-pasha.tatashin@soleen.com> <20201202052330.474592-7-pasha.tatashin@soleen.com> <20201202163507.GL5487@ziepe.ca> <20201203010809.GQ5487@ziepe.ca> <20201203141729.GS5487@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 03, 2020 at 11:40:15AM -0500, Pavel Tatashin wrote:
> > Looking at this code some more.. How is it even correct?
> >
> >         if (!isolate_lru_page(head)) {
> >                 list_add_tail(&head->lru, &cma_page_list);
> >
> > Here we are only running under the read side of the mmap sem so multiple
> > GUPs can be calling that sequence in parallel. I don't see any
> > obvious exclusion that will prevent corruption of head->lru. The first
> > GUP thread to do isolate_lru_page() will ClearPageLRU() and the second
> > GUP thread will be a NOP for isolate_lru_page().
> >
> > They will both race list_add_tail and other list ops. That is not OK.
>
> Good question. I studied it, and I do not see how this is OK. Worse,
> this race is also exposable as a syscall instead of via driver: two
> move_pages() run simultaneously. Perhaps in other places?
>
> move_pages()
>   kernel_move_pages()
>     mmget()
>     do_pages_move()
>       add_page_for_migration()
>         mmap_read_lock(mm);
>         list_add_tail(&head->lru, pagelist); <- Not protected

When this was CMA only it might have been rarer to trigger, but this
move stuff sounds like it makes the race much more broadly reachable,
e.g. on typical servers with RDMA exposed, etc.

Seems like it needs fixing as part of this too :\

Doing it a page at a time inside the gup loop could address both
concerns, though I'm unsure about batching performance here..

Jason
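
To make the concern concrete, here is a minimal userspace sketch (not kernel code) of the kind of exclusion being asked about: a caller may only touch the shared list linkage after it has atomically won the "on LRU" flag. All names here (fake_page, try_isolate, isolate_and_add, count_isolation_winners) are hypothetical and only model the pattern. Whether isolate_lru_page() actually gives this guarantee at the two call sites above is exactly the open question. The callers run one after another for brevity. The atomic exchange is what would keep the result the same if they ran in parallel threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: an "on LRU" flag plus one link. */
struct fake_page {
	atomic_bool on_lru;        /* models the PageLRU bit */
	struct fake_page *next;    /* models the shared page->lru linkage */
};

/* Each caller owns a private list, like cma_page_list or pagelist. */
struct private_list {
	struct fake_page *head;
	size_t len;
};

/*
 * Atomically test and clear the flag. atomic_exchange() returns the old
 * value, so at most one caller can observe true, even across threads.
 */
static bool try_isolate(struct fake_page *page)
{
	return atomic_exchange(&page->on_lru, false);
}

/* Only the caller that won the flag may write page->next. */
static bool isolate_and_add(struct fake_page *page, struct private_list *list)
{
	if (!try_isolate(page))
		return false;      /* someone else owns the linkage */
	page->next = list->head;
	list->head = page;
	list->len++;
	return true;
}

/* Let ncallers (capped at 8) try the same page; return how many added it. */
static size_t count_isolation_winners(size_t ncallers)
{
	struct fake_page page = { .next = NULL };
	struct private_list lists[8] = { { NULL, 0 } };
	size_t winners = 0;

	atomic_init(&page.on_lru, true);
	if (ncallers > 8)
		ncallers = 8;
	for (size_t i = 0; i < ncallers; i++)
		if (isolate_and_add(&page, &lists[i]))
			winners++;

	/* The invariant the thread is worried about: never two owners. */
	assert(winners <= 1);
	return winners;
}
```

Calling count_isolation_winners(2) from a small main() models two GUP or move_pages() callers hitting the same head page. If the list_add_tail() were done without first winning the flag, both callers would write the same linkage, which is the corruption described above.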