Subject: Re: bug: move_pages(2) does not udpate "status" if no pages are moved
From: John Hubbard
To: Yang Shi
CC: Felix Abecassis, Linux MM, Andrew Morton
Date: Wed, 4 Dec 2019 15:45:37 -0800
Message-ID: <894a7d96-b715-bec5-2f72-1552891672ff@nvidia.com>
References: <1a8ccccb-e429-45d3-3615-b3b8bf04c6fe@nvidia.com>

On 12/4/19 12:04 PM, Yang Shi wrote:
> On Wed, Dec 4, 2019 at 11:21 AM John Hubbard wrote:
>>
>> On 12/4/19 11:01 AM, Felix Abecassis wrote:
>>> Hello all,
>>>
>>
>> Hi Felix,
>>
>> Thanks for writing up a very clear description of the problem.
>>
>>> On kernel 5.3, when using the move_pages syscall (wrapped by libnuma) and all
>>> pages happen to be on the right node already, this function returns 0 but the
>>> "status" array is not updated. This array potentially contains garbage values
>>> (e.g. from malloc(3)), and I don't see a way to detect this.
>>
>>
>> The way to detect this case would be to zero the array before calling move_pages().
>> Then, if move_pages() returns 0, and the array remains full of zeroes, you can
>> conclude that move_pages() "succeeded", and that there were no errors for any
>> of the pages. So the pages are where you requested them to end up.
>
> I don't think we can just simply return all zeros here. It looks the
> status should contain error code or the target node id if the page is
> moved to that node successfully. So, if the page is already on the
> requested node, the status should contain the current node id, but the
> current node maybe not 0.
>
> So, IMHO it sounds like a valid bug.
>

Yes, you're right, a more precise reading of the man page does support that:
if move_pages() returns 0, then the status array *must* contain valid node
IDs. I see. (Felix also mentioned the same thing, in a side note.)
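
Since zero is itself a valid node ID, a caller who wants to detect the "status was never written" case from user space could pre-fill the array with a value that is neither a node ID nor a negated errno. The following is only a minimal sketch of that idea, not anything provided by the kernel or libnuma: `STATUS_UNSET`, `struct status_summary`, `status_prefill()`, `status_classify()` and `move_pages_checked()` are hypothetical names invented for illustration, and the raw `syscall(SYS_move_pages, ...)` form is used here only to avoid assuming libnuma is installed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical sentinel: INT_MIN is neither a node ID (>= 0) nor a
 * negated errno value, so the kernel is not expected to write it. */
#define STATUS_UNSET INT_MIN

/* Hypothetical summary of a status array after the call. */
struct status_summary {
    size_t unset;   /* entries still holding the sentinel */
    size_t failed;  /* entries holding a negative errno */
    size_t on_node; /* entries holding a node ID (>= 0) */
};

/* Fill every status entry with the sentinel before calling move_pages. */
void status_prefill(int *status, size_t count)
{
    assert(status != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        status[i] = STATUS_UNSET;
}

/* Count how many entries were left untouched, failed, or report a node. */
struct status_summary status_classify(const int *status, size_t count)
{
    struct status_summary s = {0, 0, 0};
    for (size_t i = 0; i < count; i++) {
        if (status[i] == STATUS_UNSET)
            s.unset++;
        else if (status[i] < 0)
            s.failed++;
        else
            s.on_node++;
    }
    return s;
}

/* Pre-fill, invoke the raw move_pages syscall, then classify the result.
 * Returns the syscall's return value; *out describes the status array. */
long move_pages_checked(int pid, unsigned long count, void **pages,
                        const int *nodes, int *status, int flags,
                        struct status_summary *out)
{
    status_prefill(status, count);
    long rc = syscall(SYS_move_pages, pid, count, pages, nodes, status, flags);
    *out = status_classify(status, count);
    return rc;
}
```

With this sketch, a return value of 0 combined with a nonzero `unset` count would correspond to the situation Felix reported: the call claims success, but some status entries cannot be trusted. A caller would typically pass `MPOL_MF_MOVE` from `<numaif.h>` as `flags`.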

Looking some more at both the man page and Felix's report (and the kernel
implementation), it seems like there are maybe two bugs here:

1) Not setting the status array in some cases, if some pages were not moved
for "non fatal" reasons, and

2) Returning success if no pages were moved. The "ERRORS" section of the
man page seems to require that ENOENT be returned in that case. Although,
you could perhaps argue that this statement is only unidirectional. In other
words, maybe ENOENT happens, but it doesn't *have* to happen, if all pages
were already on the target node.

ERRORS
       ENOENT No pages were found that require moving.  All pages are
              either already on the target node, not present, had an
              invalid address or could not be moved because they were
              mapped by multiple processes.

thanks,
-- 
John Hubbard
NVIDIA