Message-ID: <8c4df8e4-ef99-c3fd-dcca-759e92739d4c@redhat.com>
Date: Wed, 12 Jan 2022 12:16:13 +0100
Subject: Re: [PATCH v3 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping
To: Alex Sierra, akpm@linux-foundation.org, Felix.Kuehling@amd.com, linux-mm@kvack.org, rcampbell@nvidia.com, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, hch@lst.de, jgg@nvidia.com, jglisse@redhat.com, apopple@nvidia.com, willy@infradead.org
References: <20220110223201.31024-1-alex.sierra@amd.com>
From: David Hildenbrand
Organization: Red Hat
In-Reply-To: <20220110223201.31024-1-alex.sierra@amd.com>

On 10.01.22 23:31, Alex Sierra wrote:
> This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
> owned by a device that can be mapped into CPU page tables like
> MEMORY_DEVICE_GENERIC and can also be migrated like
> MEMORY_DEVICE_PRIVATE.
>
> Christoph, the suggestion to incorporate Ralph Campbell’s refcount
> cleanup patch into our hardware page migration patchset originally came
> from you, but it proved impractical to do things in that order because
> the refcount cleanup introduced a bug with wide ranging structural
> implications. Instead, we amended Ralph’s patch so that it could be
> applied after merging the migration work. As we saw from the recent
> discussion, merging the refcount work is going to take some time and
> cooperation between multiple development groups, while the migration
> work is ready now and is needed now. So we propose to merge this
> patchset first and continue to work with Ralph and others to merge the
> refcount cleanup separately, when it is ready.
>
> This patch series is mostly self-contained except for a few places where
> it needs to update other subsystems to handle the new memory type.
> System stability and performance are not affected according to our
> ongoing testing, including xfstests.
>
> How it works: The system BIOS advertises the GPU device memory
> (aka VRAM) as SPM (special purpose memory) in the UEFI system address
> map.
>
> The amdgpu driver registers the memory with devmap as
> MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
> this hardware page migration capability is the Frontier supercomputer
> project. This functionality is not AMD-specific. We expect other GPU
> vendors to find this functionality useful, and possibly other hardware
> types in the future.
>
> Our test nodes in the lab are similar to the Frontier configuration,
> with .5 TB of system memory plus 256 GB of device memory split across
> 4 GPUs, all in a single coherent address space. Page migration is
> expected to improve application efficiency significantly. We will
> report empirical results as they become available.

Hi,

might be a dumb question because I'm not too familiar with
MEMORY_DEVICE_COHERENT, but who's in charge of migrating *to* that
memory? Or how does a process ever get hold of such pages?

And where does migration come into play? I assume migration is only
required to migrate off of that device memory to ordinary system RAM
when the device memory has to be freed up, correct?

(a high-level description of how this is exploited from user space
would be great)

-- 
Thanks,

David / dhildenb