Date: Tue, 18 Jul 2017 11:38:16 -0400
From: Jerome Glisse
To: Bob Liu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Hubbard, David Nellans, Dan Williams, Balbir Singh, Michal Hocko
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
Message-ID: <20170718153816.GA3135@redhat.com>
References: <20170713211532.970-1-jglisse@redhat.com> <2d534afc-28c5-4c81-c452-7e4c013ab4d0@huawei.com>
In-Reply-To: <2d534afc-28c5-4c81-c452-7e4c013ab4d0@huawei.com>

On Tue, Jul 18, 2017 at 11:26:51AM +0800, Bob Liu wrote:
> On 2017/7/14 5:15, Jérôme Glisse wrote:
> > Sorry i made horrible mistake on names in v4, i completly miss-
> > understood the suggestion. So here i repost with proper naming.
> > This is the only change since v3. Again sorry about the noise
> > with v4.
> >
> > Changes since v4:
> >   - s/DEVICE_HOST/DEVICE_PUBLIC
> >
> > Git tree:
> > https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-cdm-v5
> >
> >
> > Cache coherent device memory apply to architecture with system bus
> > like CAPI or CCIX. Device connected to such system bus can expose
> > their memory to the system and allow cache coherent access to it
> > from the CPU.
> >
> > Even if for all intent and purposes device memory behave like regular
> > memory, we still want to manage it in isolation from regular memory.
> > Several reasons for that, first and foremost this memory is less
> > reliable than regular memory if the device hangs because of invalid
> > commands we can loose access to device memory. Second CPU access to
> > this memory is expected to be slower than to regular memory. Third
> > having random memory into device means that some of the bus bandwith
> > wouldn't be available to the device but would be use by CPU access.
> >
> > This is why we want to manage such memory in isolation from regular
> > memory. Kernel should not try to use this memory even as last resort
> > when running out of memory, at least for now.
> >
>
> I think set a very large node distance for "Cache Coherent Device Memory"
> may be a easier way to address these concerns.

Such an approach was discussed at length in the past, see the links below.
Outcome of the discussion:
  - CPU-less nodes are bad
  - device memory can be unreliable (device hang), with no way for
    an application to understand that
  - application and driver NUMA madvise/mbind/mempolicy ... can
    conflict with each other, and there is no way the kernel can
    figure out which should apply
  - NUMA as it is now would not work, as we need further isolation
    than what a large node distance would provide

Probably a few other arguments I forget.

https://lists.gt.net/linux/kernel/2551369
https://groups.google.com/forum/#!topic/linux.kernel/Za_e8C3XnRs%5B1-25%5D
https://lwn.net/Articles/720380/

Cheers,
Jérôme