Subject: Re: [PATCH v2 00/10] Copy Offload in NVMe Fabrics with P2P PCI Memory
From: Benjamin Herrenschmidt
Reply-To: benh@au1.ibm.com
To: Dan Williams
Cc: Logan Gunthorpe, Linux Kernel Mailing List, linux-pci@vger.kernel.org,
    linux-nvme@lists.infradead.org, linux-rdma, linux-nvdimm,
    linux-block@vger.kernel.org, Stephen Bates, Christoph Hellwig,
    Jens Axboe, Keith Busch, Sagi Grimberg, Bjorn Helgaas,
    Jason Gunthorpe, Max Gurtovoy, Jérôme Glisse, Alex Williamson,
    Oliver OHalloran
Date: Fri, 02 Mar 2018 08:03:30 +1100
In-Reply-To:
References: <20180228234006.21093-1-logang@deltatee.com>
    <1519876489.4592.3.camel@kernel.crashing.org>
    <1519876569.4592.4.camel@au1.ibm.com>
Organization: IBM Australia
Message-Id: <1519938210.4592.30.camel@au1.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2018-03-01 at 11:21 -0800, Dan Williams wrote:
> The devm_memremap_pages() infrastructure allows placing the memmap in
> "System-RAM" even if the hotplugged range is in PCI space. So, even if
> it is an issue on some configurations, it's just a simple adjustment
> to where the memmap is placed.

Actually, can you explain a bit more here?

devm_memremap_pages() doesn't take any specific argument about what to
do with the memory. It does create the vmemmap sections etc., but does
so by calling arch_add_memory(). So __add_memory() isn't called, which
means the pages aren't added to the linear mapping. Then you manually
add them to ZONE_DEVICE. Am I correct?

In that case, they indeed can't be used as normal memory pages, which
is good, and if they aren't in the linear mapping, then there are no
caching issues either.

However, what happens if anything calls page_address() on them? Some
DMA ops do that, for example, or some devices might...

This is all quite convoluted, with no documentation I can find that
explains the various expectations.

So the question is: are those pages landing in the linear mapping, and
if yes, by what code path?

The next question: if we ever want this to work on ppc64, we need a way
to make it fit in our linear mapping and map it non-cacheable, which
will require some wrangling in how we handle that mapping.

Cheers,
Ben.
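
P.S. For anyone else trying to follow the code, below is roughly what I
understand the P2P side to be doing with devm_memremap_pages(). This is
only a sketch from my reading: the dev_pagemap field names have been
churning between releases, and the helper below is made up for
illustration, it is not code from Logan's series.

/*
 * Rough sketch only: hand a PCI BAR to devm_memremap_pages() so that
 * struct pages get created for it (via arch_add_memory()) and placed
 * in ZONE_DEVICE, without the range ever being onlined as normal RAM
 * through __add_memory(). dev_pagemap field names may differ on your
 * kernel version.
 */
#include <linux/err.h>
#include <linux/memremap.h>
#include <linux/pci.h>
#include <linux/slab.h>

static void *p2p_map_bar_pages(struct pci_dev *pdev, int bar,
			       struct percpu_ref *ref)
{
	struct dev_pagemap *pgmap;

	pgmap = devm_kzalloc(&pdev->dev, sizeof(*pgmap), GFP_KERNEL);
	if (!pgmap)
		return ERR_PTR(-ENOMEM);

	/* The BAR itself is the "hotplugged" range. */
	pgmap->res.start = pci_resource_start(pdev, bar);
	pgmap->res.end = pci_resource_end(pdev, bar);
	pgmap->res.flags = pci_resource_flags(pdev, bar);
	pgmap->ref = ref;

	/*
	 * Builds the vmemmap for the range and puts the new pages in
	 * ZONE_DEVICE; the BAR never shows up as System RAM. Whether
	 * page_address() on these pages is meaningful is exactly the
	 * open question above.
	 */
	return devm_memremap_pages(&pdev->dev, pgmap);
}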