Subject: Re: Device address specific mapping of arm,mmu-500
From: Ray Jui <ray.jui@broadcom.com>
To: Marc Zyngier, Will Deacon
Cc: Robin Murphy, Mark Rutland, Joerg Roedel,
 linux-arm-kernel@lists.infradead.org, iommu@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org
Date: Tue, 30 May 2017 15:06:24 -0700
In-Reply-To: <16e5fc9d-b014-af7c-dcda-527522ac5cc9@arm.com>
References: <1b79efe2-6835-7a7a-f5ad-361391a7b967@broadcom.com>
 <20170530151437.GC23067@arm.com>
 <81637642-22d9-4868-156f-052f64bd042f@broadcom.com>
 <226bcebc-3902-90d3-24e5-51f2e1f3affb@arm.com>
 <16e5fc9d-b014-af7c-dcda-527522ac5cc9@arm.com>

Hi Marc/Robin/Will,

On 5/30/17 10:27 AM, Marc Zyngier wrote:
> On 30/05/17 18:16, Ray Jui wrote:
>> Hi Marc,
>>
>> On 5/30/17 9:59 AM, Marc Zyngier wrote:
>>> On 30/05/17 17:49, Ray Jui wrote:
>>>> Hi Will,
>>>>
>>>> On 5/30/17 8:14 AM, Will Deacon wrote:
>>>>> On Mon, May 29, 2017 at 06:18:45PM -0700, Ray Jui wrote:
>>>>>> I'm writing to check whether the latest arm-smmu.c driver in
>>>>>> v4.12-rc Linux can, for the MMU-500, support a mapping that is
>>>>>> specific to a particular physical address range while leaving the
>>>>>> rest to be handled by the client device. I believe this can
>>>>>> already be described by the device tree binding of the generic
>>>>>> IOMMU framework; however, it is not clear to me whether the
>>>>>> arm-smmu.c driver supports it.
>>>>>>
>>>>>> To give you some background information:
>>>>>>
>>>>>> We have a SoC whose PCIe root complex has a built-in logic block
>>>>>> that forwards MSI writes to the ARM GICv3 ITS. Unfortunately,
>>>>>> this logic block has a HW bug that causes MSI writes to be parsed
>>>>>> incorrectly and can potentially corrupt data in the internal
>>>>>> FIFO. A workaround is to have the ARM MMU-500 take care of all
>>>>>> inbound transactions. I found that this works after hooking our
>>>>>> PCIe root complex up to the MMU-500; however, even with the
>>>>>> optimized arm-smmu driver in v4.12, I'm still seeing a
>>>>>> significant Ethernet throughput drop in both the TX and RX
>>>>>> directions. The drop is severe at around 50% (though that is
>>>>>> already much improved over prior kernel versions, where it was
>>>>>> 70~90%).
>>>>>
>>>>> Did Robin's experiments help at all with this?
>>>>>
>>>>> http://www.linux-arm.org/git?p=linux-rm.git;a=shortlog;h=refs/heads/iommu/perf
>>>>>
>>>>
>>>> It looks like these are new optimizations that have not yet been
>>>> merged in v4.12? I'm going to give them a try.
>>>>
>>>>>> One alternative is to use the MMU-500 only for MSI writes towards
>>>>>> the GITS_TRANSLATER register in the GICv3, i.e., if I can define
>>>>>> a specific physical address region that I want the MMU-500 to act
>>>>>> on and leave the rest of the inbound transactions to be handled
>>>>>> directly by our PCIe controller, it could potentially work around
>>>>>> our HW bug and at the same time achieve optimal throughput.
>>>>>
>>>>> I don't think you can bypass the SMMU for MSIs unless you give
>>>>> them their own StreamIDs, which is likely to break things horribly
>>>>> in the kernel. You could try to create an identity mapping, but
>>>>> you'll still have the translation overhead and you'd probably end
>>>>> up having to supply your own DMA ops to manage the address space.
>>>>> I'm assuming that you need to prevent the physical address of the
>>>>> ITS from being allocated as an IOVA?
>>>>
>>>> Will, is it a HW limitation that the SMMU cannot be used for MSI
>>>> writes only? In our case the physical address range is very
>>>> specific in our ASIC and falls in the device memory region (e.g.,
>>>> below 0x80000000).
>>>>
>>>> In fact, what I need here is a static IOMMU mapping of the physical
>>>> address of the GITS_TRANSLATER register of the GICv3 ITS, which is
>>>> the address that MSI writes go to. This would bypass the MSI
>>>> forwarding logic in our PCIe controller. At the same time, I could
>>>> leave the rest of the inbound transactions to be handled by our
>>>> PCIe controller without going through the MMU.
>>>
>>> How is that going to work for DMA? I imagine your network interfaces
>>> do have to access memory, don't they? How can the transactions be
>>> terminated in the PCIe controller?
>>
>> Sorry, I may not have phrased that properly. These inbound
>> transactions (DMA writes to DDR, from the endpoint) do not terminate
>> in the PCIe controller. They are taken by the PCIe controller as PCIe
>> transactions and are carried towards the designated memory on the
>> host.
>
> So what is the StreamID used for these transactions? Is that a
> different StreamID from that of the DMAing device? If you want to
> avoid the SMMU having any effect on a transaction, you must make sure
> it doesn't match anything there.
>
> Thanks,
>
> M.
>

Thanks for the reply. I'm checking with our ASIC team, but from my
understanding, the stream ID in our ASIC is constructed from some
custom fields that a developer can program plus some standard PCIe BDF
fields. That is, I don't think we can make the stream IDs of MSI writes
and DMA writes from the same PF differ, as you have already predicted.

It sounds like I do not have many options here...
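For illustration, here is a rough sketch of the kind of StreamID
composition I mean (the field layout and shift are hypothetical, not
our actual RTL). The point is that the StreamID is a pure function of
the PCIe requester ID plus statically programmed custom bits, so MSI
writes and DMA writes from the same function always carry the same
StreamID and the SMMU cannot match one without matching the other:

#include <stdint.h>

#define CUSTOM_FIELD_SHIFT	16	/* assumed position of the custom bits */

static uint32_t make_stream_id(uint8_t bus, uint8_t devfn, uint16_t custom)
{
	/* Standard PCIe requester ID: bus[15:8], device[7:3], function[2:0] */
	uint16_t rid = ((uint16_t)bus << 8) | devfn;

	/* The target address of the write never enters the calculation */
	return ((uint32_t)custom << CUSTOM_FIELD_SHIFT) | rid;
}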
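For completeness, the identity-mapping fallback Will mentioned could
look roughly like the untested sketch below, using the kernel IOMMU
API. The doorbell address is a placeholder, and, as Will pointed out,
attaching the device to an unmanaged domain means supplying our own DMA
ops and keeping the doorbell PA out of the IOVA allocator:

#include <linux/iommu.h>
#include <linux/device.h>
#include <linux/mm.h>

/* Placeholder address: where GITS_TRANSLATER lives on our SoC */
#define ITS_DOORBELL_PA		0x63c30000UL

static int its_doorbell_identity_map(struct device *dev)
{
	struct iommu_domain *domain;
	int ret;

	domain = iommu_domain_alloc(dev->bus);
	if (!domain)
		return -ENOMEM;

	/* 1:1 (IOVA == PA) mapping so MSI writes reach the real doorbell */
	ret = iommu_map(domain, ITS_DOORBELL_PA, ITS_DOORBELL_PA, PAGE_SIZE,
			IOMMU_WRITE | IOMMU_MMIO);
	if (ret)
		goto out_free;

	/* All of the device's transactions now go through this domain */
	ret = iommu_attach_device(domain, dev);
	if (ret)
		goto out_free;

	return 0;

out_free:
	iommu_domain_free(domain);
	return ret;
}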
Thanks,

Ray