From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754572AbbBLBIA (ORCPT ); Wed, 11 Feb 2015 20:08:00 -0500
Received: from bear.ext.ti.com ([192.94.94.41]:60919 "EHLO bear.ext.ti.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754494AbbBLBH6 (ORCPT ); Wed, 11 Feb 2015 20:07:58 -0500
Message-ID: <54DBFCD2.7070500@ti.com>
Date: Wed, 11 Feb 2015 19:07:30 -0600
From: Suman Anna 
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Tony Lindgren 
CC: Ohad Ben-Cohen , Kevin Hilman , Dave Gerlach , Robert Tivy ,
	"linux-kernel@vger.kernel.org" , "linux-omap@vger.kernel.org" ,
	linux-arm
Subject: Re: [PATCH v3 2/2] remoteproc: add support to handle internal memories
References: <1420838519-15669-1-git-send-email-s-anna@ti.com> <1420838519-15669-3-git-send-email-s-anna@ti.com> <20150211205757.GI2531@atomide.com> <54DBD7A3.9020604@ti.com> <20150211224841.GK2531@atomide.com> <54DBED69.6070206@ti.com> <20150212001857.GC25954@atomide.com>
In-Reply-To: <20150212001857.GC25954@atomide.com>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/11/2015 06:18 PM, Tony Lindgren wrote:
> * Suman Anna [150211 16:05]:
>> On 02/11/2015 04:48 PM, Tony Lindgren wrote:
>>> * Suman Anna [150211 14:32]:
>>>> On 02/11/2015 02:57 PM, Tony Lindgren wrote:
>>>>> * Ohad Ben-Cohen [150210 02:14]:
>>>>>> Hi Suman,
>>>>>>
>>>>>> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna wrote:
>>>>>>> A remote processor may need to load certain firmware sections into
>>>>>>> internal memories (eg: RAM at L1 or L2 levels) for performance or
>>>>>>> other reasons. Introduce a new resource type (RSC_INTMEM) and add
>>>>>>> an associated handler function to handle such memories. The handler
>>>>>>> creates a kernel mapping for the resource's 'pa' (physical address).
>>>>>> ...
>>>>>>> + * rproc_handle_intmem() - handle internal memory resource entry
>>>>>>> + * @rproc: rproc handle
>>>>>>> + * @rsc: the intmem resource entry
>>>>>>> + * @offset: offset of the resource data in resource table
>>>>>>> + * @avail: size of available data (for image validation)
>>>>>>> + *
>>>>>>> + * This function will handle firmware requests for mapping a memory region
>>>>>>> + * internal to a remote processor into kernel. It neither allocates any
>>>>>>> + * physical pages, nor performs any iommu mapping, as this resource entry
>>>>>>> + * is primarily used for representing physical internal memories. If the
>>>>>>> + * internal memory region can only be accessed through an iommu, please
>>>>>>> + * use a devmem resource entry.
>>>>>>> + *
>>>>>>> + * These resource entries should be grouped near the carveout entries in
>>>>>>> + * the firmware's resource table, as other firmware entries might request
>>>>>>> + * placing other data objects inside these memory regions (e.g. data/code
>>>>>>> + * segments, trace resource entries, ...).
>>>>>>> + */
>>>>>>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
>>>>>>> +                               int offset, int avail)
>>>>>>> +{
>>>>>> ...
>>>>>>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
>>>>>>
>>>>>> Back in the days when we developed remoteproc there was a tremendous
>>>>>> effort to move away from ioremap when not strictly needed.
>>>>>
>>>>> The use of ioremap in general is just fine for drivers as long
>>>>> as they access a dedicated area to the specific device. Accessing
>>>>> random registers and memory in the SoC is what I'm worried about.
>>>>>
>>>>>> I'd be happy if someone intimate with the related hardware could ack
>>>>>> that in this specific case ioremap is indeed needed. No need to review
>>>>>> the entire patch, or anything remoteproc, just make sure that
>>>>>> generally ioremap is how we want to access this internal memory.
>>>>>>
>>>>>> Tony or Kevin any chance you could take a look and ack?
>>>>>>
>>>>>> If ioremap is indeed the way to go, I'd also expect that we wouldn't
>>>>>> have to use __force here, but that's probably a minor patch cleanup.
>>>>>
>>>>> Hmm sounds like this memory should be dedicated to the accelerator?
>>>>>
>>>>> In that case it should use memblock to reserve that area early so
>>>>> the kernel won't be accessing it at all.
>>>>
>>>> The usage here is not really on regular memory, but on internal device
>>>> memory (eg: L2RAM within DSP which is accessible by MPU through L3 bus).
>>>> For the regular shared memory for vrings and vring buffers, the
>>>> remoteproc core does rely on CMA pools.
>>>
>>> OK sounds like Linux needs to access it initially to load the DSP boot
>>> code to L2RAM to get the DSP booted.
>>>
>>> Maybe it can be done with the API provided by drivers/misc/sram.c?
>>>
>>> You could set up the L2RAM as compatible = "mmio-sram" and then
>>> parse the optional phandle for that in the remoteproc code, then
>>> allocate some memory from it to load the DSP boot code and free
>>> it.
>>
>> Not quite the same usage, there are no implicit assumptions on managing
>> this memory. Isn't the SRAM driver better suited for allocating memory
>> using the gen_pool API? It is just regular code that is being placed
>> into RAM, and the linker file on the remoteproc side dictates which
>> portion we are using. So, the section can be anywhere based on the ELF
>> parsing. Further, the same RAM space can be partitioned into Cache
>> and/or RAM, which is usually controlled from internal processor
>> subsystem register programming.
>
> It still sounds like you need an API like gen_pool to allocate and
> load the DSP code though? So from that point of view it's best to
> use some Linux generic API.
>
> Just guessing, but the process here is probably something like
> request_firmware, configure hardware, allocate memory area,
> copy firmware to memory, unallocate memory, boot m3 :)

Yeah, at least for the processors with MMUs, it's usually allocate
memory, program the IOMMU, copy firmware, boot rproc. Memory is freed
when unloading the processor and loading a different firmware.

For the cases with internal memory, either I need an ioremap of the
region for copying the firmware sections or, as you said, allocate, copy
and unallocate. That almost always means I have to allocate the entire
region, since I usually need to copy the data to a specific location
based on the ELF program header data. The sram driver also does an
ioremap internally, so I guess it can be done, with probably a bit more
code for management within the rproc core.

regards
Suman