From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752159AbdASFB6 (ORCPT <rfc822;w@1wt.eu>);
        Thu, 19 Jan 2017 00:01:58 -0500
Received: from mga02.intel.com ([134.134.136.20]:59201 "EHLO mga02.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751123AbdASFB4 (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 19 Jan 2017 00:01:56 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.33,252,1477983600"; 
   d="scan'208";a="32523221"
Date: Thu, 19 Jan 2017 10:31:50 +0530
From: Vinod Koul <vinod.koul@intel.com>
To: Andy Gross <andy.gross@linaro.org>
Cc: Abhishek Sahu <absahu@codeaurora.org>, dan.j.williams@intel.com,
        stanimir.varbanov@linaro.org, mcgrof@suse.com, okaya@codeaurora.org,
        pramod.gurav@linaro.org, arnd@arndb.de, linux-kernel@vger.kernel.org,
        dmaengine@vger.kernel.org, linux-arm-msm@vger.kernel.org
Subject: Re: [PATCH 2/5] dmaengine: Add support for custom data mapping
Message-ID: <20170119050150.GI3573@localhost>
References: <20161218162602.GR25795@localhost>
 <20161219050642.GA3047@hector.attlocal.net>
 <20161219154923.GT25795@localhost>
 <20161219175210.GB3047@hector.attlocal.net>
 <c792bb9abd55ff55fde92b38b43a69d6@codeaurora.org>
 <20161220202511.GD3047@hector.attlocal.net>
 <61cb961a6f448cfc48b983329e329b34@codeaurora.org>
 <20161229175450.GB17770@hector.attlocal.net>
 <8055fa20b35139e7d13831583ebf4f4f@codeaurora.org>
 <20170102161233.GC17770@hector.attlocal.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170102161233.GC17770@hector.attlocal.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 02, 2017 at 10:12:33AM -0600, Andy Gross wrote:
> On Mon, Jan 02, 2017 at 07:55:37PM +0530, Abhishek Sahu wrote:
> > >>>>>>> So a couple of thoughts on how to deal with this:
> > >>>>>>>
> > >>>>>>> 1) Define a virtual channel for the command descriptors vs a normal DMA
> > >>>>>>> transaction.  This lets you use the same hardware channel, but lets you discern
> > >>>>>>> which descriptor format you need to use.  The only issue I see with this is the
> > >>>>>>> required change in device tree binding to target the right type of channel (cmd
> > >>>>>>> vs normal).
> > >>>>>>
> > >>>>>>Or mark the descriptor is cmd and write accordingly...
> > >>>>>
> > >>>>>The only issue i see with that is knowing how much to pre-allocate during
> > >>>>>the
> > >>>>>prep call.  The flag set API would be called on the allocated tx
> > >>>>>descriptor.  So
> > >>>>>you'd have to know up front and be able to specify it.
> > >>>>>
> > >>>>>>
> > >>>>>>>
> > >>>>>>> 2) Provide an API to set flags for the descriptor on a whole descriptor basis.
> > >>>>>>>
> > >>>>>>> 3) If you have a set of transactions described by an sgl that has disparate use
> > >>>>>>> of flags, you split the list and use a separate transaction.  In other words, we
> > >>>>>>> need to enforce that the flag set API will be applied to all descriptors
> > >>>>>>> described by an sgl.  This means that the whole transaction may be comprised of
> > >>>>>>> multiple async TX descriptors.
> > >>>>
> > >>>>Each async TX descriptor will generate the BAM interrupt so we are
> > >>>>deviating
> > >>>>from main purpose of DMA where ideally we should get the interrupt at
> > >>>>the
> > >>>>end
> > >>>>of transfer. This is the main reason for going for this patch.
> > >>>
> > >>>If the client queues 1 descriptor or 5 descriptors, it doesn't matter that
> > >>>much.
> > >>>The client knows when it is done by waiting for the descriptors to
> > >>>complete.  It
> > >>>is less efficient than grouping them all, but it should still work.
> > >>>
> > >>Yes. client will wait for final descriptor completion. But these
> > >>interrupts
> > >>will be overhead for CPU. For 1-2 page it won't matter much I guess it
> > >>will
> > >>be
> > >>significant for complete chip read/write(during boot and FS i.e JFFS
> > >>operations).
> > >>>>
> > >>>>With the submitted patch, only 1 interrupt per channel is required for
> > >>>>complete NAND page and it solves the setting of BAM specific flags also.
> > >>>>Only issue with this patch is adding new API in DMA framework itself.
> > >>>>But
> > >>>>this API can be used by other DMA engines in future for which mapping
> > >>>>cannot
> > >>>>be done with available APIs and if this mapping is vendor specific.
> > >>>
> > >>>I guess the key point in all of this is that the DMA operation being done
> > >>>is not
> > >>>a normal data flow to/from the device.  It's direct remote register access
> > >>>to
> > >>>the device using address information contained in the sgl.  And you are
> > >>>collating the standard data access along with the special command access
> > >>>in the
> > >>>same API call.
> > >>Yes. Normally DMA engine (QCA ADM DMA engine also) allows to read/write
> > >>memory mapped
> > >>registers just like data. But BAM is different (Since it is not a global
> > >>DMA
> > >>Engine
> > >>and coupled with peripheral). Also, this different flag requirement is
> > >>not
> > >>just
> > >>for command descriptors but for data descriptors also.
> > >>
> > >>BAM data access and command access differs only with flag and register
> > >>read/write
> > >>list. The register read and write list will be simply array of
> > >>struct bam_cmd_element added in patch
> > >>struct bam_cmd_element {
> > >>        __le32 addr:24;
> > >>        __le32 command:8;
> > >>        __le32 data;
> > >>        __le32 mask;
> > >>        __le32 reserved;
> > >>};
> > >>
> > >>The address and size of the array will be passed in data and size field
> > >>of
> > >>SGL.
> > >>If we want to form the SGL for mentioned list then we will have SGL of
> > >>size
> > >>15
> > >>with just one descriptor.
> > >>
> > >>Now we require different flag for each SG entry. currently SG does not
> > >>have
> > >>flag parameter and we can't add flag parameter just for our requirement
> > >>in
> > >>generic SG. So we have added the custom mapping function and passed
> > >>modified
> > >>SG
> > >>as parameter which is generic SG and flag.
> > >
> > >I really think that we need some additional API that allows for the flag
> > >munging
> > >for the descriptors instead of overriding the prep_slave_sg.  We already
> > >needed
> > >to change the way the flags are passed anyway.  And instead of building up
> > >a
> > >special sg list, the API should take a structure that has a 1:1 mapping of
> > >the
> > >flags to the descriptors.  And you would call this API on your descriptor
> > >before
> > >issuing it.

Munging wont be a good idea, but for some of the cases current flags can be
used, and if need be, we can add additional flags

> > >
> > >So build up the sglist.  Call the prep_slave_sg.  You get back a tx
> > >descriptor
> > >that underneath is a bam descriptor.  Then call the API giving the
> > >descriptor
> > >and the structure that defines the flags for the descriptors.  Then submit
> > >the
> > >descriptor.
> > >
> > >Something like:
> > >int qcom_bam_apply_descriptor_flags(struct dma_async_tx_descriptor *tx,
> > >				    u16 *flags)
> > >{
> > >	struct bam_async_desc async_desc = container_of(tx,
> > >							struct bam_async_desc,
> > >							vd.tx);
> > >	int i;
> > >
> > >	for (i = 0; i < async_desc->num_desc; i++)
> > >		async_desc->desc[i].flags = cpu_to_le16(flags[i]);
> > >}
> > >
> > >EXPORT_SYMBOL(qcom_bam_apply_descriptor_flags)

This makes it bam specific and causes issues if we want to use this code in
generic libs, but yes this is an option but should be last resort.

-- 
~Vinod