From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754120AbdKIKPA (ORCPT ); Thu, 9 Nov 2017 05:15:00 -0500
Received: from mail-ot0-f195.google.com ([74.125.82.195]:48687 "EHLO
	mail-ot0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751182AbdKIKO5 (ORCPT ); Thu, 9 Nov 2017 05:14:57 -0500
X-Google-Smtp-Source: AGs4zMal1VSG75ND02EJm1pfn5YgI6tYzTGEmQl6Ug4WKvoG3x9hpjbGHrTt8HtONyC0anV3U0UbFICs7c2UNkb3RHI=
MIME-Version: 1.0
In-Reply-To:
References: <763e6a252de1ad8ccd28344fd0676ae2f92f796d.1510118606.git.green.hu@gmail.com>
From: Arnd Bergmann
Date: Thu, 9 Nov 2017 11:14:56 +0100
X-Google-Sender-Auth: Nw_ScoV9uGnRGvYBuH4LUcDAsfU
Message-ID:
Subject: Re: [PATCH 13/31] nds32: DMA mapping API
To: Greentime Hu
Cc: Greentime, Linux Kernel Mailing List, linux-arch, Thomas Gleixner,
	Jason Cooper, Marc Zyngier, Rob Herring, Networking, Vincent Chen,
	deanbo422@gmail.com
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 9, 2017 at 8:12 AM, Greentime Hu wrote:
> 2017-11-08 17:09 GMT+08:00 Arnd Bergmann:
>> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu wrote:
>>
>> You do the same cache operations for _to_cpu and _to_device, which
>> usually works, but is more expensive than you need. It's better to
>> take the ownership into account and only do what you need.
>>
> Like this?
>
> static void
> nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
> 			      size_t size, enum dma_data_direction dir)
> {
> 	consistent_sync((void *)dma_to_virt(dev, handle), size,
> 			DMA_FROM_DEVICE);
> }
>
> static void
> nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
> 				 size_t size, enum dma_data_direction dir)
> {
> 	consistent_sync((void *)dma_to_virt(dev, handle), size,
> 			DMA_TO_DEVICE);
> }

No, it's more complicated than that.
You need to pass both the direction of the DMA transaction and the
ownership to consistent_sync(), and then do the correct cache maintenance
operation for each of the six combinations (three DMA directions times
two ownership transfers).

Which operation that is depends on the microarchitecture to some degree.
For example, on machines that can load arbitrary cache lines during
speculative execution, you have to invalidate the caches during both
_for_device/FROM_DEVICE and _for_cpu/FROM_DEVICE. Machines without
speculative execution can skip the second invalidation; they only need
to get rid of dirty cache lines before the DMA from device. Usually you
don't have to do a writeback during _for_cpu, since there are no dirty
cache lines after the _for_device operation.

It's not entirely clear what the correct behavior is for buffers that
are not cache line aligned. Some architectures use wbinval (writeback
plus invalidate) instead of inval for the _for_device/FROM_DEVICE
operation on any partial cache line, but you wouldn't want to do that
on the _for_cpu/FROM_DEVICE operation.

	Arnd
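To make the six combinations concrete, here is a rough, untested-on-hardware
sketch of how a consistent_sync() that takes both the direction and the
ownership might be laid out. Everything except enum dma_data_direction is an
assumption made for illustration: enum sync_owner, the cpu_speculates flag and
the cache_writeback()/cache_invalidate() helpers are hypothetical stand-ins
that only count calls, and the DMA_BIDIRECTIONAL rows are one plausible
choice rather than something stated above. Partial cache line handling is
left out because the right answer there is unclear.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same values as the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical: who owns the buffer once the sync has completed. */
enum sync_owner { FOR_DEVICE, FOR_CPU };

/* Hypothetical: true on cores that may speculatively fill cache lines. */
static bool cpu_speculates = true;

/* Hypothetical stand-ins for the real cache primitives; they only count. */
static int n_writeback, n_invalidate;

static void cache_writeback(void *start, size_t size)
{
	(void)start; (void)size;
	n_writeback++;
}

static void cache_invalidate(void *start, size_t size)
{
	(void)start; (void)size;
	n_invalidate++;
}

static void consistent_sync(void *vaddr, size_t size,
			    enum dma_data_direction dir,
			    enum sync_owner owner)
{
	if (owner == FOR_DEVICE) {
		switch (dir) {
		case DMA_TO_DEVICE:
			/* Device will read: push dirty lines to memory. */
			cache_writeback(vaddr, size);
			break;
		case DMA_FROM_DEVICE:
			/* Device will write: drop lines, dirty ones included. */
			cache_invalidate(vaddr, size);
			break;
		case DMA_BIDIRECTIONAL:
			/* Assumption: write back, then drop the lines. */
			cache_writeback(vaddr, size);
			cache_invalidate(vaddr, size);
			break;
		default:
			break;
		}
		return;
	}

	/* FOR_CPU: nothing is dirty after _for_device, so never write back. */
	switch (dir) {
	case DMA_FROM_DEVICE:
	case DMA_BIDIRECTIONAL:
		/* Only needed if lines may have been refilled speculatively. */
		if (cpu_speculates)
			cache_invalidate(vaddr, size);
		break;
	default:
		/* DMA_TO_DEVICE: the device did not modify memory. */
		break;
	}
}
```

The sync_single_for_cpu/for_device callbacks would then pass their own dir
argument through unchanged, together with FOR_CPU or FOR_DEVICE, instead of
hardcoding a direction.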