From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752942Ab3LRAPr (ORCPT ); Tue, 17 Dec 2013 19:15:47 -0500 Received: from mga03.intel.com ([143.182.124.21]:52455 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751806Ab3LRAPn (ORCPT ); Tue, 17 Dec 2013 19:15:43 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.95,504,1384329600"; d="scan'208";a="445988921" Subject: [PATCH 1/3] net_dma: mark broken From: Dan Williams To: netdev@vger.kernel.org Cc: Alexander Duyck , Dave Jiang , Vinod Koul , David Whipple , stable@vger.kernel.org, linux-kernel@vger.kernel.org, dmaengine@vger.kernel.org Date: Tue, 17 Dec 2013 16:15:38 -0800 Message-ID: <20131218001538.7847.3289.stgit@viggo.jf.intel.com> In-Reply-To: <20131218001312.7847.9917.stgit@viggo.jf.intel.com> References: <20131218001312.7847.9917.stgit@viggo.jf.intel.com> User-Agent: StGit/0.17.1-3-g7cb9 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org net_dma can cause data to be copied to a stale mapping if a copy-on-write fault occurs during dma. The application sees missing data. The following trace is triggered by modifying the kernel to WARN if it ever triggers copy-on-write on a page that is undergoing dma: WARNING: CPU: 24 PID: 2529 at lib/dma-debug.c:485 debug_dma_assert_idle+0xd2/0x120() ioatdma 0000:00:04.0: DMA-API: cpu touching an active dma mapped page [pfn=0x16bcd9] Modules linked in: iTCO_wdt iTCO_vendor_support ioatdma lpc_ich pcspkr dca CPU: 24 PID: 2529 Comm: linbug Tainted: G W 3.13.0-rc1+ #353 00000000000001e5 ffff88016f45f688 ffffffff81751041 ffff88017ab0ef70 ffff88016f45f6d8 ffff88016f45f6c8 ffffffff8104ed9c ffffffff810f3646 ffff8801768f4840 0000000000000282 ffff88016f6cca10 00007fa2bb699349 Call Trace: [] dump_stack+0x46/0x58 [] warn_slowpath_common+0x8c/0xc0 [] ? ftrace_pid_func+0x26/0x30 [] warn_slowpath_fmt+0x46/0x50 [] debug_dma_assert_idle+0xd2/0x120 [] do_wp_page+0xd0/0x790 [] handle_mm_fault+0x51c/0xde0 [] ? copy_user_enhanced_fast_string+0x9/0x20 [] __do_page_fault+0x19c/0x530 [] ? _raw_spin_lock_bh+0x16/0x40 [] ? trace_clock_local+0x9/0x10 [] ? rb_reserve_next_event+0x64/0x310 [] ? ioat2_dma_prep_memcpy_lock+0x60/0x130 [ioatdma] [] do_page_fault+0xe/0x10 [] page_fault+0x22/0x30 [] ? __kfree_skb+0x51/0xd0 [] ? copy_user_enhanced_fast_string+0x9/0x20 [] ? memcpy_toiovec+0x52/0xa0 [] skb_copy_datagram_iovec+0x5f/0x2a0 [] tcp_rcv_established+0x674/0x7f0 [] tcp_v4_do_rcv+0x2e5/0x4a0 [..] ---[ end trace e30e3b01191b7617 ]--- Mapped at: [] debug_dma_map_page+0xb9/0x160 [] dma_async_memcpy_pg_to_pg+0x127/0x210 [] dma_memcpy_pg_to_iovec+0x119/0x1f0 [] dma_skb_copy_datagram_iovec+0x11c/0x2b0 [] tcp_rcv_established+0x74a/0x7f0: ...the problem is that the receive path falls back to cpu-copy in several locations and this trace is just one of the areas. A few options were considered to fix this: 1/ sync all dma whenever a cpu copy branch is taken 2/ modify the page fault handler to hold off while dma is in-flight Option 1 adds yet more cpu overhead to an "offload" that struggles to compete with cpu-copy. Option 2 adds checks for behavior that is already documented as broken when using get_user_pages(). At a minimum a debug mode is warranted to catch and flag these violations of the dma-api vs get_user_pages(). Thanks to David for his reproducer. Cc: Cc: Dave Jiang Cc: Vinod Koul Cc: Alexander Duyck Reported-by: David Whipple Signed-off-by: Dan Williams --- drivers/dma/Kconfig | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index 132a4fd375b2..c823daaf9043 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -355,6 +355,7 @@ config NET_DMA bool "Network: TCP receive copy offload" depends on DMA_ENGINE && NET default (INTEL_IOATDMA || FSL_DMA) + depends on BROKEN help This enables the use of DMA engines in the network stack to offload receive copy-to-user operations, freeing CPU cycles.