From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A325C54E67 for ; Wed, 20 Mar 2024 04:11:48 +0000 (UTC) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5EACB40298; Wed, 20 Mar 2024 05:11:47 +0100 (CET) Received: from szxga05-in.huawei.com (szxga05-in.huawei.com [45.249.212.191]) by mails.dpdk.org (Postfix) with ESMTP id 95C8D40271; Wed, 20 Mar 2024 05:11:45 +0100 (CET) Received: from mail.maildlp.com (unknown [172.19.163.44]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4Tzw8608Rzz1h372; Wed, 20 Mar 2024 12:09:10 +0800 (CST) Received: from dggpeml500024.china.huawei.com (unknown [7.185.36.10]) by mail.maildlp.com (Postfix) with ESMTPS id 067E31405A2; Wed, 20 Mar 2024 12:11:43 +0800 (CST) Received: from [10.67.121.161] (10.67.121.161) by dggpeml500024.china.huawei.com (7.185.36.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Wed, 20 Mar 2024 12:11:42 +0800 Subject: Re: [PATCH v2] dmadev: fix structure alignment To: "Ma, WenwuX" , "dev@dpdk.org" CC: "Jiale, SongX" , "stable@dpdk.org" , Pavan Nikhilesh , Thomas Monjalon References: <20240308053711.1260154-1-wenwux.ma@intel.com> <20240315014331.1376720-1-wenwux.ma@intel.com> <096d2161-026c-f89e-c44b-340a3148bccc@huawei.com> From: fengchengwen Message-ID: Date: Wed, 20 Mar 2024 12:11:42 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.121.161] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggpeml500024.china.huawei.com (7.185.36.10) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Wenwu, On 2024/3/15 17:27, Ma, WenwuX wrote: > Hi Chengwen > >> -----Original Message----- >> From: fengchengwen >> Sent: Friday, March 15, 2024 4:32 PM >> To: Ma, WenwuX ; dev@dpdk.org >> Cc: Jiale, SongX ; stable@dpdk.org >> Subject: Re: [PATCH v2] dmadev: fix structure alignment >> >> Hi Wenwu, >> >> On 2024/3/15 15:44, Ma, WenwuX wrote: >>> Hi Chengwen, >>> >>>> -----Original Message----- >>>> From: Ma, WenwuX >>>> Sent: Friday, March 15, 2024 2:26 PM >>>> To: fengchengwen ; dev@dpdk.org >>>> Cc: Jiale, SongX ; stable@dpdk.org >>>> Subject: RE: [PATCH v2] dmadev: fix structure alignment >>>> >>>> Hi Chengwen, >>>> >>>>> -----Original Message----- >>>>> From: fengchengwen >>>>> Sent: Friday, March 15, 2024 2:06 PM >>>>> To: Ma, WenwuX ; dev@dpdk.org >>>>> Cc: Jiale, SongX ; stable@dpdk.org >>>>> Subject: Re: [PATCH v2] dmadev: fix structure alignment >>>>> >>>>> Hi Wenwu, >>>>> >>>>> On 2024/3/15 9:43, Wenwu Ma wrote: >>>>>> The structure rte_dma_dev needs only 8 byte alignment. >>>>>> This patch replaces __rte_cache_aligned of rte_dma_dev with >>>>>> __rte_aligned(8). >>>>>> >>>>>> Fixes: b36970f2e13e ("dmadev: introduce DMA device library") >>>>>> Cc: stable@dpdk.org >>>>>> >>>>>> Signed-off-by: Wenwu Ma >>>>>> --- >>>>>> v2: >>>>>> - Because of performance drop, adjust the code to >>>>>> no longer demand cache line alignment >>>>> >>>>> Which two versions observed performance drop? And which benchmark >>>>> observed drop? >>>>> Could you provide more information? >>>>> >>>>>> >>>> V1 patch: >>>> >> https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154- >>>> 1-wenwux.ma@intel.com/ >>>> >>>> To view detailed results, visit: >>>> https://lab.dpdk.org/results/dashboard/patchsets/29472/ >>>> >>>>>> --- >>>>>> lib/dmadev/rte_dmadev_pmd.h | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/lib/dmadev/rte_dmadev_pmd.h >>>>> b/lib/dmadev/rte_dmadev_pmd.h >>>>>> index 58729088ff..b569bb3502 100644 >>>>>> --- a/lib/dmadev/rte_dmadev_pmd.h >>>>>> +++ b/lib/dmadev/rte_dmadev_pmd.h >>>>>> @@ -122,7 +122,7 @@ enum rte_dma_dev_state { >>>>>> * @internal >>>>>> * The generic data structure associated with each DMA device. >>>>>> */ >>>>>> -struct __rte_cache_aligned rte_dma_dev { >>>>>> +struct __rte_aligned(8) rte_dma_dev { >>>>> >>>>> The DMA fast-path was implemented by struct rte_dma_fp_objs, which >>>>> is not rte_dma_dev? So why is it a problem here? >>>>> >>>>> Thanks >>>>> >>>> The DMA device object is expected to align cache line, so clang will >>>> use “vmovaps” assembly instruction, >>>> >>>> And the instruction demands 16 bytes alignment or will cause segment >>>> fault in some environments. >>>> >>> Test case: >>> 1. compile dpdk >>> rm -rf x86_64-native-linuxapp-clang >>> CC=clang meson -Denable_kmods=True -Dlibdir=lib >>> --default-library=static x86_64-native-linuxapp-clang ninja -C >>> x86_64-native-linuxapp-clang -j 72 2. start dpdk-test >>> /root/dpdk/x86_64-native-linuxapp-clang/app/dpdk-test -l 0-39 >>> --vdev=dma_skeleton -a 31:00.0 -a 31:00.1 -a 31:00.2 -a 31:00.3 (Note: >>> If it cannot be reproduced, please try using a different core) >>> 3. exit dpdk-test >>> RTE>>quit >>> Segmentation fault (core dumped) I reproduce it just with --vdev=dma_skeleton. When execute quit command, it will invoke rte_dma_close->dma_release, pls see my annotations (//) below: void dma_release(struct rte_dma_dev *dev) { if (rte_eal_process_type() == RTE_PROC_PRIMARY) { rte_free(dev->data->dev_private); memset(dev->data, 0, sizeof(struct rte_dma_dev_data)); } dma_fp_object_dummy(dev->fp_obj); memset(dev, 0, sizeof(struct rte_dma_dev)); // this memset was compiles using vmovaps, its // 8c24da: c5 f8 57 c0 vxorps %xmm0,%xmm0,%xmm0 // 8c24de: c5 fc 29 43 20 vmovaps %ymm0,0x20(%rbx) // 8c24e3: c5 fc 29 03 vmovaps %ymm0,(%rbx) // but the dev is not align 16B (in my env the rte_dma_devices addr is 0x15d39950) } >> >> I will try to reproduce, but still a question: does above test has already merged >> your patch [1] or the current main branch code has this problem? >> >> [1] >> https://patches.dpdk.org/project/dpdk/patch/20240308053711.1260154- >> 1-wenwux.ma@intel.com/ >> >> Thanks >> > the current main branch code has this problem. > > Both patch v1 and v2 are able to solve this problem, but v1 has a performance issue. The performance issue is ethdev benchmark, it will not invoke any dmadev API, I don't think these two has any relations. So I prefer v1, Plus Pavan also submit a commit [1] to align the struct, but it was not a fix for clang-x86-platform. [1] https://lore.kernel.org/all/20240210062758.1510-1-pbhagavatula@marvell.com/T/ > >>> >>>> >>>>>> /** Device info which supplied during device initialization. */ >>>>>> struct rte_device *device; >>>>>> struct rte_dma_dev_data *data; /**< Pointer to shared device data. >>>>>> */ >>>>>> What more, could you please send v3? I hope it will contain the root cause and optional solutions of the segment fault problem. BTW: dmadev is the first one which dynamic alloc dmadev struct, later maybe more xxxdev will use this type, I think that's typical. Maybe we should add a such mem_align() function in eal library, but this could done later. Thanks