From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
To: Jens Axboe
Cc: io-uring@vger.kernel.org, Bart Van Assche
Subject: [PATCH liburing 2/3] src/include/liburing/barrier.h: Use C11 atomics
Date: Sun, 21 Jun 2020 13:36:45 -0700
Message-Id: <20200621203646.14416-3-bvanassche@acm.org>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20200621203646.14416-1-bvanassche@acm.org>
References: <20200621203646.14416-1-bvanassche@acm.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: io-uring@vger.kernel.org

Instead of using a combination of open-coded atomic primitives and gcc
builtins, use C11 atomics for all CPU architectures. Note: despite their
name, atomic_*() operations do not necessarily translate into an atomic
instruction. This patch changes the order of the instructions in e.g.
io_uring_get_sqe(), but not the number of instructions generated by gcc 10
on x86_64:

Without this patch:

   0x0000000000000360 <+0>:	mov    0x44(%rdi),%eax
   0x0000000000000363 <+3>:	lea    0x1(%rax),%edx
   0x0000000000000366 <+6>:	mov    (%rdi),%rax
   0x0000000000000369 <+9>:	mov    (%rax),%eax
   0x000000000000036b <+11>:	mov    0x18(%rdi),%rcx
   0x000000000000036f <+15>:	mov    %edx,%esi
   0x0000000000000371 <+17>:	sub    %eax,%esi
   0x0000000000000373 <+19>:	xor    %eax,%eax
   0x0000000000000375 <+21>:	cmp    (%rcx),%esi
   0x0000000000000377 <+23>:	ja     0x38d
   0x0000000000000379 <+25>:	mov    0x10(%rdi),%rax
   0x000000000000037d <+29>:	mov    (%rax),%eax
   0x000000000000037f <+31>:	and    0x44(%rdi),%eax
   0x0000000000000382 <+34>:	mov    %edx,0x44(%rdi)
   0x0000000000000385 <+37>:	shl    $0x6,%rax
   0x0000000000000389 <+41>:	add    0x38(%rdi),%rax
   0x000000000000038d <+45>:	retq

With this patch applied:

   0x0000000000000360 <+0>:	mov    0x44(%rdi),%eax
   0x0000000000000363 <+3>:	lea    0x1(%rax),%edx
   0x0000000000000366 <+6>:	mov    (%rdi),%rax
   0x0000000000000369 <+9>:	mov    %edx,%esi
   0x000000000000036b <+11>:	mov    (%rax),%eax
   0x000000000000036d <+13>:	sub    %eax,%esi
   0x000000000000036f <+15>:	xor    %eax,%eax
   0x0000000000000371 <+17>:	mov    0x18(%rdi),%rcx
   0x0000000000000375 <+21>:	cmp    (%rcx),%esi
   0x0000000000000377 <+23>:	ja     0x38d
   0x0000000000000379 <+25>:	mov    0x10(%rdi),%rax
   0x000000000000037d <+29>:	mov    (%rax),%eax
   0x000000000000037f <+31>:	and    0x44(%rdi),%eax
   0x0000000000000382 <+34>:	mov    %edx,0x44(%rdi)
   0x0000000000000385 <+37>:	shl    $0x6,%rax
   0x0000000000000389 <+41>:	add    0x38(%rdi),%rax
   0x000000000000038d <+45>:	retq

Signed-off-by: Bart Van Assche
---
 src/include/liburing/barrier.h | 44 ++++++++++----------------------------------
 1 file changed, 10 insertions(+), 34 deletions(-)

diff --git a/src/include/liburing/barrier.h b/src/include/liburing/barrier.h
index ad69506bb248..c8aa4210371c 100644
--- a/src/include/liburing/barrier.h
+++ b/src/include/liburing/barrier.h
@@ -2,6 +2,8 @@
 #ifndef LIBURING_BARRIER_H
 #define LIBURING_BARRIER_H
+#include <stdatomic.h>
+
 /*
 From the kernel documentation file refcount-vs-atomic.rst:
@@ -21,40 +23,14 @@
 after the acquire operation executes. This is implemented using
 :c:func:`smp_acquire__after_ctrl_dep`.
 */
-/* From tools/include/linux/compiler.h */
-/* Optimization barrier */
-/* The "volatile" is due to gcc bugs */
-#define io_uring_barrier()	__asm__ __volatile__("": : :"memory")
-
-/* From tools/virtio/linux/compiler.h */
-#define IO_URING_WRITE_ONCE(var, val) \
-	(*((volatile __typeof(val) *)(&(var))) = (val))
-#define IO_URING_READ_ONCE(var) (*((volatile __typeof(var) *)(&(var))))
-
+#define IO_URING_WRITE_ONCE(var, val) \
+	atomic_store_explicit(&(var), (val), memory_order_relaxed)
+#define IO_URING_READ_ONCE(var) \
+	atomic_load_explicit(&(var), memory_order_relaxed)
-#if defined(__x86_64__) || defined(__i386__)
-/* Adapted from arch/x86/include/asm/barrier.h */
-#define io_uring_smp_store_release(p, v)	\
-do {						\
-	io_uring_barrier();			\
-	IO_URING_WRITE_ONCE(*(p), (v));		\
-} while (0)
-
-#define io_uring_smp_load_acquire(p)			\
-({							\
-	__typeof(*p) ___p1 = IO_URING_READ_ONCE(*(p));	\
-	io_uring_barrier();				\
-	___p1;						\
-})
-
-#else /* defined(__x86_64__) || defined(__i386__) */
-/*
- * Add arch appropriate definitions. Use built-in atomic operations for
- * archs we don't have support for.
- */
-#define io_uring_smp_store_release(p, v)	\
-	__atomic_store_n(p, v, __ATOMIC_RELEASE)
-#define io_uring_smp_load_acquire(p) __atomic_load_n(p, __ATOMIC_ACQUIRE)
-#endif /* defined(__x86_64__) || defined(__i386__) */
+#define io_uring_smp_store_release(p, v) \
+	atomic_store_explicit((p), (v), memory_order_release)
+#define io_uring_smp_load_acquire(p) \
+	atomic_load_explicit((p), memory_order_acquire)
 #endif /* defined(LIBURING_BARRIER_H) */
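For readers unfamiliar with the release/acquire pairing that
io_uring_smp_store_release() and io_uring_smp_load_acquire() now express
through C11 atomics, the standalone sketch below may help. It is not part
of the patch: it assumes a C11 compiler with <stdatomic.h> plus POSIX
threads, and the names payload and tail are invented for this example.
The producer mimics how a ring tail is published after the associated data
has been written, and the consumer observes that tail with an acquire load:

/*
 * Minimal, illustrative release/acquire example (not part of the patch).
 * Build with: cc -pthread example.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int payload;               /* plain data, written before publication */
static _Atomic unsigned int tail; /* publication index, initially 0 */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;	/* ordinary store, ordered by the release below */
	/* What io_uring_smp_store_release(&tail, 1) expands to after this patch. */
	atomic_store_explicit(&tail, 1, memory_order_release);
	return NULL;
}

static void *consumer(void *arg)
{
	(void)arg;
	/* What io_uring_smp_load_acquire(&tail) expands to after this patch. */
	while (atomic_load_explicit(&tail, memory_order_acquire) == 0)
		;	/* busy-wait until the producer publishes */
	/* The acquire load guarantees the earlier payload store is visible. */
	printf("payload = %d\n", payload);
	return NULL;
}

int main(void)
{
	pthread_t p, c;

	pthread_create(&p, NULL, producer, NULL);
	pthread_create(&c, NULL, consumer, NULL);
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return 0;
}

The consumer always prints payload = 42: once the acquire load observes
tail == 1, the store to payload made before the release store is guaranteed
to be visible, which is the same ordering property the ring bookkeeping
relies on when new entries are published.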