From: Pkshih <pkshih@realtek.com>
To: Arnd Bergmann <arnd@arndb.de>, Kalle Valo <kvalo@codeaurora.org>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: RE: [PATCH v6 03/24] rtw89: add core and trx files
Date: Tue, 5 Oct 2021 09:32:44 +0000
Message-ID: <a8dd5e59fb8f491fb34e52d495cf4c85@realtek.com>
In-Reply-To: <CAK8P3a0T4iqtF0wj5+VUT6z3S2yGC4uaOr806NCiQTpYoPawUg@mail.gmail.com>


> -----Original Message-----
> From: Arnd Bergmann <arnd@arndb.de>
> Sent: Tuesday, October 5, 2021 4:42 PM
> To: Kalle Valo <kvalo@codeaurora.org>
> Cc: Pkshih <pkshih@realtek.com>; Arnd Bergmann <arnd@arndb.de>; linux-wireless@vger.kernel.org
> Subject: Re: [PATCH v6 03/24] rtw89: add core and trx files
> 
> On Tue, Oct 5, 2021 at 9:46 AM Kalle Valo <kvalo@codeaurora.org> wrote:
> > Pkshih <pkshih@realtek.com> writes:
> > >> From: kvalo=codeaurora.org@mg.codeaurora.org
> > >>
> > >> > +static __always_inline void RTW89_SET_TXWD(u8 *txdesc, u32 val,
> > >> > u8 offset, u32 mask)
> > >> > +{
> > >> > +  u32 *txd32 = (u32 *)txdesc;
> > >> > +
> > >> > +  le32p_replace_bits((__le32 *)(txd32 + offset), val, mask);
> > >> > +}
> > >>
> > >> I'm not convinced about this either, please just use inline.
> > >
> > > This is because 'mask' argument of le32p_replace_bits() must be constant
> > > only. If I use inline and build this driver with ccflags-y += -Os,
> > > compiler reports errors:
> > >
> > > In function 'field_multiplier',
> > >     inlined from 'le32_encode_bits' at ./include/linux/bitfield.h:154:1,
> > >     inlined from 'le32p_replace_bits' at ./include/linux/bitfield.h:154:1,
> > > >     inlined from 'RTW89_SET_FWCMD_UA32.constprop' at /work/git-root/rtwlan/rtw89/fw.h:1397:2:
> > > > ./include/linux/bitfield.h:119:3: error: call to '__bad_mask' declared with attribute error: bad bitfield mask
> > >   119 |   __bad_mask();
> > >       |   ^~~~~~~~~~~~
> > >
> > > I check the implement of le32p_replace_bits(), it looks like
> > >
> > > static __always_inline void type##p_replace_bits(__##type *p,           \
> > >                                         base val, base field)           \
> > > {                                                                       \
> > >         *p = (*p & ~to(field)) | type##_encode_bits(val, field);        \
> > > }
> > >
> > > So, I imitate the function to use __always_inline, and then it works.
> > >
> > > Do you think I don't need to consider the case of Os?
> > > But, -Os seems a standard option of Linux kernel.
> > >
> > > ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE
> > > KBUILD_CFLAGS += -O2
> > > else ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3
> > > KBUILD_CFLAGS += -O3
> > > else ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
> > > KBUILD_CFLAGS += -Os
> > > endif
> >
> > Yeah, we need to support -Os.
> >
> > Arnd, what do you suggest? Is __always_inline good solution for this? I
> > think we should at least add a comment explaining why it's needed.
> 
> __always_inline can make sense to force the compiler to behave
> sanely if it doesn't work it out by itself, and I think that is how this
> function was meant to be used: the __compiletime_error in bitfield.h
> is intended to find any callers that have a non-constant argument,
> because that would result in horrible code.
> 
> I would suggest looking at the object code that you get with -Os after
> the added __always_inline, just to make sure that this isn't also
> horrible.

I checked the function rtw89_core_fill_txdesc(), which uses these macros.
With inline, the object code size is 0x1AF. With __always_inline and
-Os, the size is 0x1A4. (x86-64 platform)

Comparing the object code side by side, the two builds are almost the
same except for a few instructions. I think this is because the inline
function that I marked __always_inline contains only a single simple
statement.
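
For readers following the mask discussion, here is a minimal userspace
sketch of the arithmetic behind the replace-bits helpers. It is not the
kernel implementation: the demo_* names are hypothetical, the words are
treated as host-endian rather than __le32, and the compile-time
__bad_mask() check is left out. That check can only be resolved when the
mask is a compile-time constant, which is why the wrapper has to be
inlined into its caller.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins; the real helpers live in
 * include/linux/bitfield.h and operate on __le32 values.
 */

/* Lowest set bit of the mask; multiplying by it shifts a value into place. */
static inline uint32_t demo_field_multiplier(uint32_t mask)
{
	return mask & (~mask + 1);
}

/* Move val into the field described by mask, dropping bits that overflow. */
static inline uint32_t demo_encode_bits(uint32_t val, uint32_t mask)
{
	return (val * demo_field_multiplier(mask)) & mask;
}

/* Clear the field in *p, then OR in the newly encoded value. */
static inline void demo_replace_bits(uint32_t *p, uint32_t val, uint32_t mask)
{
	*p = (*p & ~mask) | demo_encode_bits(val, mask);
}
```

With a constant mask such as 0xFF000000 the multiplier folds to a
constant and the whole helper reduces to a few AND/OR instructions. With
a mask only known at run time, the multiplier must be computed on every
call.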

> 
> Looking at the driver code, as in
> 
> +#define RTW89_SET_TXWD_BODY_WP_OFFSET(txdesc, val) \
> + RTW89_SET_TXWD(txdesc, val, 0x00, GENMASK(31, 24))
> +#define RTW89_SET_TXWD_BODY_MORE_DATA(txdesc, val) \
> + RTW89_SET_TXWD(txdesc, val, 0x00, BIT(23))
> +#define RTW89_SET_TXWD_BODY_WD_INFO_EN(txdesc, val) \
> + RTW89_SET_TXWD(txdesc, val, 0x00, BIT(22))
> +#define RTW89_SET_TXWD_BODY_FW_DL(txdesc, val) \
> + RTW89_SET_TXWD(txdesc, val, 0x00, BIT(20))
> 
> I would personally write this without the wrappers, instead defining the
> bitmask macros as the masks and then open-coding the
> le32p_replace_bits() calls instead, which I would find more
> intuitive while it avoids the problem with the bitmasks.

Using these macros lets us address the offset and bit fields quickly.
How about I use a macro instead of an inline function? For example:

#define RTW89_SET_TXWD(txdesc, val, offset, mask) \
do { \
	u32 *txd32 = (u32 *)(txdesc); \
	le32p_replace_bits((__le32 *)(txd32 + (offset)), val, mask); \
} while (0)
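
As an illustration of the macro approach, the following is a userspace
sketch with hypothetical DEMO_* names and a host-endian stand-in for
le32p_replace_bits(). Two details matter: there must be no space between
the macro name and its opening parenthesis (otherwise it becomes an
object-like macro), and the arguments should be parenthesized inside the
body. Because the mask is pasted in textually, it stays a constant
expression at every call site.

```c
#include <stdint.h>

/* Hypothetical host-endian stand-in for le32p_replace_bits(). */
static inline void demo_replace_bits(uint32_t *p, uint32_t val, uint32_t mask)
{
	uint32_t lsb = mask & (~mask + 1);	/* lowest set bit of mask */

	*p = (*p & ~mask) | ((val * lsb) & mask);
}

/* Simplified 32-bit versions of the kernel's GENMASK() and BIT(). */
#define DEMO_GENMASK(h, l) \
	((0xFFFFFFFFu >> (31 - (h))) & (0xFFFFFFFFu << (l)))
#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* offset counts 32-bit words from the start of the descriptor. */
#define DEMO_SET_TXWD(txdesc, val, offset, mask) \
do { \
	uint32_t *txd32 = (uint32_t *)(txdesc); \
	demo_replace_bits(txd32 + (offset), (val), (mask)); \
} while (0)

#define DEMO_SET_TXWD_BODY_WP_OFFSET(txdesc, val) \
	DEMO_SET_TXWD(txdesc, val, 0x00, DEMO_GENMASK(31, 24))
#define DEMO_SET_TXWD_BODY_MORE_DATA(txdesc, val) \
	DEMO_SET_TXWD(txdesc, val, 0x00, DEMO_BIT(23))
```

One caveat of the macro form is that the local name txd32 can shadow a
caller variable with the same name, which an inline function avoids.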


> Going back one more step, I see that rtw89_core_fill_txdesc()
> manipulates the descriptor fields in-memory, which also seems
> like a bad idea: The descriptor is mapped as cache-coherent,
> so on machines with no coherent DMA (i.e. most ARM or MIPS
> machines), that is uncached memory, and writing the descriptor
> using a series of read-modify-write cycles on uncached memory
> will be awfully slow. Maybe the answer is to just completely
> replace the descriptor access.

I'll look into whether we can use cached memory with single_map/unmap
for the descriptor. That should improve performance.
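
To make the idea concrete, here is a rough userspace sketch of composing
the descriptor in ordinary cached memory and handing it over with one
copy, instead of a read-modify-write per field on the destination. It is
only an assumption about what such a change could look like, not the
driver's code: demo_fill_txdesc() and DEMO_TXWD_WORDS are hypothetical,
only two fields (bit positions taken from the macros quoted above) are
shown, and a real driver would also need cpu_to_le32() conversions and
the DMA mapping calls.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_TXWD_WORDS 6	/* hypothetical descriptor length in words */

/*
 * Build every field in a local (cached) buffer, then write the result
 * to dst once. dst stands in for the descriptor memory the device reads.
 */
static void demo_fill_txdesc(void *dst, uint8_t wp_offset, int more_data)
{
	uint32_t local[DEMO_TXWD_WORDS] = { 0 };

	local[0] |= (uint32_t)wp_offset << 24;		/* bits 31:24 */
	local[0] |= (more_data ? 1u : 0u) << 23;	/* bit 23 */

	memcpy(dst, local, sizeof(local));
}
```

Since the local buffer starts zeroed, each field can be ORed in directly
and the destination is never read back.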

--
Ping-Ke

