From: Xiao Guangrong <guangrong.xiao@gmail.com>
To: "Emilio G. Cota" <cota@braap.org>
Cc: kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com,
	Xiao Guangrong <xiaoguangrong@tencent.com>,
	dgilbert@redhat.com, peterx@redhat.com, qemu-devel@nongnu.org,
	quintela@redhat.com, wei.w.wang@intel.com,
	jiang.biao2@zte.com.cn, pbonzini@redhat.com
Subject: Re: [PATCH v3 2/5] util: introduce threaded workqueue
Date: Tue, 27 Nov 2018 16:29:05 +0800
Message-ID: <fb9e053d-629d-d468-613d-2c695bf686ea@gmail.com>
In-Reply-To: <20181126184919.GA6688@flamenco>



On 11/27/18 2:49 AM, Emilio G. Cota wrote:
> On Mon, Nov 26, 2018 at 16:06:37 +0800, Xiao Guangrong wrote:
>>>> +    /* after the user fills the request, the bit is flipped. */
>>>> +    uint64_t request_fill_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
>>>> +    /* after handles the request, the thread flips the bit. */
>>>> +    uint64_t request_done_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
>>>
>>> Use DECLARE_BITMAP, otherwise you'll get type errors as David
>>> pointed out.
>>
>> If we do it, the field becomes a pointer... that complicates the
>> thing.
> 
> Not necessarily, see below.
> 
> On Mon, Nov 26, 2018 at 16:18:24 +0800, Xiao Guangrong wrote:
>> On 11/24/18 8:17 AM, Emilio G. Cota wrote:
>>> On Thu, Nov 22, 2018 at 15:20:25 +0800, guangrong.xiao@gmail.com wrote:
>>>> +static uint64_t get_free_request_bitmap(Threads *threads, ThreadLocal *thread)
>>>> +{
>>>> +    uint64_t request_fill_bitmap, request_done_bitmap, result_bitmap;
>>>> +
>>>> +    request_fill_bitmap = atomic_rcu_read(&thread->request_fill_bitmap);
>>>> +    request_done_bitmap = atomic_rcu_read(&thread->request_done_bitmap);
>>>> +    bitmap_xor(&result_bitmap, &request_fill_bitmap, &request_done_bitmap,
>>>> +               threads->thread_requests_nr);
>>>
>>> This is not wrong, but it's a bit ugly. Instead, I would:
>>>
>>> - Introduce bitmap_xor_atomic in a previous patch
>>> - Use bitmap_xor_atomic here, getting rid of the rcu reads
>>
>> Hmm, however, we do not need atomic xor operation here... that should be slower than
>> just two READ_ONCE calls.
> 
> If you use DECLARE_BITMAP, you get an in-place array. On a 64-bit
> host, that'd be
> 	unsigned long foo[1]; /* [2] on 32-bit */
> 
> Then again on 64-bit hosts, bitmap_xor_atomic would reduce
> to 2 atomic reads:
> 
> static inline void bitmap_xor_atomic(unsigned long *dst,
> const unsigned long *src1, const unsigned long *src2, long nbits)
> {
>      if (small_nbits(nbits)) {
>          *dst = atomic_read(src1) ^ atomic_read(src2);
>      } else {
>          slow_bitmap_xor_atomic(dst, src1, src2, nbits);

We don't need an in-place xor operation, i.e., we just fetch the bitmaps
into local variables and do the xor locally.

So we would need additional complexity to handle the !small_nbits(nbits)
case ... but it is really not a big deal, as you said; it is just a couple
of lines of code.

However, using a u64, given that only 64 indexes are allowed, is more
straightforward and can be naturally understood. :)
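
To make the u64 variant concrete, here is a minimal standalone sketch of
"fetch both bitmaps into locals, xor locally". It is not the patch code:
ThreadLocalSketch and pending_request_bitmap are hypothetical names, C11
atomics stand in for QEMU's atomic helpers, and the acquire ordering is an
assumption chosen for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two per-thread bitmaps in the patch. */
typedef struct {
    _Atomic uint64_t request_fill_bitmap;  /* flipped by the user after filling */
    _Atomic uint64_t request_done_bitmap;  /* flipped by the thread after handling */
} ThreadLocalSketch;

/*
 * Load each bitmap once into a local variable, then xor the locals.
 * No atomic read-modify-write is involved. Bits at or above nr are
 * masked off; nr must be in the range 1..64.
 */
static uint64_t pending_request_bitmap(ThreadLocalSketch *t, unsigned nr)
{
    assert(nr >= 1 && nr <= 64);

    uint64_t fill = atomic_load_explicit(&t->request_fill_bitmap,
                                         memory_order_acquire);
    uint64_t done = atomic_load_explicit(&t->request_done_bitmap,
                                         memory_order_acquire);
    uint64_t mask = (nr == 64) ? UINT64_MAX : ((UINT64_C(1) << nr) - 1);

    /* A set bit means the fill and done bits differ for that index. */
    return (fill ^ done) & mask;
}
```

Under the flip protocol described in the quoted comments, a set bit in the
result would correspond to a request that has been filled but not yet
handled. The whole thing stays a single-word operation with no
!small_nbits(nbits) path, which is the simplicity argument above.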
