All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vineet Gupta <vineet.gupta1@synopsys.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Jani Nikula <jani.nikula@intel.com>,
	Will Deacon <will.deacon@arm.com>, <linux-kernel@vger.kernel.org>,
	Chris Wilson <chris@chris-wilson.co.uk>, <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	<linux-snps-arc@lists.infradead.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH 3/3] bitops.h: set_mask_bits() to return old value
Date: Fri, 11 Jan 2019 12:58:22 -0800	[thread overview]
Message-ID: <d36b8582-184a-37d2-699f-04837745b70a@synopsys.com> (raw)
In-Reply-To: <20190111092408.GM30894@hirez.programming.kicks-ass.net>

On 1/11/19 1:24 AM, Peter Zijlstra wrote:
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 705f7c442691..2060d26a35f5 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -241,10 +241,10 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
>  	const typeof(*(ptr)) mask__ = (mask), bits__ = (bits);	\
>  	typeof(*(ptr)) old__, new__;				\
>  								\
> +	old__ = READ_ONCE(*(ptr));				\
>  	do {							\
> -		old__ = READ_ONCE(*(ptr));			\
>  		new__ = (old__ & ~mask__) | bits__;		\
> -	} while (cmpxchg(ptr, old__, new__) != old__);		\
> +	} while (!try_cmpxchg(ptr, &old__, new__));		\
>  								\
>  	new__;							\
>  })
> 
> 
> While there you probably want something like the above... 

As a separate change perhaps so that a revert (unlikely as it might be) could be
done with less pain.

> although,
> looking at it now, we seem to have 'forgotten' to add try_cmpxchg to the
> generic code :/

So it _has_ to be a separate change ;-)

But can we even provide a sane generic try_cmpxchg. The asm-generic cmpxchg relies
on local irq save etc so it is clearly only to prevent a new arch from failing to
compile. atomic*_cmpxchg() is different story since atomics have to be provided by
arch.

Anyhow what is more interesting is the try_cmpxchg API itself. So commit
a9ebf306f52c756 introduced/use of try_cmpxchg(), which indeed makes the looping
"nicer" to read and obvious code gen improvements.

So,
        for (;;) {
                new = val $op $imm;
                old = cmpxchg(ptr, val, new);
                if (old == val)
                        break;
                val = old;
        }

becomes

        do {
        } while (!try_cmpxchg(ptr, &val, val $op $imm));


But on pure LL/SC retry based arches, we still end up with generated code having 2
loops. We discussed something similar a while back: see [1]

First loop is inside inline asm to retry LL/SC and the outer one due to code
above. Explicit return of try_cmpxchg() means setting up a register with a boolean
status of cmpxchg (AFAIKR ARMv7 already does that but ARC e.g. uses a CPU flag
thus requires an additional insn or two). We could arguably remove the inline asm
loop and retry LL/SC from the outer loop, but it seems cleaner to keep the retry
where it belongs.

Also under the hood, try_cmpxchg() would end up re-reading it for the issue fixed
by commit 44fe84459faf1a.

Heck, it would all be simpler if we could express this w/o use of cmpxchg.

	try_some_op(ptr, &val, val $op $imm);

P.S. the horrible API name is for indicative purposes only

This would remove the outer loop completely, also avoid any re-reads due to the
semantics of cmpxchg etc.

[1] https://www.spinics.net/lists/kernel/msg2029217.html

WARNING: multiple messages have this Message-ID (diff)
From: Vineet Gupta <vineet.gupta1@synopsys.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Jani Nikula <jani.nikula@intel.com>,
	Will Deacon <will.deacon@arm.com>,
	linux-kernel@vger.kernel.org,
	Chris Wilson <chris@chris-wilson.co.uk>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-snps-arc@lists.infradead.org,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH 3/3] bitops.h: set_mask_bits() to return old value
Date: Fri, 11 Jan 2019 12:58:22 -0800	[thread overview]
Message-ID: <d36b8582-184a-37d2-699f-04837745b70a@synopsys.com> (raw)
In-Reply-To: <20190111092408.GM30894@hirez.programming.kicks-ass.net>

On 1/11/19 1:24 AM, Peter Zijlstra wrote:
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 705f7c442691..2060d26a35f5 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -241,10 +241,10 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
>  	const typeof(*(ptr)) mask__ = (mask), bits__ = (bits);	\
>  	typeof(*(ptr)) old__, new__;				\
>  								\
> +	old__ = READ_ONCE(*(ptr));				\
>  	do {							\
> -		old__ = READ_ONCE(*(ptr));			\
>  		new__ = (old__ & ~mask__) | bits__;		\
> -	} while (cmpxchg(ptr, old__, new__) != old__);		\
> +	} while (!try_cmpxchg(ptr, &old__, new__));		\
>  								\
>  	new__;							\
>  })
> 
> 
> While there you probably want something like the above... 

As a separate change perhaps so that a revert (unlikely as it might be) could be
done with less pain.

> although,
> looking at it now, we seem to have 'forgotten' to add try_cmpxchg to the
> generic code :/

So it _has_ to be a separate change ;-)

But can we even provide a sane generic try_cmpxchg. The asm-generic cmpxchg relies
on local irq save etc so it is clearly only to prevent a new arch from failing to
compile. atomic*_cmpxchg() is different story since atomics have to be provided by
arch.

Anyhow what is more interesting is the try_cmpxchg API itself. So commit
a9ebf306f52c756 introduced/use of try_cmpxchg(), which indeed makes the looping
"nicer" to read and obvious code gen improvements.

So,
        for (;;) {
                new = val $op $imm;
                old = cmpxchg(ptr, val, new);
                if (old == val)
                        break;
                val = old;
        }

becomes

        do {
        } while (!try_cmpxchg(ptr, &val, val $op $imm));


But on pure LL/SC retry based arches, we still end up with generated code having 2
loops. We discussed something similar a while back: see [1]

First loop is inside inline asm to retry LL/SC and the outer one due to code
above. Explicit return of try_cmpxchg() means setting up a register with a boolean
status of cmpxchg (AFAIKR ARMv7 already does that but ARC e.g. uses a CPU flag
thus requires an additional insn or two). We could arguably remove the inline asm
loop and retry LL/SC from the outer loop, but it seems cleaner to keep the retry
where it belongs.

Also under the hood, try_cmpxchg() would end up re-reading it for the issue fixed
by commit 44fe84459faf1a.

Heck, it would all be simpler if we could express this w/o use of cmpxchg.

	try_some_op(ptr, &val, val $op $imm);

P.S. the horrible API name is for indicative purposes only

This would remove the outer loop completely, also avoid any re-reads due to the
semantics of cmpxchg etc.

[1] https://www.spinics.net/lists/kernel/msg2029217.html

WARNING: multiple messages have this Message-ID (diff)
From: vineet.gupta1@synopsys.com (Vineet Gupta)
To: linux-snps-arc@lists.infradead.org
Subject: [PATCH 3/3] bitops.h: set_mask_bits() to return old value
Date: Fri, 11 Jan 2019 12:58:22 -0800	[thread overview]
Message-ID: <d36b8582-184a-37d2-699f-04837745b70a@synopsys.com> (raw)
In-Reply-To: <20190111092408.GM30894@hirez.programming.kicks-ass.net>

On 1/11/19 1:24 AM, Peter Zijlstra wrote:
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 705f7c442691..2060d26a35f5 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -241,10 +241,10 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
>  	const typeof(*(ptr)) mask__ = (mask), bits__ = (bits);	\
>  	typeof(*(ptr)) old__, new__;				\
>  								\
> +	old__ = READ_ONCE(*(ptr));				\
>  	do {							\
> -		old__ = READ_ONCE(*(ptr));			\
>  		new__ = (old__ & ~mask__) | bits__;		\
> -	} while (cmpxchg(ptr, old__, new__) != old__);		\
> +	} while (!try_cmpxchg(ptr, &old__, new__));		\
>  								\
>  	new__;							\
>  })
> 
> 
> While there you probably want something like the above... 

As a separate change perhaps so that a revert (unlikely as it might be) could be
done with less pain.

> although,
> looking at it now, we seem to have 'forgotten' to add try_cmpxchg to the
> generic code :/

So it _has_ to be a separate change ;-)

But can we even provide a sane generic try_cmpxchg. The asm-generic cmpxchg relies
on local irq save etc so it is clearly only to prevent a new arch from failing to
compile. atomic*_cmpxchg() is different story since atomics have to be provided by
arch.

Anyhow what is more interesting is the try_cmpxchg API itself. So commit
a9ebf306f52c756 introduced/use of try_cmpxchg(), which indeed makes the looping
"nicer" to read and obvious code gen improvements.

So,
        for (;;) {
                new = val $op $imm;
                old = cmpxchg(ptr, val, new);
                if (old == val)
                        break;
                val = old;
        }

becomes

        do {
        } while (!try_cmpxchg(ptr, &val, val $op $imm));


But on pure LL/SC retry based arches, we still end up with generated code having 2
loops. We discussed something similar a while back: see [1]

First loop is inside inline asm to retry LL/SC and the outer one due to code
above. Explicit return of try_cmpxchg() means setting up a register with a boolean
status of cmpxchg (AFAIKR ARMv7 already does that but ARC e.g. uses a CPU flag
thus requires an additional insn or two). We could arguably remove the inline asm
loop and retry LL/SC from the outer loop, but it seems cleaner to keep the retry
where it belongs.

Also under the hood, try_cmpxchg() would end up re-reading it for the issue fixed
by commit 44fe84459faf1a.

Heck, it would all be simpler if we could express this w/o use of cmpxchg.

	try_some_op(ptr, &val, val $op $imm);

P.S. the horrible API name is for indicative purposes only

This would remove the outer loop completely, also avoid any re-reads due to the
semantics of cmpxchg etc.

[1] https://www.spinics.net/lists/kernel/msg2029217.html

  reply	other threads:[~2019-01-11 20:58 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11  0:26 [PATCH 0/3] Replace opencoded set_mask_bits Vineet Gupta
2019-01-11  0:26 ` Vineet Gupta
2019-01-11  0:26 ` Vineet Gupta
2019-01-11  0:26 ` [PATCH 1/3] coredump: Replace opencoded set_mask_bits() Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  4:24   ` Anthony Yznaga
2019-01-11  4:24     ` Anthony Yznaga
2019-01-11  4:24     ` Anthony Yznaga
2019-01-11  0:26 ` [PATCH 2/3] fs: inode_set_flags() replace " Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  4:24   ` Anthony Yznaga
2019-01-11  4:24     ` Anthony Yznaga
2019-01-11  4:24     ` Anthony Yznaga
2019-01-11  0:26 ` [PATCH 3/3] bitops.h: set_mask_bits() to return old value Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  0:26   ` Vineet Gupta
2019-01-11  4:25   ` Anthony Yznaga
2019-01-11  4:25     ` Anthony Yznaga
2019-01-11  9:24   ` Peter Zijlstra
2019-01-11  9:24     ` Peter Zijlstra
2019-01-11 20:58     ` Vineet Gupta [this message]
2019-01-11 20:58       ` Vineet Gupta
2019-01-11 20:58       ` Vineet Gupta
2019-01-14 19:00   ` Will Deacon
2019-01-14 19:00     ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d36b8582-184a-37d2-699f-04837745b70a@synopsys.com \
    --to=vineet.gupta1@synopsys.com \
    --cc=akpm@linux-foundation.org \
    --cc=chris@chris-wilson.co.uk \
    --cc=jani.nikula@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=peterz@infradead.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.