From: Ard Biesheuvel <ardb@kernel.org>
To: Justin He <Justin.He@arm.com>
Cc: Borislav Petkov <bp@alien8.de>, Len Brown <lenb@kernel.org>,
	James Morse <James.Morse@arm.com>,
	Tony Luck <tony.luck@intel.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Robert Richter <rric@kernel.org>,
	Robert Moore <robert.moore@intel.com>,
	Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
	Yazen Ghannam <yazen.ghannam@amd.com>,
	Jan Luebbe <jlu@pengutronix.de>,
	Khuong Dinh <khuong@os.amperecomputing.com>,
	Kani Toshi <toshi.kani@hpe.com>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"devel@acpica.org" <devel@acpica.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Shuai Xue <xueshuai@linux.alibaba.com>,
	Jarkko Sakkinen <jarkko@kernel.org>,
	"linux-efi@vger.kernel.org" <linux-efi@vger.kernel.org>,
	kernel test robot <lkp@intel.com>
Subject: Re: [PATCH v8 6/7] apei/ghes: Use unrcu_pointer for cmpxchg
Date: Mon, 17 Oct 2022 11:27:17 +0200
Message-ID: <CAMj1kXFNEtGYBefwcSLvs8XokyiZXDypC9mCgZhCkbVkS09Jog@mail.gmail.com>
In-Reply-To: <DBBPR08MB4538066E9C43FCF2983C8B60F7299@DBBPR08MB4538.eurprd08.prod.outlook.com>

Hi Justin,

On Mon, 17 Oct 2022 at 10:47, Justin He <Justin.He@arm.com> wrote:
>
> Hi Ard
>
> > -----Original Message-----
> > Subject: Re: [PATCH v8 6/7] apei/ghes: Use unrcu_pointer for cmpxchg
> >
> > On Fri, 14 Oct 2022 at 17:11, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Fri, Oct 14, 2022 at 04:31:37PM +0200, Ard Biesheuvel wrote:
> > > > +       if (slot != -1) {
> > > > +               /*
> > > > +                * Use release semantics to ensure that ghes_estatus_cached()
> > > > +                * running on another CPU will see the updated cache fields if
> > > > +                * it can see the new value of the pointer.
> > > > +                */
> > > > +               victim = xchg_release(ghes_estatus_caches + slot,
> > > > +                                     RCU_INITIALIZER(new_cache));
> > > > +
> > > > +               /*
> > > > +                * At this point, victim may point to a cached item different
> > > > +                * from the one based on which we selected the slot. Instead of
> > > > +                * going to the loop again to pick another slot, let's just
> > > > +                * drop the other item anyway: this may cause a false cache
> > > > +                * miss later on, but that won't cause any problems.
> > > > +                */
> > > > +               if (victim) {
> > > > +                       call_rcu(&rcu_dereference(victim)->rcu,
> > > > +                                ghes_estatus_cache_rcu_free);
> > >                 }
> > >
> > > I think you can use unrcu_pointer() here instead, there should not be
> > > a data dependency since the ->rcu member itself should be otherwise
> > > unused (and if it were, we wouldn't care about its previous content anyway).
> > >
> > > But only Alpha cares about that distinction anyway, so *shrug*.
> > >
> >
> > Ah yeah good point - and we are not actually dereferencing the pointer at all
> > here, just adding an offset to get at the address of the rcu member.
> >
> > So we can take this block out of the rcu_read_lock() section as well.
> >
> >
> > > While I much like the xchg() variant; I still don't really fancy the
> > > verbiage the sparse nonsense makes us do.
> > >
> > >                 victim = xchg_release(&ghes_estatus_caches[slot], new_cache);
> > >                 if (victim)
> > >                         call_rcu(&victim->rcu, ghes_estatus_cache_rcu_free);
> > >
> > > is much nicer code.
> > >
> > > Overall, I'd simply ignore sparse (I often do).
> > >
> >
> > No disagreement there.
>
> What do you think of the updated patch:
>
> apei/ghes: Use xchg() for updating new cache slot instead of
>  cmpxchg()
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> ghes_estatus_cache_add() selects a slot, and either succeeds in
> replacing its contents with a pointer to a new cached item, or it just
> gives up and frees the new item again, without attempting to select
> another slot even if one might be available.
>
> Since only inserting new items is needed, the race can only cause a failure
> if the selected slot was updated with another new item concurrently,
> which means that it is arbitrary which of those two items gets
> dropped. This means the cmpxchg() and the special case are not necessary,
> and hence just drop the existing item unconditionally. Note that this
> does not result in loss of error events, it simply means we might
> cause a false cache miss, and report the same event one additional
> time in quick succession even if the cache should have prevented that.
>

Please add a line here

Co-developed-by: Jia He <justin.he@arm.com>

> Signed-off-by: Jia He <justin.he@arm.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> [Justin]: I removed the __rcu annotation of victim, removed the RCU_INITIALIZER
> cast and wrapped the xchg_release() call in unrcu_pointer().
>
> drivers/acpi/apei/ghes.c | 44 ++++++++++++++++++++--------------------
>  1 file changed, 22 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 27c72b175e4b..5fc8a135450b 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -150,7 +150,7 @@ struct ghes_vendor_record_entry {
>  static struct gen_pool *ghes_estatus_pool;
>  static unsigned long ghes_estatus_pool_size_request;
>
> -static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
> +static struct ghes_estatus_cache __rcu *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
>  static atomic_t ghes_estatus_cache_alloced;
>
>  static int ghes_panic_timeout __read_mostly = 30;
> @@ -785,31 +785,26 @@ static struct ghes_estatus_cache *ghes_estatus_cache_alloc(
>         return cache;
>  }
>
> -static void ghes_estatus_cache_free(struct ghes_estatus_cache *cache)
> +static void ghes_estatus_cache_rcu_free(struct rcu_head *head)
>  {
> +       struct ghes_estatus_cache *cache;
>         u32 len;
>
> +       cache = container_of(head, struct ghes_estatus_cache, rcu);
>         len = cper_estatus_len(GHES_ESTATUS_FROM_CACHE(cache));
>         len = GHES_ESTATUS_CACHE_LEN(len);
>         gen_pool_free(ghes_estatus_pool, (unsigned long)cache, len);
>         atomic_dec(&ghes_estatus_cache_alloced);
>  }
>
> -static void ghes_estatus_cache_rcu_free(struct rcu_head *head)
> -{
> -       struct ghes_estatus_cache *cache;
> -
> -       cache = container_of(head, struct ghes_estatus_cache, rcu);
> -       ghes_estatus_cache_free(cache);
> -}
> -
>  static void ghes_estatus_cache_add(
>         struct acpi_hest_generic *generic,
>         struct acpi_hest_generic_status *estatus)
>  {
>         int i, slot = -1, count;
>         unsigned long long now, duration, period, max_period = 0;
> -       struct ghes_estatus_cache *cache, *slot_cache = NULL, *new_cache;
> +       struct ghes_estatus_cache *cache, *new_cache;
> +       struct ghes_estatus_cache *victim;
>
>         new_cache = ghes_estatus_cache_alloc(generic, estatus);
>         if (new_cache == NULL)
> @@ -820,13 +815,11 @@ static void ghes_estatus_cache_add(
>                 cache = rcu_dereference(ghes_estatus_caches[i]);
>                 if (cache == NULL) {
>                         slot = i;
> -                       slot_cache = NULL;
>                         break;
>                 }
>                 duration = now - cache->time_in;
>                 if (duration >= GHES_ESTATUS_IN_CACHE_MAX_NSEC) {
>                         slot = i;
> -                       slot_cache = cache;
>                         break;
>                 }
>                 count = atomic_read(&cache->count);
> @@ -835,17 +828,24 @@ static void ghes_estatus_cache_add(
>                 if (period > max_period) {
>                         max_period = period;
>                         slot = i;
> -                       slot_cache = cache;
>                 }
>         }
> -       /* new_cache must be put into array after its contents are written */
> -       smp_wmb();
> -       if (slot != -1 && cmpxchg(ghes_estatus_caches + slot,
> -                                 slot_cache, new_cache) == slot_cache) {
> -               if (slot_cache)
> -                       call_rcu(&slot_cache->rcu, ghes_estatus_cache_rcu_free);
> -       } else
> -               ghes_estatus_cache_free(new_cache);
> +       if (slot != -1) {
> +               /*
> +                * Use release semantics to ensure that ghes_estatus_cached()
> +                * running on another CPU will see the updated cache fields if
> +                * it can see the new value of the pointer.

Please move the comment back where it was. 'At this point' is now
ambiguous because victim has not been assigned yet.

> +                * At this point, victim may point to a cached item different
> +                * from the one based on which we selected the slot. Instead of
> +                * going to the loop again to pick another slot, let's just
> +                * drop the other item anyway: this may cause a false cache
> +                * miss later on, but that won't cause any problems.
> +                */
> +               victim = unrcu_pointer(xchg_release(&ghes_estatus_caches[slot],
> +                                       new_cache));

Doesn't this still trigger the sparse warning on x86?

> +               if (victim)
> +                       call_rcu(&victim->rcu, ghes_estatus_cache_rcu_free);

I think it is better to add back the __rcu annotation to 'victim', and
change this line to

call_rcu(&unrcu_pointer(victim)->rcu, ghes_estatus_cache_rcu_free);

> +       }
>         rcu_read_unlock();

This can now be moved before the if()

>  }
>
> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Please get rid of this footer.
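
For readers following the review, the unconditional-exchange pattern discussed above can be sketched in userspace C11. This is only an illustration under stated assumptions, not the kernel code: atomic_exchange_explicit() with memory_order_release stands in for xchg_release(), an acquire load stands in (conservatively) for rcu_dereference(), and retire_item() stands in for call_rcu() without providing any grace period. All names below (cache_slots, cache_item, cache_publish, cache_lookup, retire_item, retired_count) are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 4	/* hypothetical; stands in for GHES_ESTATUS_CACHES_SIZE */

struct cache_item {
	int payload;	/* stands in for the cached error status */
};

/* Slots shared between writers and readers; zero-initialized, so all NULL. */
static _Atomic(struct cache_item *) cache_slots[CACHE_SLOTS];

/* Counts retired items so the behaviour can be observed. */
static int retired_count;

/*
 * Stand-in for call_rcu(): frees immediately. Real code must defer the
 * free until no reader can still hold the old pointer.
 */
static void retire_item(struct cache_item *item)
{
	retired_count++;
	free(item);
}

static struct cache_item *cache_item_new(int payload)
{
	struct cache_item *item = malloc(sizeof(*item));

	if (item)
		item->payload = payload;
	return item;
}

/*
 * Publish new_item unconditionally. The release ordering ensures the
 * item's fields are visible to any reader that observes the new pointer.
 * Whatever was in the slot before is the victim, even if it is not the
 * item that led to this slot being chosen.
 */
static void cache_publish(size_t slot, struct cache_item *new_item)
{
	struct cache_item *victim;

	victim = atomic_exchange_explicit(&cache_slots[slot], new_item,
					  memory_order_release);
	if (victim)
		retire_item(victim);
}

/* Reader side: the acquire load pairs with the release exchange above. */
static struct cache_item *cache_lookup(size_t slot)
{
	return atomic_load_explicit(&cache_slots[slot], memory_order_acquire);
}
```

Unlike a compare-and-exchange, the exchange cannot fail, so there is no path on which the new item has to be freed again; the cost is that a concurrently inserted item may be dropped, which in the cache above only leads to a false miss.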
