LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: nd <nd@arm.com>, Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andy Lutomirski <luto@amacapital.net>,
	Dave Watson <davejwatson@fb.com>, Paul Turner <pjt@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <andi@firstfloor.org>, Chris Lameter <cl@linux.com>,
	Ben Maurer <bmaurer@fb.com>, rostedt <rostedt@goodmis.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Joel Fernandes <joelaf@google.com>, shuah <shuah@kernel.org>,
	carlos <carlos@redhat.com>, Florian Weimer <fweimer@redhat.com>,
	Joseph Myers <joseph@codesourcery.com>
Subject: Re: [RFC PATCH for 4.21 01/16] rseq/selftests: Add reference counter to coexist with glibc
Date: Tue, 23 Oct 2018 10:59:05 -0400 (EDT)
Message-ID: <1085544486.4156.1540306745517.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <0dcd1e46-7007-3a99-4ada-f09ab66ab2d5@arm.com>

----- On Oct 12, 2018, at 10:59 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote:

> On 11/10/18 20:42, Mathieu Desnoyers wrote:
>> ----- On Oct 11, 2018, at 1:04 PM, Szabolcs Nagy Szabolcs.Nagy@arm.com wrote:
>> 
>>> On 11/10/18 17:37, Mathieu Desnoyers wrote:
>>>> ----- On Oct 11, 2018, at 12:20 PM, Szabolcs Nagy Szabolcs.Nagy@arm.com wrote:
>>>>> On 11/10/18 16:13, Mathieu Desnoyers wrote:
>>>>>> ----- On Oct 11, 2018, at 6:37 AM, Szabolcs Nagy Szabolcs.Nagy@arm.com wrote:
>>>>>>> On 10/10/18 20:19, Mathieu Desnoyers wrote:
>>>>>>>> +__attribute__((visibility("hidden"))) __thread
>>>>>>>> +volatile struct libc_rseq __lib_rseq_abi = {
>>>>>>> ...
>>>>> but it's in a magic struct that's called "abi" which is confusing,
>>>>> the counter is not abi, it's in a hidden object.
>>>>
>>>> No, it is really an ABI between user-space apps/libs. It's not meant to be
>>>> hidden. glibc implements its own register/unregister functions (it does not
>>>> link against librseq). librseq exposes register/unregister functions as public
>>>> APIs. Those also use the refcount. I also plan to have existing libraries, e.g.
>>>> liblttng-ust and possibly liburcu flavors, implement the
>>>> registration/unregistration and refcount handling on their own, so we don't
>>>> have to add a requirement on additional linking on librseq for pre-existing
>>>> libraries.
>>>>
>>>> So that refcount is not an ABI between kernel and user-space, but it's a
>>>> user-space ABI nevertheless (between program and shared objects).
>>>>
>>>
>>> if that's what you want, then your declaration is wrong.
>>> the object should not have hidden visibility.
>> 
>> Actually, if we look closer into my patch, it defines two symbols,
>> one of which is an alias:
>> 
>> __attribute__((visibility("hidden"))) __thread
>> volatile struct libc_rseq __lib_rseq_abi = {
>>         .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
>> };
>> 
>> extern __attribute__((weak, alias("__lib_rseq_abi"))) __thread
>> volatile struct rseq __rseq_abi;
>> 
>> Note that the public __rseq_abi symbol is weak but does not have
>> hidden visibility. I do this to ensure I don't get prototype
>> mismatch for __rseq_abi between rseq.c and rseq.h (it is required
>> to be a struct rseq by rseq.h), but I want the space to hold the
>> extra refcount field present in struct libc_rseq.
>> 
>

I notice this email has been sitting in my inbox for a while, sorry
for the delayed reply.
 
> but that's wrong: the weak symbol might get resolved to
> a different object in another module, while you increment
> a local refcounter, so there is no coordination between
> userspace components.

Hrm, good point. I should not use the __lib_rseq_abi symbol at all
here.

> 
> this was the reason for my first question in my original mail,
> as soon as i saw the local counter i suspected this is broken.

Good catch, yes. I think I should not use the alias approach then.

> 
> and "assume there is an extra counter field" is not
> acceptable as user space abi, if the counter is relevant
> across modules then expose the entire struct.

The question that arises here is whether I should update
uapi/linux/rseq.h and add the refcount field directly in
there, even though the kernel does not care about it per se ?

> 
>>> either the struct should be public abi (extern tls
>>> symbol) or the register/unregister functions should
>>> be public abi (so when multiple implementations are
>>> present in the same process only one of them will
>>> provide definition for the public abi symbol and
>>> thus there will be one refcounter).
>> 
>> Those are two possible solutions, indeed. Considering that
>> we already need to expose the __rseq_abi symbol as a public
>> ABI in a way that ensures that multiple implementations
>> in a same process end up only using one of them, it seems
>> straightforward to simply extend that structure and hold the
>> refcount there, rather than having two extra ABI symbols
>> (register/unregister functions).
>> 
>> One very appropriate question here is whether we want to
>> expose the layout of struct libc_rseq (which includes the
>> refcount) in a public header file, and if so, which project
>> should hold it ? Or do we just want to document the layout
>> of this ABI so projects can define the structure layout
>> internally ? As my implementation currently stands, I have
>> the following structure duplicated into rseq selftests,
>> librseq, and glibc:
>> 
> 
> "not exposed" and "the counter is abi" together is not
> useful, either you want coordination in user-space or
> not, that decision should imply the userspace abi/api
> (e.g. adding a counter to the user-space struct).

I'm inclined to add the refcount to struct rseq directly,
unless anyone objects. It seems much simpler.

> 
> it is true that only modules that implement registration
> need to know about the counter and normal users don't,
> but if you want any coordination then the layout must
> be fixed and that should be exposed somewhere to avoid
> breakage.

Yep. Exposing this in uapi/linux/rseq.h is the main
location that seems to make sense to me.

> 
> (i think ideally the api would be controlled by functions
> and not object symbols with magic layout, but the rseq
> design is already full of such magic. and i think it's
> better to do the registration in libc only without
> coordination but that might not be practical if users
> want it now)

Yes, early adopters is my concern here.

> 
>> /*
>>  * linux/rseq.h defines struct rseq as aligned on 32 bytes. The kernel ABI
>>  * size is 20 bytes. For support of multiple rseq users within a process,
>>  * user-space defines an extra 4 bytes field as a reference count, for a
>>  * total of 24 bytes.
>>  */
>> struct libc_rseq {
>>         /* kernel-userspace ABI. */
>>         __u32 cpu_id_start;
>>         __u32 cpu_id;
>>         __u64 rseq_cs;
>>         __u32 flags;
>>         /* user-space ABI. */
>>         __u32 refcount;
>> } __attribute__((aligned(4 * sizeof(__u64))));
>> 
>> That duplicated structure only needs to be present in early-adopter
>> applications/libraries. Those linking on librseq or relying on newer
>> glibc to register rseq don't need to know about this extended layout:
>> all they need to care about is the layout of struct rseq (without the
>> added refcount).
> 
> please decide if you want multiple libraries to
> be able to register rseq and coordinate or not
> and document that decision in the public api.

Yes, I'll try this out and see how this goes.

Thanks for the feedback!

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply index

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-10 19:19 [RFC PATCH for 4.21 00/16] rseq updates, new cpu_opv system call Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 01/16] rseq/selftests: Add reference counter to coexist with glibc Mathieu Desnoyers
2018-10-11 10:37   ` Szabolcs Nagy
2018-10-11 15:13     ` Mathieu Desnoyers
2018-10-11 16:20       ` Szabolcs Nagy
2018-10-11 16:37         ` Mathieu Desnoyers
2018-10-11 17:04           ` Szabolcs Nagy
2018-10-11 19:42             ` Mathieu Desnoyers
2018-10-12  9:59               ` Szabolcs Nagy
2018-10-23 14:59                 ` Mathieu Desnoyers [this message]
2018-10-10 19:19 ` [RFC PATCH for 4.21 02/16] rseq/selftests: Adapt number of threads to the number of detected cpus Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 03/16] sched: Implement push_task_to_cpu (v2) Mathieu Desnoyers
2018-10-17  6:51   ` Srikar Dronamraju
2018-10-17 15:09     ` Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 04/16] mm: Introduce vm_map_user_ram, vm_unmap_user_ram Mathieu Desnoyers
2018-10-16 18:30   ` Steven Rostedt
2018-10-16 19:21     ` Mathieu Desnoyers
2018-10-16 19:40       ` Steven Rostedt
2018-10-17  0:27     ` Sergey Senozhatsky
2018-10-17 15:00       ` Mathieu Desnoyers
2018-10-17 15:04         ` Mathieu Desnoyers
2018-10-17 15:34           ` Sergey Senozhatsky
2018-10-10 19:19 ` [RFC PATCH for 4.21 05/16] mm: Provide is_vma_noncached Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 06/16] cpu_opv: Provide cpu_opv system call (v8) Mathieu Desnoyers
2018-10-16  8:10   ` Sergey Senozhatsky
2018-10-16 19:17     ` Mathieu Desnoyers
2018-10-17  1:46       ` Sergey Senozhatsky
2018-10-17  7:19   ` Srikar Dronamraju
2018-10-17 15:11     ` Mathieu Desnoyers
2018-10-17 16:09       ` Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 07/16] cpu_opv: limit amount of virtual address space used by cpu_opv Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 08/16] x86: Wire up cpu_opv system call Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 09/16] powerpc: " Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 10/16] arm: " Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 11/16] cpu-opv/selftests: Provide cpu-op library Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 12/16] cpu-opv/selftests: Provide basic test Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 13/16] cpu-opv/selftests: Provide percpu_op API Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 14/16] cpu-opv/selftests: Provide basic percpu ops test Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 15/16] cpu-opv/selftests: Provide parametrized tests Mathieu Desnoyers
2018-10-10 19:19 ` [RFC PATCH for 4.21 16/16] cpu-opv/selftests: Provide Makefile, scripts, gitignore Mathieu Desnoyers

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1085544486.4156.1540306745517.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=Catalin.Marinas@arm.com \
    --cc=Will.Deacon@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=bmaurer@fb.com \
    --cc=boqun.feng@gmail.com \
    --cc=carlos@redhat.com \
    --cc=cl@linux.com \
    --cc=davejwatson@fb.com \
    --cc=fweimer@redhat.com \
    --cc=hpa@zytor.com \
    --cc=joelaf@google.com \
    --cc=joseph@codesourcery.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=luto@amacapital.net \
    --cc=mingo@redhat.com \
    --cc=mtk.manpages@gmail.com \
    --cc=nd@arm.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=szabolcs.nagy@arm.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git