linux-kernel.vger.kernel.org archive mirror
* POSIX message queues
@ 2002-10-02 10:35 Krzysztof Benedyczak
  0 siblings, 0 replies; 11+ messages in thread
From: Krzysztof Benedyczak @ 2002-10-02 10:35 UTC (permalink / raw)
  To: linux-kernel

 Hello

After getting some responses from lkml, we are ready to start
work on a new version of POSIX message queues.

The main difference (as Christoph Hellwig suggested) would be
implementing it as a virtual filesystem (based on tmpfs and
parts of Jakub Jelinek's code).
I think we can agree that the idea of moving the whole thing to
user space isn't good. There is still a problem with the SIGEV_THREAD
flavour of notification, but after a (brief) look into NPTL it
should be possible to implement (in contrast to NGPT).
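
(For reference, a minimal user-space sketch of the notification mode in
question, using only the standard POSIX API - nothing here is specific
to our implementation:)

#include <mqueue.h>
#include <signal.h>
#include <stdio.h>

/* SIGEV_THREAD: the library has to run this function in a new thread
   when a message arrives on an empty queue - this is the part that is
   hard to support without a real 1:1 thread library. */
static void on_message (union sigval sv)
{
  printf ("message arrived on descriptor %d\n", sv.sival_int);
}

static int request_notification (mqd_t mqdes)
{
  struct sigevent sev;

  sev.sigev_notify = SIGEV_THREAD;
  sev.sigev_notify_function = on_message;
  sev.sigev_notify_attributes = NULL;      /* default thread attributes */
  sev.sigev_value.sival_int = (int) mqdes;
  return mq_notify (mqdes, &sev);          /* 0 on success */
}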


So our question is: is the above approach acceptable for the Linux
kernel?

The main advantages of this approach are that it needs no new system
calls and adds no mess to fork/exit.


 Krzysztof Benedyczak



* Re: POSIX message queues
  2003-10-07  7:50 Peter Waechtler
@ 2003-10-07  8:11 ` Jakub Jelinek
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2003-10-07  8:11 UTC (permalink / raw)
  To: Peter Waechtler
  Cc: Jamie Lokier, Ulrich Drepper, Krzysztof Benedyczak, linux-kernel,
	Manfred Spraul, Michal Wronski

On Tue, Oct 07, 2003 at 09:50:16AM +0200, Peter Waechtler wrote:
>  
> On Sunday, October 05, 2003, at 08:32PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> >> Speaking of librt - I should not have to link in pthreads and the
> >> run-time overhead associated with it (locking stdio etc.) just so I
> >> can use shm_open().  Any chance of fixing this?
> >
> >That overhead is mostly gone in current glibcs (when using NPTL):
> >a) e.g. locking is done unconditionally even when libpthread is not present
> >   (it is just lock cmpxchgl, inlined)
> 
> 
> a "lock cmpxchg" is > 100 cycles (according to a recent Linux Journal article
> by Paul McKenney: 107ns on a 700MHz Pentium III)

Here is exactly what it does on IA-32/i686+:

# define lll_lock(futex) \
  (void) ({ int ignore1, ignore2;                                             \
            __asm __volatile ("cmpl $0, %%gs:%P6\n\t"                         \
                              "je,pt 0f\n\t"                                  \
                              "lock\n"                                        \
                              "0:\tcmpxchgl %1, %2\n\t"                       \
                              "jnz _L_mutex_lock_%=\n\t"                      \
                              ".subsection 1\n\t"                             \
                              ".type _L_mutex_lock_%=,@function\n"            \
                              "_L_mutex_lock_%=:\n\t"                         \
                              "leal %2, %%ecx\n\t"                            \
                              "call __lll_mutex_lock_wait\n\t"                \
                              "jmp 1f\n\t"                                    \
                              ".size _L_mutex_lock_%=,.-_L_mutex_lock_%=\n"   \
                              ".previous\n"                                   \
                              "1:"                                            \
                              : "=a" (ignore1), "=c" (ignore2), "=m" (futex)  \
                              : "0" (0), "1" (1), "m" (futex),                \
                                "i" (offsetof (tcbhead_t, multiple_threads))  \
                              : "memory"); })

> You suggested naming the syscall number symbols NR_mq_open instead of
> NR_sys_mq_open. In the stubs I want to overload some syscalls (e.g. mq_open)
> but not others (e.g. mq_timedsend).
> 
> How should I deal with that?

The syscall is still mq_open, isn't it? So it should be __NR_mq_open.
You simply put the ones you want implemented directly by the kernel
into syscalls.list (and add sysdeps/generic/mq_*.c stubs; also, mq_{timed,}{receive,send}
are cancellation points according to POSIX 2003, so they need to be marked as such).
Where you need to do some handling before/after the syscall in userland, you simply write
mq_open.c etc. and use INLINE_SYSCALL (or INTERNAL_SYSCALL) in it.
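
A hypothetical sketch of such a wrapper (the varargs handling is the
"before the syscall" work; <sysdep.h> and INLINE_SYSCALL are glibc
internals, and the exact kernel argument convention is illustrative):

#include <fcntl.h>
#include <mqueue.h>
#include <stdarg.h>
#include <sysdep.h>

mqd_t
mq_open (const char *name, int oflag, ...)
{
  mode_t mode = 0;
  struct mq_attr *attr = NULL;

  /* Userland handling before the syscall: the optional mode and
     attribute arguments are only present when O_CREAT is given.  */
  if (oflag & O_CREAT)
    {
      va_list ap;
      va_start (ap, oflag);
      mode = va_arg (ap, mode_t);
      attr = va_arg (ap, struct mq_attr *);
      va_end (ap);
    }

  return INLINE_SYSCALL (mq_open, 4, name, oflag, mode, attr);
}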

	Jakub


* Re: POSIX message queues
@ 2003-10-07  7:50 Peter Waechtler
  2003-10-07  8:11 ` Jakub Jelinek
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Waechtler @ 2003-10-07  7:50 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jamie Lokier, Ulrich Drepper, Krzysztof Benedyczak, linux-kernel,
	Manfred Spraul, Michal Wronski

 
On Sunday, October 05, 2003, at 08:32PM, Jakub Jelinek <jakub@redhat.com> wrote:

>> Speaking of librt - I should not have to link in pthreads and the
>> run-time overhead associated with it (locking stdio etc.) just so I
>> can use shm_open().  Any chance of fixing this?
>
>That overhead is mostly gone in current glibcs (when using NPTL):
>a) e.g. locking is done unconditionally even when libpthread is not present
>   (it is just lock cmpxchgl, inlined)


a "lock cmpxchg" is > 100 cycles (according to a recent Linux Journal article
by Paul McKenney: 107ns on a 700MHz Pentium III)

But I assume you have benchmarked the alternatives?
BTW, what are they?

You suggested naming the syscall number symbols NR_mq_open instead of
NR_sys_mq_open. In the stubs I want to overload some syscalls (e.g. mq_open)
but not others (e.g. mq_timedsend).

How should I deal with that?



* Re: POSIX message queues
  2003-10-05 10:11 ` Manfred Spraul
@ 2003-10-06 19:04   ` Krzysztof Benedyczak
  0 siblings, 0 replies; 11+ messages in thread
From: Krzysztof Benedyczak @ 2003-10-06 19:04 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Linux Kernel Mailing List, Michal Wronski

On Sun, 5 Oct 2003, Manfred Spraul wrote:

> Krzysiek: What is MQ_IOC_CLOSE? It looks like a stale ioctl. Please
> remove such code from the patch.
It is used. If this ioctl succeeds, we know it was issued on a mqueue
fs file, and thanks to it we eliminate the possibility of mq_close()
closing an ordinary file descriptor.
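
(A hypothetical sketch of that library-side check - MQ_IOC_CLOSE is the
ioctl from our patch, everything else here is illustrative:)

#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>

int mq_close(int mqdes)
{
        /* The ioctl only succeeds on files living on the mqueue
           filesystem, so an ordinary descriptor is rejected here
           instead of being closed by accident. */
        if (ioctl(mqdes, MQ_IOC_CLOSE) < 0) {
                errno = EBADF;
                return -1;
        }
        return close(mqdes);
}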

> The last time I looked at your patch I noticed a race between creation
> and setting queue attributes. Did you fix that?
Yes - as Alan Cox suggested. But see below.


> I personally prefer syscalls, but that's just my personal preference.
In our opinion too - that was the reason we initially did this with
syscalls. But that was criticized, mostly by Christoph Hellwig
AFAIR. So we changed it ;-) ... Anyway:

Removing the ioctls has mostly advantages (except maybe for permission
checking), and it's simple. Reusing the load_msg/store_msg/free_msg code
is also no problem. The third issue is the filesystem. IMHO hiding it
from userspace is unnecessary. It gives a lot of valuable information
(about notifications, which can't be gathered with the POSIX calls). It
is also convenient to be able to rm a queue or to poll it.
The things I think should be changed:
- mqueues should be accessible from module init time
- touch should create a queue with the system default limits (?)
- multiple mounts of the mqueue fs should show the same content.

With this functionality we will have somewhat more convenient queues.
Does that sound sensible?

(Implementing this with syscalls would also remove the /proc/mounts
dependency - which BTW can be turned off in the library's configure.)


Regards
Krzysiek


* Re: POSIX message queues
  2003-10-05 19:18       ` Jamie Lokier
@ 2003-10-05 21:52         ` Ulrich Drepper
  0 siblings, 0 replies; 11+ messages in thread
From: Ulrich Drepper @ 2003-10-05 21:52 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: linux-kernel

Jamie Lokier wrote:

> Why isn't shm_open() simply part of libc?

Because it's defined to be part of librt by POSIX.

-- 
--------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------



* Re: POSIX message queues
  2003-10-05 18:32     ` Jakub Jelinek
@ 2003-10-05 19:18       ` Jamie Lokier
  2003-10-05 21:52         ` Ulrich Drepper
  0 siblings, 1 reply; 11+ messages in thread
From: Jamie Lokier @ 2003-10-05 19:18 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Ulrich Drepper, Krzysztof Benedyczak, linux-kernel,
	Manfred Spraul, pwaechtler, Michal Wronski

Jakub Jelinek wrote:
> > Speaking of librt - I should not have to link in pthreads and the
> > run-time overhead associated with it (locking stdio etc.) just so I
> > can use shm_open().  Any chance of fixing this?
> 
> That overhead is mostly gone in current glibcs (when using NPTL):
> a) e.g. locking is done unconditionally even when libpthread is not present
>    (it is just lock cmpxchgl, inlined)
> b) things like cancellation aware syscall wrappers for cancellable syscalls
>    and various other things are only done after first pthread_create has
>    been called, it doesn't matter whether libpthread is loaded or not

That's good.  I still don't like linking in pthreads when I'm not
using threads or any thread-using services, so I'll continue to use a
non-libc version of shm_open() in my own programs, particularly the
ones which use clone() directly.

Why isn't shm_open() simply part of libc?

-- Jamie


* Re: POSIX message queues
  2003-10-05 18:16   ` Jamie Lokier
@ 2003-10-05 18:32     ` Jakub Jelinek
  2003-10-05 19:18       ` Jamie Lokier
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2003-10-05 18:32 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Ulrich Drepper, Krzysztof Benedyczak, linux-kernel,
	Manfred Spraul, pwaechtler, Michal Wronski

On Sun, Oct 05, 2003 at 07:16:30PM +0100, Jamie Lokier wrote:
> Ulrich Drepper wrote:
> > > In other words: is our implementation in the position
> > > of NGPT, or better? ;-)
> > 
> > I don't understand.  Why NGPT and what about "position"?
> 
> He is asking whether the work will be wasted effort that gets dismissed or
> superseded, like NGPT was.
> 
> > If you mean
> > including a solution in the runtime (librt), sure, this will happen.
> > But not before I see a solution in the official kernel.
> 
> Speaking of librt - I should not have to link in pthreads and the
> run-time overhead associated with it (locking stdio etc.) just so I
> can use shm_open().  Any chance of fixing this?

That overhead is mostly gone in current glibcs (when using NPTL):
a) e.g. locking is done unconditionally even when libpthread is not present
   (it is just lock cmpxchgl, inlined)
b) things like cancellation aware syscall wrappers for cancellable syscalls
   and various other things are only done after first pthread_create has
   been called, it doesn't matter whether libpthread is loaded or not

	Jakub


* Re: POSIX message queues
  2003-10-05 16:35 ` Ulrich Drepper
@ 2003-10-05 18:16   ` Jamie Lokier
  2003-10-05 18:32     ` Jakub Jelinek
  0 siblings, 1 reply; 11+ messages in thread
From: Jamie Lokier @ 2003-10-05 18:16 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Krzysztof Benedyczak, linux-kernel, Manfred Spraul, pwaechtler,
	Michal Wronski

Ulrich Drepper wrote:
> > In other words: is our implementation in the position
> > of NGPT, or better? ;-)
> 
> I don't understand.  Why NGPT and what about "position"?

He is asking whether the work will be wasted effort that gets dismissed or
superseded, like NGPT was.

> If you mean
> including a solution in the runtime (librt), sure, this will happen.
> But not before I see a solution in the official kernel.

Speaking of librt - I should not have to link in pthreads and the
run-time overhead associated with it (locking stdio etc.) just so I
can use shm_open().  Any chance of fixing this?

-- Jamie


* Re: POSIX message queues
  2003-10-05  9:13 Krzysztof Benedyczak
  2003-10-05 10:11 ` Manfred Spraul
@ 2003-10-05 16:35 ` Ulrich Drepper
  2003-10-05 18:16   ` Jamie Lokier
  1 sibling, 1 reply; 11+ messages in thread
From: Ulrich Drepper @ 2003-10-05 16:35 UTC (permalink / raw)
  To: Krzysztof Benedyczak
  Cc: linux-kernel, Manfred Spraul, pwaechtler, Michal Wronski

Krzysztof Benedyczak wrote:

> There are a lot of differences, but if the most important one is the use of
> ioctls vs syscalls, it can be changed (in fact our implementation used
> syscalls a long time ago).

Syscalls are always better, at least from my perspective. Just imagine
how the runtime is supposed to determine that the kernel doesn't support
mqueues. With syscalls I get -ENOSYS back. With ioctls I get EINVAL - but
what does that mean? Functionality not available? Invalid parameters to
an existing implementation?

Also think about strace, which is an important part of many people's
lives. Hiding the functionality in some ioctls doesn't make it easy to
follow a program, even if strace gets more code added to its ioctl
decoder.

Basically, demultiplexers are bad. Syscalls are cheap.
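
A minimal sketch of the detection I mean, assuming a dedicated
__NR_mq_open syscall number exists (the probe arguments are deliberately
invalid, so only the errno value matters):

#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

static int kernel_has_mqueues (void)
{
  /* A NULL name makes a supporting kernel fail with EFAULT;
     only ENOSYS means the syscall itself is missing.  */
  if (syscall (__NR_mq_open, NULL, 0, 0, NULL) < 0 && errno == ENOSYS)
    return 0;
  return 1;
}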


> In other words: is our implementation in the position
> of NGPT, or better? ;-)

I don't understand.  Why NGPT and what about "position"?  If you mean
including a solution in the runtime (librt), sure, this will happen.
But not before I see a solution in the official kernel.

-- 
--------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------



* Re: POSIX message queues
  2003-10-05  9:13 Krzysztof Benedyczak
@ 2003-10-05 10:11 ` Manfred Spraul
  2003-10-06 19:04   ` Krzysztof Benedyczak
  2003-10-05 16:35 ` Ulrich Drepper
  1 sibling, 1 reply; 11+ messages in thread
From: Manfred Spraul @ 2003-10-05 10:11 UTC (permalink / raw)
  To: Krzysztof Benedyczak
  Cc: Linux Kernel Mailing List, pwaechtler, Michal Wronski

Krzysztof Benedyczak wrote:

>Hello
>
>For quite a long time there have been two implementations of POSIX mqueues
>around. I think it is time to decide at least whether both of them have
>a chance of being applied to the official kernel. So I would like to know
>whether Peter Waechtler's implementation is considered superior, or whether
>some discussion is possible and further work on our implementation is
>worthwhile.
>  
>
Could you try to merge your work? Or at least look at each other's work.
For example, Krzysiek/Michal's implementation has wake-one semantics,
which is IMHO a requirement.

Krzysiek: What is MQ_IOC_CLOSE? It looks like a stale ioctl. Please 
remove such code from the patch.

The last time I looked at your patch I noticed a race between creation and setting queue attributes. Did you fix that?


>There are a lot of differences, but if the most important one is the use of
>ioctls vs syscalls, it can be changed (in fact our implementation used
>syscalls a long time ago).
>  
>
I personally prefer syscalls, but that's just my personal preference.
For example, the notification info is a structure, and printing it to a
text stream and then parsing it back again is just odd. And I don't see
how you can fix the O_CREAT + unusual mq_maxmsg races.
Why do you check against MQ_MAXMSG in user space? That's wrong. The
kernel will reject limits that are too large, probably depending on the
/proc/sys/kern/ configuration. Checking in user space doesn't gain
anything, except that you lose the ability to make runtime changes.
Please reuse the load_msg/store_msg functions instead of a
kmalloc(arg.msg_len, GFP_KERNEL) + copy_from_user. kmalloc(16384) is not
reliable - it needs a contiguous block of 16 kB, and after a long
uptime memory becomes so fragmented that such a block may not exist.
This is a known problem for x86_64: they would prefer 16 kB blocks for
the stack, but that results in errors during stress testing.
proc_write_max_queues has an off-by-one error: tmp[16] = '\0' writes
past the end of the buffer and overwrites the stack.
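
A sketch of the fix, assuming the usual 2.6-era /proc write-handler
shape (the handler name comes from the patch; the body is illustrative):

static unsigned long max_queues;        /* illustrative sysctl value */

static int proc_write_max_queues(struct file *file,
                                 const char __user *buffer,
                                 unsigned long count, void *data)
{
        char tmp[16];

        /* Cap the copy at sizeof(tmp) - 1 so the terminator stays in
           bounds: tmp[15] is the last valid slot, tmp[16] is past the
           end of the array. */
        if (count > sizeof(tmp) - 1)
                count = sizeof(tmp) - 1;
        if (copy_from_user(tmp, buffer, count))
                return -EFAULT;
        tmp[count] = '\0';
        max_queues = simple_strtoul(tmp, NULL, 10);
        return count;
}
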
Is it necessary for the filesystem to be visible to user space? What
about chroot environments, or environments with per-user mount points?
I don't like the dependence on /proc/mounts.

>In other words: is our implementation in the position
>of NGPT, or better? ;-)
>  
>
Do you know if Ulrich Drepper has looked at your user space libraries? 
Your code must end up in glibc, and he's the maintainer.

--
    Manfred



* POSIX message queues
@ 2003-10-05  9:13 Krzysztof Benedyczak
  2003-10-05 10:11 ` Manfred Spraul
  2003-10-05 16:35 ` Ulrich Drepper
  0 siblings, 2 replies; 11+ messages in thread
From: Krzysztof Benedyczak @ 2003-10-05  9:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: Manfred Spraul, pwaechtler, Michal Wronski

Hello

For quite a long time there have been two implementations of POSIX mqueues
around. I think it is time to decide at least whether both of them have
a chance of being applied to the official kernel. So I would like to know
whether Peter Waechtler's implementation is considered superior, or whether
some discussion is possible and further work on our implementation is
worthwhile.

There are a lot of differences, but if the most important one is the use of
ioctls vs syscalls, it can be changed (in fact our implementation used
syscalls a long time ago).

In other words: is our implementation in the position
of NGPT, or better? ;-)

Regards
Krzysiek


Thread overview: 11+ messages
2002-10-02 10:35 POSIX message queues Krzysztof Benedyczak
2003-10-05  9:13 Krzysztof Benedyczak
2003-10-05 10:11 ` Manfred Spraul
2003-10-06 19:04   ` Krzysztof Benedyczak
2003-10-05 16:35 ` Ulrich Drepper
2003-10-05 18:16   ` Jamie Lokier
2003-10-05 18:32     ` Jakub Jelinek
2003-10-05 19:18       ` Jamie Lokier
2003-10-05 21:52         ` Ulrich Drepper
2003-10-07  7:50 Peter Waechtler
2003-10-07  8:11 ` Jakub Jelinek
