linux-kernel.vger.kernel.org archive mirror
From: Ulrich Drepper <drepper@redhat.com>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: David Miller <davem@davemloft.net>, Andrew Morton <akpm@osdl.org>,
	netdev <netdev@vger.kernel.org>,
	Zach Brown <zach.brown@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	Chase Venters <chase.venters@clientec.com>,
	Johann Borck <johann.borck@densedata.com>,
	linux-kernel@vger.kernel.org, Jeff Garzik <jeff@garzik.org>,
	Alexander Viro <aviro@redhat.com>
Subject: Re: [take24 0/6] kevent: Generic event handling mechanism.
Date: Tue, 21 Nov 2006 08:58:49 -0800
Message-ID: <45633049.2000209@redhat.com>
In-Reply-To: <20061121095302.GA15210@2ka.mipt.ru>

Evgeniy Polyakov wrote:
>> You don't want to have a channel like this.  The userlevel code doesn't 
>> know which threads are waiting in the kernel on the event queue.  And it 
>> seems to be much more complicated then simply have an kevent call which 
>> tells the kernel "wake up N or 1 more threads since I cannot handle it". 
>>  Basically a futex_wake()-like call.
> 
> Kernel does not know about any threads which waits for events, it only
> has queue of events, it can only wake those who was parked in
> kevent_get_events() or kevent_wait(), but syscall will return only when
> condition it waits on is true, i.e. when there is new event in the ready
> queue and/or ring buffer has empty slots, but kernel will wake them up
> in any case if those conditions are true.
> 
> How should it know which syscall should be interrupted when special syscall
> is called?

It's not about interrupting any threads.

The issue is that the wakeup of a thread from the kevent_wait call 
constitutes an "event notification".  If, as it should be, only one 
thread is woken, then this information mustn't get lost.  If the woken 
thread cannot work on the events it got notified for, then it must tell 
the kernel about it so that, *if* there are other threads waiting in 
kevent_wait, one of those other threads can be woken.

What is needed is a simple "wake another thread waiting on this event 
queue" syscall.  Yes, in theory we could open an additional pipe with 
each event queue and use it for waking threads, but this is influencing 
the ABI through the use of a file descriptor.  It's much better to have 
an explicit way to do this.


> No AIO, but syscall.
> Only syscall time matters.
> Syscall starts, it sould be sometime stopped. When it should be stopped?
> It should be stopped after some time after it was started!
> 
> I still do not understand how will you use absolute timeout values
> there. Please exaplain.

What is there to explain?  If you are waiting for events which must 
coincide with real-world events you'll naturally want to formulate 
something like "wait for X until 10:15h".  You cannot formulate this 
correctly with relative timeouts since the realtime clock might be adjusted.
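A minimal sketch of the difference, not taken from the kevent patches: 
it uses the existing clock_nanosleep() interface with TIMER_ABSTIME on 
CLOCK_REALTIME.  The helper names deadline_after() and sleep_until() are 
made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Build an absolute CLOCK_REALTIME deadline "secs" seconds from now.
 * A real caller would more likely derive it from a wall-clock time. */
static struct timespec deadline_after(time_t secs)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += secs;
    return ts;
}

/* Sleep until the absolute deadline.  Because the deadline is absolute,
 * restarting after EINTR needs no recomputation, and the wakeup follows
 * the realtime clock if that clock is adjusted in the meantime.
 * Returns 0 on success or an errno value. */
static int sleep_until(const struct timespec *deadline)
{
    int rc;

    do {
        rc = clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, deadline, NULL);
    } while (rc == EINTR);
    return rc;
}
```

With a relative timeout the caller would have to recompute the remaining 
time after every interruption, and a clock adjustment during the wait 
would go unnoticed.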


> futex_wait() uses relative timeouts:
>  static int futex_wait(u32 __user *uaddr, u32 val, unsigned long time)
> 
> Kernel use relative timeouts.

Look again.  This time at the implementation.  For FUTEX_LOCK_PI the 
timeout is an absolute timeout.

> We have not have such symmetry.
> Other event handling interfaces can not work with events, which do not
> have file descriptor behind them. Kevent can and works.
> Signals are just usual events.
> 
> You request to get events - and you get them.
> You request to not get events during syscall - you remove events.

None of this matches what I'm talking about.  If you want to block a 
signal for the duration of the kevent_wait call, this is not something 
you can do by registering an event.

Registering events has nothing to do with signal masks.  They are not 
modified.  It is the program's responsibility to set the mask up 
correctly.  Just like sigwaitinfo() etc expect all signals which are 
waited on to be blocked.

The signal mask handling is orthogonal to all this and must be explicit. 
In some cases this means explicit pthread_sigmask/sigprocmask calls. 
But that is not atomic if a signal must be masked/unmasked for the 
*_wait call. 
This is why we have variants like pselect/ppoll/epoll_pwait which 
explicitly and *atomically* change the signal mask for the duration of 
the call.
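As an illustration of the atomic variant, here is a small sketch using 
the existing pselect() interface.  The helper names block_signal() and 
wait_readable() are invented for this example; a multi-threaded program 
would use pthread_sigmask() instead of sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/select.h>
#include <time.h>

/* Keep signo blocked while the thread is working on a request.
 * Returns 0 on success, -1 on error. */
static int block_signal(int signo)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Wait until fd is readable, with signo unblocked only during the wait.
 * Returns the pselect() result: 1 if readable, 0 on timeout, -1 on error
 * (errno is EINTR if a handler for signo ran and interrupted the wait). */
static int wait_readable(int fd, int signo, const struct timespec *timeout)
{
    sigset_t during_wait;
    fd_set rfds;

    /* Start from the current mask and remove signo from it. */
    sigprocmask(SIG_SETMASK, NULL, &during_wait);
    sigdelset(&during_wait, signo);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* The mask switch, the wait, and the restore of the old mask
     * happen as one step, so there is no window for a lost signal. */
    return pselect(fd + 1, &rfds, NULL, NULL, timeout, &during_wait);
}
```

Doing the same with a separate sigprocmask() call before and after a 
plain select() leaves a gap in which the signal can arrive just before 
the wait starts and fail to interrupt it.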


> Btw, please point me to the discussion about real life usefullness of
> that parameter for epoll. I read thread where sys_pepoll() was
> intruduced, but except some theoretical handwaving about possible
> usefullness there are no real signs of that requirement.

Don't search for epoll_pwait, it's not widely used yet.  Search for 
pselect, which is standardized.  You'll find plenty of uses of that 
interface.  The number is certainly depressed at the moment since until 
recently there was no correct implementation on Linux.  And the 
interface is mostly used in real-time contexts where signals are more 
commonly used.


> What is the ground research or extended explaination about
> blocking/unblocking some signals during syscall execution?

Why is this even a question?  Have you done programming with signals? 
Your hatred of signals makes me think this isn't the case.

You might want to unblock a signal on a *_wait call if it can be used to 
interrupt the wait but you don't want this to happen while the 
thread is working on a request.

You might want to block a signal, for instance, around a sigwaitinfo 
call or, in this case, a kevent_wait call where the signal might be 
delivered to the queue.

There are countless possibilities.  Signals are very flexible.


> There are _no_ additional syscalls.
> I just introduced new case for event type.

Which is a new syscall.  All demultiplexer cases are new syscalls. 
Which, BTW, implies that unrecognized types should actually cause an 
ENOSYS return value (this affects kevent_break).  We've been over this 
many times.  If EINVAL is returned this case cannot be distinguished from 
invalid parameters.  This is crucial for future extensions where 
userland (esp glibc) needs to be able to determine whether a new feature 
is supported on the system.
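A sketch of how userland could tell the cases apart if unknown types 
yield ENOSYS.  The enum and the classify_probe() helper are hypothetical 
names for this illustration; "ret" and "err" stand for the return value 
and errno of whatever probing call is made.

```c
#include <errno.h>

enum probe_result {
    FEATURE_PRESENT,   /* the call succeeded */
    FEATURE_ABSENT,    /* kernel does not know this operation or type */
    BAD_ARGUMENTS,     /* operation exists, the caller's parameters are wrong */
    OTHER_FAILURE      /* operation exists but failed for another reason */
};

/* Interpret the outcome of a probing call. */
static enum probe_result classify_probe(long ret, int err)
{
    if (ret >= 0)
        return FEATURE_PRESENT;

    switch (err) {
    case ENOSYS:
        return FEATURE_ABSENT;
    case EINVAL:
        return BAD_ARGUMENTS;
    default:
        return OTHER_FAILURE;
    }
}
```

If the kernel answered EINVAL for unknown types as well, the 
FEATURE_ABSENT and BAD_ARGUMENTS cases would collapse into one and the 
library could not pick a fallback reliably.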


> You _need_ it to be done, since any kernel kevent user must have
> enqueue/dequeue/callback callbacks. It is just an implementation of that
> callbacks.

I don't question that.  But there is no need to add the callback.  It 
extends the kernel ABI/API.  And for what?  A vastly inferior timer 
implementation compared to the POSIX timers.  And this while all that 
needs to be done is to extend the POSIX timer code slightly to handle 
SIGEV_KEVENT in addition to the other notification methods currently 
used.  If you do it right then the code can be shared with the file AIO 
code which is currently being circulated as well and which uses parts 
of the POSIX timer infrastructure.


> Btw, how POSIX API should be extended to allow to queue events - queue
> is required (which is created when user calls kevent_init() or
> previoisly opened /dev/kevent), how should it be accessed, since it is
> just a file descriptor in process task_struct.

I've explained this multiple times.  The struct sigevent structure needs 
to be extended to get a new part in the union.  Something like

   struct {
     int kevent_fd;
     void *data;
   } _sigev_kevent;

Then define SIGEV_KEVENT as a value distinct from the other SIGEV_ 
values.  In the code which handles setup of timers (the timer_create 
syscall), recognize SIGEV_KEVENT and handle it appropriately.  I.e., 
call into the code to register the event source, just like you'd do with 
the current interface.  Then add the code to post an event to the event 
queue where currently signals would be sent et voilà.
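
To make the shape of this concrete, here is a self-contained userland 
model of the dispatch.  It is only a sketch: struct sketch_sigevent, the 
SKETCH_* constants and on_timer_expiry() are invented stand-ins, not the 
real struct sigevent or any kernel code, and the numeric values are 
arbitrary.

```c
#include <stddef.h>

/* Hypothetical notification methods; a real SIGEV_KEVENT would need a
 * value distinct from all existing SIGEV_* values. */
#define SKETCH_SIGEV_SIGNAL 0
#define SKETCH_SIGEV_KEVENT 100

/* Stand-in for struct sigevent with the proposed extra union member. */
struct sketch_sigevent {
    int notify;                 /* which notification method to use */
    union {
        int signo;              /* used with SKETCH_SIGEV_SIGNAL */
        struct {
            int kevent_fd;      /* event queue to post to */
            void *data;         /* cookie handed back with the event */
        } kevent;               /* used with SKETCH_SIGEV_KEVENT */
    } u;
};

enum sketch_action { SEND_SIGNAL, POST_TO_QUEUE, REJECT };

/* Decide how an expired timer reports, based on the notification method
 * chosen at timer creation time. */
static enum sketch_action on_timer_expiry(const struct sketch_sigevent *ev)
{
    switch (ev->notify) {
    case SKETCH_SIGEV_SIGNAL:
        return SEND_SIGNAL;
    case SKETCH_SIGEV_KEVENT:
        /* A negative descriptor cannot name an event queue. */
        return ev->u.kevent.kevent_fd >= 0 ? POST_TO_QUEUE : REJECT;
    default:
        return REJECT;
    }
}
```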

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
