* Behavior of poll() within a module @ 2001-10-23 12:53 Richard B. Johnson 2001-10-23 13:33 ` Mike Jagdis 0 siblings, 1 reply; 8+ messages in thread From: Richard B. Johnson @ 2001-10-23 12:53 UTC (permalink / raw) To: Linux kernel What is the intended behavior of poll within a module when two or more tasks are sleeping in poll? Specifically, when wake_up_interruptible is executed from a module, are all tasks awakened or is only one? If only one, is it the first task to call poll/select or the last, which is awakened first? Cheers, Dick Johnson Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips). I was going to compile a list of innovations that could be attributed to Microsoft. Once I realized that Ctrl-Alt-Del was handled in the BIOS, I found that there aren't any. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module 2001-10-23 12:53 Behavior of poll() within a module Richard B. Johnson @ 2001-10-23 13:33 ` Mike Jagdis 2001-10-23 14:13 ` Richard B. Johnson 0 siblings, 1 reply; 8+ messages in thread From: Mike Jagdis @ 2001-10-23 13:33 UTC (permalink / raw) To: root; +Cc: Linux kernel Richard B. Johnson wrote: > What is the intended behavior of poll within a module when > two or more tasks are sleeping in poll? Specifically, when > wake_up_interruptible is executed from a module, are all > tasks awakened or is only one? If only one, is it the > first task to call poll/select or the last, which is awakened > first? Normally all but... In the case of accept() the kernel knows that a woken up process will service the event (the code is all in the kernel) so the accept code flags its wait_queue entry to say, "If this gets woken don't wake the rest". Unfortunately the way wake one and wake all semantics are handled within the wait queue mean we get FIFO rather than LIFO behaviour I believe :-( I guess you could do "the accept() thing" yourself from a module - or add a Linux specific poll flag and do it from user space for that matter. Oh, and yes, wake all is required behaviour of poll/select. It's just accept that is special because all the code is in the kernel and if it gets woken we _know_ that event is no longer present. Mike ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module 2001-10-23 13:33 ` Mike Jagdis @ 2001-10-23 14:13 ` Richard B. Johnson 2001-10-24 9:52 ` Mike Jagdis 0 siblings, 1 reply; 8+ messages in thread From: Richard B. Johnson @ 2001-10-23 14:13 UTC (permalink / raw) To: Mike Jagdis; +Cc: Linux kernel On Tue, 23 Oct 2001, Mike Jagdis wrote: > Richard B. Johnson wrote: > > What is the intended behavior of poll within a module when > > two or more tasks are sleeping in poll? Specifically, when > > wake_up_interruptible is executed from a module, are all > > tasks awakened or is only one? If only one, is it the > > first task to call poll/select or the last, which is awakened > > first? > > Normally all but... In the case of accept() the kernel knows > that a woken up process will service the event (the code is > all in the kernel) so the accept code flags its wait_queue > entry to say, "If this gets woken don't wake the rest". > Unfortunately the way wake one and wake all semantics are > handled within the wait queue mean we get FIFO rather than > LIFO behaviour I believe :-( > > I guess you could do "the accept() thing" yourself from a > module - or add a Linux specific poll flag and do it from > user space for that matter. > > Oh, and yes, wake all is required behaviour of poll/select. > It's just accept that is special because all the code is in > the kernel and if it gets woken we _know_ that event is no > longer present. > > Mike > Well it doesn't seem to do as described. The following actual module code: static unsigned int vxi_poll(struct file *fp, struct poll_table_struct *wait) { unsigned long flags; unsigned int mask; DEB(printk("vxi_poll\n")); info->poll_active++; poll_wait(fp, &info->wait, wait); spin_lock_irqsave(&vxi_lock, flags); mask = info->poll_mask; if(!--info->poll_active) info->poll_mask = 0; spin_unlock_irqrestore(&vxi_lock, flags); DEB(printk("vxi_poll returns\n")); return mask; } ... results in one task getting the correct return value. Other tasks get 0 if they are awakened at all. With two tasks sleeping in user mode select(), both tasks seem to always be awakened. With three or more tasks waiting in select(), only one task gets awakened. I haven't been able to figure it out. Also, with no tasks sleeping in select(), debugging shows that the module poll() routine is entered, based upon some previous history. For instance, if a user-mode task never called select, but some event that would have awakened the task occurs, another task that calls select ends up returning immediately with a stale (weeks old) event. Maybe there is some way to clear kernel history that could be accomplished during close()? Cheers, Dick Johnson Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips). I was going to compile a list of innovations that could be attributed to Microsoft. Once I realized that Ctrl-Alt-Del was handled in the BIOS, I found that there aren't any. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module 2001-10-23 14:13 ` Richard B. Johnson @ 2001-10-24 9:52 ` Mike Jagdis 0 siblings, 0 replies; 8+ messages in thread From: Mike Jagdis @ 2001-10-24 9:52 UTC (permalink / raw) To: root; +Cc: Linux kernel Richard B. Johnson wrote: > ... results in one task getting the correct return value. Other tasks > get 0 if they are awakened at all. With two tasks sleeping in user mode > select(), both tasks seem to always be awakened. With three or more > tasks waiting in select(), only one task gets awakened. I haven't > been able to figure it out. Ah, now there's a difference between the process being woken up and poll/select returning. When an event happens the sleeping processes all get woken up. One of them gets scheduled - which means it continues from where it went to sleep in the kernel's poll code. The first thing that does is scan the descriptors to see if any events are outstanding. If they are poll() returns. Otherwise the process goes back to sleep. If the first process scheduled returns to user space and does a read/write/whatever it can clear the event condition on the descriptor. Then when other processes eventually get scheduled they recheck the descriptor, find no interesting events, and go back to sleep. Of course, if the first process returns then has to wait for, say, a page to be faulted in a second process may get scheduled, see the event, and also return to user space. So you get one _or more_ processes returning from poll(). This is why you should always use non-blocking I/O if there are multiple processes doing poll/select on the same descriptors :-). > Also, with no tasks sleeping in select(), debugging shows that the > module poll() routine is entered, based upon some previous history. > For instance, if a user-mode task never called select, but some > event that would have awakened the task occurs, another task that > calls select ends up returning immediately with a stale (weeks old) > event. > > Maybe there is some way to clear kernel history that could be > accomplished during close()? There is no history. Poll is level based, not edge based. When you call poll() the descriptor's poll function is polled to find the current state. If there are no flags of interest set the process sleeps. A wake up on the descriptor is simply an indication that something _might_ have changed and sleeping processes should wake up and invoke the descriptor's poll function to recheck the state. If I followed what you were trying to do correctly I don't think poll/select is the way to go. I think what you want is to have an "event count" associated with the driver, incrementing on each interrupt (or whatever changes state). Then use an ioctl to for user space apps to track it. The app passes the last event number it handled through the ioctl. If the driver event number is greater it is returned otherwise the process is added to the descriptor's wait_queue. Oh yeah, the interrupt should call wake_up on the descriptor as well as increment the event number :-). Mike ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module
@ 2001-10-23 15:58 Manfred Spraul
2001-10-23 17:17 ` Richard B. Johnson
0 siblings, 1 reply; 8+ messages in thread
From: Manfred Spraul @ 2001-10-23 15:58 UTC (permalink / raw)
To: "Richard B. Johnson"; +Cc: linux-kernel
>
> The following actual module code:
>
> static unsigned int vxi_poll(struct file *fp, struct poll_table_struct *wait)
> {
> unsigned long flags;
> unsigned int mask;
> DEB(printk("vxi_poll\n"));
> info->poll_active++;
> poll_wait(fp, &info->wait, wait);
> spin_lock_irqsave(&vxi_lock, flags);
> mask = info->poll_mask;
> if(!--info->poll_active)
> info->poll_mask = 0;
> spin_unlock_irqrestore(&vxi_lock, flags);
> DEB(printk("vxi_poll returns\n"));
> return mask;
> }
Which module is that? I can't find it in Linus tree.
Is "info" a global variable?
* poll is called without any SMP locking, "info->poll_active++" is not SMP safe. Use atomic_inc, or even better just delete that
line.
* Clearing poll_mask during poll is wrong.
poll should return the events that are currently available, i.e. what would happen if read() or write() would be called now.
read() on a non-blocking file handle would return immediately with 1 or more bytes read --> set POLLIN
write() on a non-blocking file handle would return immediately with a nonzero byte count written--> set POLLOUT.
The clearing of poll_mask must occur during read() and write() if these conditions are not true anymore.
--
Manfred
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module 2001-10-23 15:58 Manfred Spraul @ 2001-10-23 17:17 ` Richard B. Johnson 2001-10-23 18:02 ` Manfred Spraul 0 siblings, 1 reply; 8+ messages in thread From: Richard B. Johnson @ 2001-10-23 17:17 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel On Tue, 23 Oct 2001, Manfred Spraul wrote: > > > > The following actual module code: > > > > static unsigned int vxi_poll(struct file *fp, struct poll_table_struct *wait) > > { > > unsigned long flags; > > unsigned int mask; > > DEB(printk("vxi_poll\n")); > > info->poll_active++; > > poll_wait(fp, &info->wait, wait); > > spin_lock_irqsave(&vxi_lock, flags); > > mask = info->poll_mask; > > if(!--info->poll_active) > > info->poll_mask = 0; > > spin_unlock_irqrestore(&vxi_lock, flags); > > DEB(printk("vxi_poll returns\n")); > > return mask; > > } > Which module is that? I can't find it in Linus tree. > Is "info" a global variable? info is a global, a structure in allocated memory. > > * poll is called without any SMP locking, "info->poll_active++" is not SMP safe. Use atomic_inc, or even better just delete that > line. > * Clearing poll_mask during poll is wrong. > poll should return the events that are currently available, > i.e. what would happen if read() or write() would be called now. > In this module, there isn't any read() or write() event that can clear the poll mask. Instead, the sole purpose of poll is to tell the user-mode caller that there is new status available as a result of an interrupt. This is a module that controls a motor. It runs <forever> on its own. It gets new parameters via an ioctl(). It reports exceptions (like overload conditions) using the poll mechanism. The caller reads the cached status via an ioctl(). Any caller sleeping in poll must be awakened as a result of the interrupt event. Any caller can read the cached status at any time. If this was allowed to clear the poll mask, only one caller would be awakened. If I don't clear the poll mask when the last caller has been awakened, then any subsequent call to poll results in an immediate bogus return. Although the code may need to be changed, the logic is correct and it must work as specified. The problem is that sometimes it works and sometimes it doesn't. In particular, if one user is sleeping in poll() it works. If two users are sleeping in poll() sometimes one of the callers doesn't get awakened. If three users are sleeping in poll(), all bets are off and I get strange return values, not related to the poll mask. It four users are sleeping in poll(), they all return immediately <forever>. Going to a single poll-caller doesn't fix the problem, some history is left in the kernel. I have to remove the module, then insert the module to "fix" the problem. Here is a little test program that shows the same problem when executed using /dev/console, (or any other device you chose) etc., for input multiple times.. /* * This tests select() in various modules. */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/poll.h> #include <errno.h> #include <string.h> int main(int args, char *argv[]) { int fd, retval; fd_set rfds; char buf[0x1000]; if(args < 2) { fprintf(stderr,"Usage:\n%s device\n", argv[0]); exit(EXIT_FAILURE); } if((fd = open(argv[1], O_RDWR)) < 0) { fprintf(stderr, "Can't open device, %s\n", argv[1]); exit(EXIT_FAILURE); } for(;;) { FD_ZERO(&rfds); FD_SET(fd, &rfds); if((retval = select(fd+1, &rfds, NULL, NULL, NULL)) < 0) fprintf(stderr, "Error was %s\n", strerror(errno)); else { // printf("Return = %d\n", retval); if(FD_ISSET(fd, &rfds)) { fprintf(stderr,"."); // (void)read(fd, buf, sizeof(buf)); } // printf("Input is available\n"); } } if(close(fd) < 0) { fprintf(stderr, "Can't close device, %s\n", argv[1]); exit(EXIT_FAILURE); } return 0; } Cheers, Dick Johnson Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips). I was going to compile a list of innovations that could be attributed to Microsoft. Once I realized that Ctrl-Alt-Del was handled in the BIOS, I found that there aren't any. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module 2001-10-23 17:17 ` Richard B. Johnson @ 2001-10-23 18:02 ` Manfred Spraul 2001-10-23 18:16 ` Richard B. Johnson 0 siblings, 1 reply; 8+ messages in thread From: Manfred Spraul @ 2001-10-23 18:02 UTC (permalink / raw) To: root; +Cc: linux-kernel From: "Richard B. Johnson" <root@chaos.analogic.com> > > In this module, there isn't any read() or write() event that can > clear the poll mask. Instead, the sole purpose of poll is to tell > the user-mode caller that there is new status available as a result > of an interrupt. This is a module that controls a motor. It runs > <forever> on its own. It gets new parameters via an ioctl(). It > reports exceptions (like overload conditions) using the poll > mechanism. > > The caller reads the cached status via an ioctl(). Any caller sleeping > in poll must be awakened as a result of the interrupt event. Any caller > can read the cached status at any time. If this was allowed to > clear the poll mask, only one caller would be awakened. > Ugh. ->poll must never change any state. The kernel is free to call poll multiple times (common are once, twice and three times). Can you use a per-filp pollmask? * remove poll_active * remove poll_mask * add event counters for every possible event. poll_count_POLLIN, poll_count_POLLOUT, etc. * interrupt handler increases the counter. * ioctl() copies the value of the counters into filp->private_data->event_handled_POLL{IN,OUT} * poll sets the pollmask if filp->private_data->event_seen != info->poll_count If filps (file descriptors) are shared between apps, then I have no idea how to fix your design. -- Manfred ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Behavior of poll() within a module 2001-10-23 18:02 ` Manfred Spraul @ 2001-10-23 18:16 ` Richard B. Johnson 0 siblings, 0 replies; 8+ messages in thread From: Richard B. Johnson @ 2001-10-23 18:16 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel On Tue, 23 Oct 2001, Manfred Spraul wrote: > From: "Richard B. Johnson" <root@chaos.analogic.com> > > > > In this module, there isn't any read() or write() event that can > > clear the poll mask. Instead, the sole purpose of poll is to tell > > the user-mode caller that there is new status available as a result > > of an interrupt. This is a module that controls a motor. It runs > > <forever> on its own. It gets new parameters via an ioctl(). It > > reports exceptions (like overload conditions) using the poll > > mechanism. > > > > The caller reads the cached status via an ioctl(). Any caller sleeping > > in poll must be awakened as a result of the interrupt event. Any caller > > can read the cached status at any time. If this was allowed to > > clear the poll mask, only one caller would be awakened. > > > Ugh. > ->poll must never change any state. The kernel is free to call poll > multiple times (common are once, twice and three times). That's the problem. It calls poll even when there is not any user-mode select/poll active. > > Can you use a per-filp pollmask? Yes. You've got it. Private data is a place to put the per-process poll mask. > * remove poll_active > * remove poll_mask Yes. > * add event counters for every possible event. > poll_count_POLLIN, poll_count_POLLOUT, etc. > * interrupt handler increases the counter. > * ioctl() copies the value of the counters into > filp->private_data->event_handled_POLL{IN,OUT} > * poll sets the pollmask if > filp->private_data->event_seen != info->poll_count > Yes. May not actually need counters because each interrupt will cache a status that can be read by anybody. The struct file_pointer.private_data is the savior. > If filps (file descriptors) are shared between apps, then I have no > idea how to fix your design. > Cheers, Dick Johnson Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips). I was going to compile a list of innovations that could be attributed to Microsoft. Once I realized that Ctrl-Alt-Del was handled in the BIOS, I found that there aren't any. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2001-10-24 9:52 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2001-10-23 12:53 Behavior of poll() within a module Richard B. Johnson 2001-10-23 13:33 ` Mike Jagdis 2001-10-23 14:13 ` Richard B. Johnson 2001-10-24 9:52 ` Mike Jagdis 2001-10-23 15:58 Manfred Spraul 2001-10-23 17:17 ` Richard B. Johnson 2001-10-23 18:02 ` Manfred Spraul 2001-10-23 18:16 ` Richard B. Johnson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).