I've bumped SRCREV to include your latest fix and re-enabled epoll, and
now it doesn't get stuck for me, thanks!

I did a quick check for the UID/GID issue:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=12434
Without epoll disabled and SRCREV "b6a015a Handle O_TMPFILE more better"
it was still reproducible; now I'm testing with epoll enabled and the
latest SRCREV "f3f4459 Epoll: use the correct client".

Regards,

On Mon, Feb 19, 2018 at 6:55 PM, Seebs wrote:
> On Mon, 19 Feb 2018 11:27:56 +0200
> Alexander Kanavin wrote:
>
> > > Huh. It's possible that the initial "don't try to close fd 0" was
> > > correct, and the real problem is that the attempt is getting made
> > > mistakenly. I'll study that more; the epoll code was a contribution
> > > and I may not have fully understood it.
> >
> > To be honest, it would have been better to apply my epoll patch as it
> > is, and then do additional modifications as separate commits. That
> > would make it simpler to isolate the issue. We've used my epoll patch
> > for many months without problems on the autobuilder and elsewhere.
>
> ... Wow, you know, now that you *mention* it, that is a really good
> idea. *sigh* Sorry about that.
>
> Hmm.
>
> > if (clients[events[i].data.u64].fd == listen_fd) {
>
> Just a sanity-check: Should this be equivalent to:
> if (events[i].data.u64 == 0)
> ?
>
> The reason I ask is that, looking at the code, we should never, ever,
> be getting into close_client(0). The "<=" check was right.
>
> The only call to close_client anywhere in the epoll case is:
>
> > } else {
> >         int n = 0;
> >         ioctl(clients[i].fd, FIONREAD, &n);
> >         if (n == 0) {
> >                 close_client(i);
> >         } else {
> >                 serve_client(i);
> >         }
>
> And that's the else for clients... oh hey
>
> > if (clients[events[i].data.u64].fd == listen_fd) {
> ...
> > ioctl(clients[i].fd, FIONREAD, &n);
>
> do you see the error? I do.
>
> This gets back to "and one of the problems with testing is that
> if I don't actually check the logs, I often don't see problems",
> because pseudo does enough internal disaster recovery that things can
> explode horribly without observable failures.
>
> Now extracting the data.u64 value and using that consistently as the
> index. Pushed fix to master.
>
> -s
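
For anyone following along, the mix-up quoted above can be sketched roughly as follows. This is a minimal, hypothetical reconstruction and not pseudo's actual code: `struct client`, its `pending` field (a stand-in for what `ioctl(FIONREAD)` would report), `enum action`, `MAX_CLIENTS`, and `dispatch()` are all names assumed for illustration. The only point it makes is that the loop counter `i` indexes `events[]`, while the client slot has to come from `events[i].data.u64`.

```c
#include <stdint.h>
#include <sys/epoll.h>

#define MAX_CLIENTS 8   /* assumed table size, for illustration only */

/* Hypothetical client table entry. */
struct client {
    int fd;
    int pending;        /* bytes waiting; stand-in for ioctl(FIONREAD) */
};

/* What the loop decided to do with a given client slot. */
enum action { NONE, ACCEPTED, CLOSED, SERVED };

/* Walk one batch of ready events and record the decision per client slot. */
static void dispatch(const struct epoll_event *events, int nready,
                     const struct client *clients, int listen_fd,
                     enum action *actions)
{
    for (int i = 0; i < nready; i++) {
        /* Extract the client index once; i only ever indexes events[]. */
        uint64_t idx = events[i].data.u64;

        if (idx >= MAX_CLIENTS)
            continue;   /* ignore indices outside the table */

        if (clients[idx].fd == listen_fd) {
            actions[idx] = ACCEPTED;
        } else if (clients[idx].pending == 0) {
            /* The quoted code used clients[i] and close_client(i) here,
             * so the first ready event could close slot 0 by mistake. */
            actions[idx] = CLOSED;
        } else {
            actions[idx] = SERVED;
        }
    }
}
```

With ready events for slots 5 and 2, this touches only slots 5 and 2; indexing by `i` instead would have acted on slots 0 and 1.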