linux-kernel.vger.kernel.org archive mirror
* Procfs race condition bug
@ 2012-07-20 18:12 Mike Cardwell
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Cardwell @ 2012-07-20 18:12 UTC (permalink / raw)
  To: linux-kernel

I *think* I've uncovered a race condition bug in procfs. If I attempt to
open a file in /proc/net, e.g. "/proc/net/tcp", it works fine, but if I
spawn a POSIX thread and attempt to do it from there, it *usually* fails
with "No such file or directory", but sometimes succeeds. If I do a
system call inside the thread to look up the thread ID and then open
"/proc/THREADID/net/tcp" instead, it works fine.

There are more details and some example code so you can replicate the
problem in a Stack Overflow question I asked earlier today here:
http://stackoverflow.com/questions/11580020/opening-proc-net-tcp-in-c-from-a-posix-thread-fails-most-of-the-time

This is the first time I have attempted to report a (suspected) Linux
kernel bug, so I apologise if I have made any mistakes. I am not
subscribed to the list, so please Cc me in on any responses.

Regards,

-- 
Mike Cardwell  https://grepular.com/     http://cardwellit.com/
OpenPGP Key    35BC AF1D 3AA2 1F84 3DC3  B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1  BF1B 295C 3C78 3EF1 46B4


* Re: Procfs race condition bug
  2014-07-09 12:20 ` Alexey Dobriyan
@ 2014-07-10  4:40   ` Eric W. Biederman
  0 siblings, 0 replies; 5+ messages in thread
From: Eric W. Biederman @ 2014-07-10  4:40 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: Mike Cardwell, Andrew Morton, Pavel Emelianov, Linux Kernel,
	netdev, One Thousand Gnomes

Alexey Dobriyan <adobriyan@gmail.com> writes:

> [broken email]
>
> On Wed, Jul 9, 2014 at 3:17 PM, Alexey Dobriyan <adobriyan@gmail.com> wrote:
>>> I originally posted this two years ago (*) but received no response.
>>> I just had a look and the problem still exists on the 3.14 kernel
>>> I am currently running.
>>>
>>> I *think* I've uncovered a race condition bug in procfs.
>>> If I attempt to open a file in /proc/net, e.g. "/proc/net/tcp",
>>> it works fine, but if I spawn a POSIX thread and attempt to do it
>>> from there, it *usually* fails with "No such file or directory",
>>> but sometimes succeeds. If I do a system call inside the thread
>>> to look up the thread ID and then open "/proc/THREADID/net/tcp"
>>> instead, it works fine.
>>>
>>> There are more details and some example code
>>> so you can replicate the problem in a Stack Overflow question
>>> I asked previously here:
>>> http://stackoverflow.com/questions/11580020/opening-proc-net-tcp-in-c-from-a-posix-thread-fails-most-of-the-time
>>>
>>> (*) https://lkml.org/lkml/2012/7/20/331
>>
>> Mike,
>>
>> as was correctly noted on SO, what's happening is that the original thread
>> exits before the spawned thread does open().
>>
>> ->lookup
>> proc_tgid_net_lookup
>> get_proc_task_net
>> nsproxy = NULL          <== thread is dead
>> ENOENT
>>
>> This was probably broken when /proc/net became a symlink:
>>
>> commit e9720acd728a46cb40daa52c99a979f7c4ff195c
>> Author: Pavel Emelyanov <xemul@openvz.org>
>> Date:   Fri Mar 7 11:08:40 2008 -0800
>>
>>     [NET]: Make /proc/net a symlink on /proc/self/net (v3)
>>
>>
>> So, userspace has two solutions:
>> 1) the original thread doesn't exit too early
>> 2) the spawned thread uses /proc/$TID
>>
>>
>> So,
>> we definitely broke /proc/net/tcp somewhere after the netns concept was
>> introduced.
>>
>> But,
>> you'd have the very same problem with other /proc files (anything under
>> /proc/$PID/).


Agreed, it is a /proc/$TGID vs /proc/$TID issue.

In principle this is fixable by creating a /proc/current symlink that
always points to the proc directory for the current thread and then
pointing /proc/net and /proc/mounts at it.

This is one of those weird cases where changing how /proc/net or
/proc/mounts resolves may actually break an existing userspace
application, because different threads can point at different values.
(I very much dislike what the Linux pthread support did to /proc/self).

I tilted at that windmill once and ran out of steam.  While I can write
the patch I don't have the energy to test and see if there are any
pthread programs that will break if /proc/net points to
/proc/current/net instead of /proc/self/net.

Frankly, new applications should be using netlink and not /proc/net, so I
personally don't think this is worth fixing for the /proc/net case.  Are
there real world applications that are broken by the kernel change in
behavior?  The stackoverflow discussion sounds like it was just an
investigation into weird kernel behavior.
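
For reference, the netlink route can be sketched roughly as follows. This
is only an illustration, assuming the sock_diag interface
(NETLINK_SOCK_DIAG, available since 3.3) is the kind of netlink query
meant here; the helper name count_tcp_sockets() is made up for the
example. It asks the kernel to dump all IPv4 TCP sockets and counts the
replies, without touching /proc at all:

```c
#include <linux/inet_diag.h>
#include <linux/netlink.h>
#include <linux/sock_diag.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): count IPv4 TCP sockets by
 * dumping them over NETLINK_SOCK_DIAG instead of parsing /proc/net/tcp.
 * Returns the number of sockets reported, or -1 on any error.
 */
static int count_tcp_sockets(void)
{
    struct {
        struct nlmsghdr nlh;
        struct inet_diag_req_v2 req;
    } msg;
    long buf[16384 / sizeof(long)];   /* long-aligned receive buffer */
    int count = 0;
    int fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_SOCK_DIAG);

    if (fd < 0)
        return -1;

    /* Build a dump request for every IPv4 TCP socket in any state. */
    memset(&msg, 0, sizeof(msg));
    msg.nlh.nlmsg_len = sizeof(msg);
    msg.nlh.nlmsg_type = SOCK_DIAG_BY_FAMILY;
    msg.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    msg.req.sdiag_family = AF_INET;
    msg.req.sdiag_protocol = IPPROTO_TCP;
    msg.req.idiag_states = ~0U;

    /* An unconnected netlink socket sends to the kernel by default. */
    if (send(fd, &msg, sizeof(msg), 0) < 0) {
        close(fd);
        return -1;
    }

    for (;;) {
        int len = (int)recv(fd, buf, sizeof(buf), 0);
        struct nlmsghdr *h;

        if (len <= 0) {
            close(fd);
            return -1;
        }
        /* Walk every netlink message packed into this datagram. */
        for (h = (struct nlmsghdr *)buf; NLMSG_OK(h, len);
             h = NLMSG_NEXT(h, len)) {
            if (h->nlmsg_type == NLMSG_DONE) {
                close(fd);
                return count;
            }
            if (h->nlmsg_type == NLMSG_ERROR) {
                close(fd);
                return -1;
            }
            count++;   /* one inet_diag_msg per socket */
        }
    }
}
```

Because the query goes through a socket rather than a path lookup, it
does not depend on whether /proc/self still resolves for the calling
thread. It can fail for other reasons (for example if the diag modules
are unavailable), in which case the helper returns -1.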

Eric

* Re: Procfs race condition bug
  2014-07-09 12:17 Alexey Dobriyan
@ 2014-07-09 12:20 ` Alexey Dobriyan
  2014-07-10  4:40   ` Eric W. Biederman
  0 siblings, 1 reply; 5+ messages in thread
From: Alexey Dobriyan @ 2014-07-09 12:20 UTC (permalink / raw)
  To: Mike Cardwell
  Cc: Andrew Morton, Pavel Emelianov, Eric W. Biederman, Linux Kernel

[broken email]

On Wed, Jul 9, 2014 at 3:17 PM, Alexey Dobriyan <adobriyan@gmail.com> wrote:
>> I originally posted this two years ago (*) but received no response.
>> I just had a look and the problem still exists on the 3.14 kernel
>> I am currently running.
>>
>> I *think* I've uncovered a race condition bug in procfs.
>> If I attempt to open a file in /proc/net, e.g. "/proc/net/tcp",
>> it works fine, but if I spawn a POSIX thread and attempt to do it
>> from there, it *usually* fails with "No such file or directory",
>> but sometimes succeeds. If I do a system call inside the thread
>> to look up the thread ID and then open "/proc/THREADID/net/tcp"
>> instead, it works fine.
>>
>> There are more details and some example code
>> so you can replicate the problem in a Stack Overflow question
>> I asked previously here:
>> http://stackoverflow.com/questions/11580020/opening-proc-net-tcp-in-c-from-a-posix-thread-fails-most-of-the-time
>>
>> (*) https://lkml.org/lkml/2012/7/20/331
>
> Mike,
>
> as was correctly noted on SO, what's happening is that the original thread
> exits before the spawned thread does open().
>
> ->lookup
> proc_tgid_net_lookup
> get_proc_task_net
> nsproxy = NULL          <== thread is dead
> ENOENT
>
> This was probably broken when /proc/net became a symlink:
>
> commit e9720acd728a46cb40daa52c99a979f7c4ff195c
> Author: Pavel Emelyanov <xemul@openvz.org>
> Date:   Fri Mar 7 11:08:40 2008 -0800
>
>     [NET]: Make /proc/net a symlink on /proc/self/net (v3)
>
>
> So, userspace has two solutions:
> 1) the original thread doesn't exit too early
> 2) the spawned thread uses /proc/$TID
>
>
> So,
> we definitely broke /proc/net/tcp somewhere after the netns concept was
> introduced.
>
> But,
> you'd have the very same problem with other /proc files (anything under
> /proc/$PID/).
>
>     Alexey
>
>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <pthread.h>
> #include <unistd.h>
>
> /* Runs in the spawned thread, after the main thread has exited. */
> void *f(void *_)
> {
>     int fd;
>
>     /* Give the main thread time to exit first. */
>     sleep(1);
>
>     fd = open("/proc/net/tcp", O_RDONLY);
>     if (fd == -1) {
>         fprintf(stderr, "FAIL\n");
>         return NULL;
>     }
>     fprintf(stderr, "OK\n");
>     close(fd);
>     return NULL;
> }
>
> int main(void)
> {
>     pthread_t thread;
>
>     pthread_create(&thread, NULL, f, NULL);
>     /* Exit only the main thread; the process lives on until f() returns. */
>     pthread_exit(NULL);
>     return 0; /* not reached */
> }

* Re: Procfs race condition bug
@ 2014-07-09 12:17 Alexey Dobriyan
  2014-07-09 12:20 ` Alexey Dobriyan
  0 siblings, 1 reply; 5+ messages in thread
From: Alexey Dobriyan @ 2014-07-09 12:17 UTC (permalink / raw)
  To: Mike Cardwell
  Cc: Andrew Morton, Pavel Emelianov, Eric W. Biederman, Linux Kernel

> I originally posted this two years ago (*) but received no response.
> I just had a look and the problem still exists on the 3.14 kernel
> I am currently running.
>
> I *think* I've uncovered a race condition bug in procfs.
> If I attempt to open a file in /proc/net, e.g. "/proc/net/tcp",
> it works fine, but if I spawn a POSIX thread and attempt to do it
> from there, it *usually* fails with "No such file or directory",
> but sometimes succeeds. If I do a system call inside the thread
> to look up the thread ID and then open "/proc/THREADID/net/tcp"
> instead, it works fine.
>
> There are more details and some example code
> so you can replicate the problem in a Stack Overflow question
> I asked previously here:
> http://stackoverflow.com/questions/11580020/opening-proc-net-tcp-in-c-from-a-posix-thread-fails-most-of-the-time
>
> (*) https://lkml.org/lkml/2012/7/20/331

Mike,

as was correctly noted on SO, what's happening is that the original thread
exits before the spawned thread does open().

->lookup
proc_tgid_net_lookup
get_proc_task_net
nsproxy = NULL          <== thread is dead
ENOENT

This was probably broken when /proc/net became a symlink:

commit e9720acd728a46cb40daa52c99a979f7c4ff195c
Author: Pavel Emelyanov <xemul@openvz.org>
Date:   Fri Mar 7 11:08:40 2008 -0800

    [NET]: Make /proc/net a symlink on /proc/self/net (v3)


So, userspace has two solutions:
1) the original thread doesn't exit too early
2) the spawned thread uses /proc/$TID
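
Solution 1 could look roughly like the sketch below: the original
thread waits in pthread_join() instead of calling pthread_exit(), so the
thread group leader is still alive when the spawned thread does its
open(). The helper names try_open() and open_from_joined_thread() are
made up for illustration.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Thread body: try to open the path in arg; yield 0 on success, -1 on failure. */
static void *try_open(void *arg)
{
    int fd = open((const char *)arg, O_RDONLY);

    if (fd == -1)
        return (void *)(intptr_t)-1;
    close(fd);
    return (void *)(intptr_t)0;
}

/*
 * Hypothetical helper: open `path` from a spawned thread while the
 * calling thread stays alive in pthread_join() rather than exiting.
 * Returns 0 if the open succeeded, -1 otherwise.
 */
static int open_from_joined_thread(const char *path)
{
    pthread_t thread;
    void *result;

    if (pthread_create(&thread, NULL, try_open, (void *)path) != 0)
        return -1;
    /* Keep the original thread around until the spawned thread is done. */
    if (pthread_join(thread, &result) != 0)
        return -1;
    return (int)(intptr_t)result;
}
```

With the leader kept alive this way, open_from_joined_thread("/proc/net/tcp")
is expected to behave like an open() from the main thread.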


So,
we definitely broke /proc/net/tcp somewhere after the netns concept was
introduced.

But,
you'd have the very same problem with other /proc files (anything under
/proc/$PID/).

    Alexey


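#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

/* Runs in the spawned thread, after the main thread has exited. */
void *f(void *_)
{
    int fd;

    /* Give the main thread time to exit first. */
    sleep(1);

    fd = open("/proc/net/tcp", O_RDONLY);
    if (fd == -1) {
        fprintf(stderr, "FAIL\n");
        return NULL;
    }
    fprintf(stderr, "OK\n");
    close(fd);
    return NULL;
}

int main(void)
{
    pthread_t thread;

    pthread_create(&thread, NULL, f, NULL);
    /* Exit only the main thread; the process lives on until f() returns. */
    pthread_exit(NULL);
    return 0; /* not reached */
}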
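
Solution 2 could be sketched as below: instead of going through
/proc/net (which resolves via /proc/self, i.e. the possibly dead thread
group leader), the thread builds a path from its own TID. The helper
name open_thread_proc_file() is made up for illustration, and the raw
gettid syscall is used on the assumption that libc provides no wrapper.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: open /proc/<tid>/<relpath> for the calling
 * thread, e.g. relpath "net/tcp" instead of "/proc/net/tcp".
 * Returns a file descriptor, or -1 on failure.
 */
static int open_thread_proc_file(const char *relpath)
{
    char path[128];
    long tid = syscall(SYS_gettid);   /* kernel thread ID of the caller */
    int n = snprintf(path, sizeof(path), "/proc/%ld/%s", tid, relpath);

    /* Refuse to open a truncated path. */
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return open(path, O_RDONLY);
}
```

In the test program above, replacing the open("/proc/net/tcp", O_RDONLY)
call in f() with open_thread_proc_file("net/tcp") corresponds to the
/proc/THREADID/net/tcp approach that Mike reported as working.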

* Procfs race condition bug
@ 2014-07-04 10:13 Mike Cardwell
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Cardwell @ 2014-07-04 10:13 UTC (permalink / raw)
  To: linux-kernel

I originally posted this two years ago (*) but received no response. I
just had a look and the problem still exists on the 3.14 kernel I am
currently running.

I *think* I've uncovered a race condition bug in procfs. If I attempt to
open a file in /proc/net, e.g. "/proc/net/tcp", it works fine, but if I
spawn a POSIX thread and attempt to do it from there, it *usually* fails
with "No such file or directory", but sometimes succeeds. If I do a
system call inside the thread to look up the thread ID and then open
"/proc/THREADID/net/tcp" instead, it works fine.

There are more details and some example code so you can replicate the
problem in a Stack Overflow question I asked previously here:
http://stackoverflow.com/questions/11580020/opening-proc-net-tcp-in-c-from-a-posix-thread-fails-most-of-the-time

(*) https://lkml.org/lkml/2012/7/20/331

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key    35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4
