* Wrong UIDs reported in /proc/net/tcp
@ 2004-11-09 20:53 Chad N. Tindel
  2004-11-09 21:06 ` Herbert Xu
  2004-11-09 21:11 ` akepner
  0 siblings, 2 replies; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-09 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-net

Hello-

In our testing we've found some occurrences of identd reporting back the
wrong username for the owner of a socket.  We added some instrumentation
to identd so that we can tell it what we expect the username to be, and when
what it discovers doesn't match what we expect, it logs a message.  Sometimes
non-root users appear as root, and sometimes the root user appears as a non-root
user.  Here's an example log message from our instrumented identd:

Nov  9 00:55:11 rock identd[4139]: ERROR: Expected username 'root' but got 'ident'. Proc line was ' 299: AFAF0D0F:CEE1 ACAF0D0F:14B6 01 00000000:00000000 00:00000000 00000000   100        0 880370 1 f22bb980 21 4 8 3 -1 '.  EUID was 100

As you can see, our client was running as root, but identd reported that
the username was "ident".  /proc/net/tcp reported that the EUID was 100.  
Clearly it isn't correct that the ident user is associated with a socket that
doesn't have port 113 as one of the endpoints.
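For anyone following along, here is a minimal sketch of how a record like the one in the log message above can be picked apart. It assumes the column order visible in that log line (slot, local address:port, remote address:port, state, queues, timer, retransmits, then uid as the eighth field); the struct and function names are made up for illustration and are not taken from pidentd.

```c
#include <stdio.h>

/* Fields of interest from one /proc/net/tcp record (hypothetical struct). */
struct tcp_proc_entry {
    unsigned int local_addr, local_port;    /* raw hex values as printed */
    unsigned int remote_addr, remote_port;
    unsigned int state;                     /* 0x01 is ESTABLISHED */
    unsigned int uid;                       /* decimal uid column */
};

/*
 * Parse one record. Returns 0 on success, -1 if the line does not match
 * the assumed layout (the header line, or a truncated record).
 */
int parse_tcp_proc_line(const char *line, struct tcp_proc_entry *e)
{
    int n = sscanf(line,
                   " %*d: %x:%x %x:%x %x %*x:%*x %*x:%*x %*x %u",
                   &e->local_addr, &e->local_port,
                   &e->remote_addr, &e->remote_port,
                   &e->state, &e->uid);

    return n == 6 ? 0 : -1;
}
```

Fed the first log line above, this yields uid 100 with remote port 0x14B6 (5302 decimal). Note that a parser like this can only report what the line says; it cannot tell whether the two halves of the line came from the same socket.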

Here's another example:

Nov  8 17:19:06 rock identd[4139]: ERROR: Expected username 'rba0001f' but got 'root'. Proc line was '  19: AFAF0D0F:8FFD AFAF0D0F:14B6 01 00000000:00000000 02:000AFC6F 00000000     0        0 0 2 f04ed980 '.  EUID was 0

The user rba0001f has a UID of 65535 on this system.

We have seen this problem on RH3u3 as well as SLES9, so our current thoughts
are that it is a generic kernel issue.  We're starting to dive down in the
code to look for possible problems, but I wanted to bring it up to the list
and see if anybody had any ideas.  

In our cursory analysis of the code, we've seen that the inode structure which
holds the user id comes out of a cache.  In the function sock_alloc(), the
UID gets assigned as:

inode->i_uid = current->fsuid;

Is it possible for this code to be invoked in a context where current doesn't
actually point to the right task structure?

Is it possible for the socket to get added to the established hash table
before the uid pointer gets initialized, thus causing /proc/net/tcp to report
the UID of whoever used this socket memory earlier?

We can reproduce this problem fairly easily here, so if anybody has any ideas
or suggestions on kernel instrumentation that would help track this down, I'm
all ears.

Regards,

Chad


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 20:53 Wrong UIDs reported in /proc/net/tcp Chad N. Tindel
@ 2004-11-09 21:06 ` Herbert Xu
  2004-11-09 22:43   ` Chad N. Tindel
  2004-11-18 19:02   ` Chad N. Tindel
  2004-11-09 21:11 ` akepner
  1 sibling, 2 replies; 19+ messages in thread
From: Herbert Xu @ 2004-11-09 21:06 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: netdev, linux-net

Chad N. Tindel <chad@tindel.net> wrote:
> 
> In our testing we've found some occurrences of identd reporting back the
> wrong username for the owner of a socket.  We added some instrumentation

/proc/net/tcp is an obsolete interface.  It is inherently unreliable
in that a record may be read using two read(2) calls.  Those two calls
may end up looking at two different records.

So please use the netlink interface or ss(8) from the iproute package.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 20:53 Wrong UIDs reported in /proc/net/tcp Chad N. Tindel
  2004-11-09 21:06 ` Herbert Xu
@ 2004-11-09 21:11 ` akepner
  2004-11-09 22:41   ` Chad N. Tindel
  1 sibling, 1 reply; 19+ messages in thread
From: akepner @ 2004-11-09 21:11 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: netdev, linux-net

On Tue, 9 Nov 2004, Chad N. Tindel wrote:

> ....
> In our testing we've found some occurrences of identd reporting back the
> wrong username for the owner of a socket. .....

The following may be of interest:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=44854

--
Arthur



* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 21:11 ` akepner
@ 2004-11-09 22:41   ` Chad N. Tindel
  0 siblings, 0 replies; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-09 22:41 UTC (permalink / raw)
  To: akepner; +Cc: netdev, linux-net

> The following may be of interest:
> 
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=44854

This isn't really the same issue.  This looks like an euid vs. uid problem
that reproduces every time.  

The problem we are seeing is a timing-window-race-condition kind of problem
that happens maybe 1 in 100,000 times.

Chad


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 21:06 ` Herbert Xu
@ 2004-11-09 22:43   ` Chad N. Tindel
  2004-11-09 22:58     ` Herbert Xu
  2004-11-18 19:02   ` Chad N. Tindel
  1 sibling, 1 reply; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-09 22:43 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, linux-net

> > In our testing we've found some occurrences of identd reporting back the
> > wrong username for the owner of a socket.  We added some instrumentation
> 
> /proc/net/tcp is an obsolete interface.  It is inherently unreliable
> in that a record may be read using two read(2) calls.  Those two calls
> may end up looking at two different records.

Is it unreliable in that the wrong user id will get returned for an 
established socket?  How could such a thing happen?

Chad


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 22:43   ` Chad N. Tindel
@ 2004-11-09 22:58     ` Herbert Xu
  2004-11-09 23:04       ` Chad N. Tindel
  0 siblings, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2004-11-09 22:58 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: netdev, linux-net

On Tue, Nov 09, 2004 at 05:43:37PM -0500, Chad N. Tindel wrote:
>
> > /proc/net/tcp is an obsolete interface.  It is inherently unreliable
> > in that a record may be read using two read(2) calls.  Those two calls
> > may end up looking at two different records.
> 
> Is it unreliable in that the wrong user id will get returned for an 
> established socket?  How could such a thing happen?

In 2.4 it is entirely possible to have a record broken up into two
reads.  There is no guarantee that the two reads will be reading the
same record.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 22:58     ` Herbert Xu
@ 2004-11-09 23:04       ` Chad N. Tindel
  2004-11-09 23:18         ` Herbert Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-09 23:04 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, linux-net

> In 2.4 it is entirely possible to have a record broken up into two
> reads.  There is no guarantee that the two reads will be reading the
> same record.

Let me make sure I understand what you're saying here...

You're saying that since pidentd is calling fgets(), that can actually result
in multiple read() calls.  Because of this, the first half of the line
containing the address:port pairs can be with respect to one socket, and the 
second half containing the euid can be from another socket?

Chad


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 23:04       ` Chad N. Tindel
@ 2004-11-09 23:18         ` Herbert Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Herbert Xu @ 2004-11-09 23:18 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: netdev, linux-net

On Tue, Nov 09, 2004 at 06:04:11PM -0500, Chad N. Tindel wrote:
> 
> You're saying that since pidentd is calling fgets(), that can actually result
> in multiple read() calls.  Because of this, the first half of the line

fgets() is implemented on top of a buffered read; with my glibc it reads
4096 bytes at a time.  Whatever that size is, the second read is most likely
going to start in the middle of a record.  There is no guarantee that this
partial record belongs to the same socket as the one cut off at the end of
the first read.
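To put a number on that, here is a small sketch of the arithmetic. The function name is invented for illustration, and the record lengths used below are hypothetical; the real line length depends on the kernel version.

```c
#include <stddef.h>

/*
 * Number of bytes of the last record that fit into a read of buf_size
 * bytes when every record is rec_len bytes long. 0 means the read ends
 * exactly on a record boundary; any other value means that record is
 * split and its tail has to come from the next read(2).
 */
size_t torn_record_bytes(size_t buf_size, size_t rec_len)
{
    return rec_len ? buf_size % rec_len : 0;
}
```

With a 4096-byte buffer and a hypothetical 150-byte record, 46 bytes of one record land in the first read and the remaining 104 bytes come from the second, by which time the table may have changed.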
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-09 21:06 ` Herbert Xu
  2004-11-09 22:43   ` Chad N. Tindel
@ 2004-11-18 19:02   ` Chad N. Tindel
  2004-11-18 21:03     ` Herbert Xu
  2004-11-19 14:27     ` Henrik Nordstrom
  1 sibling, 2 replies; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-18 19:02 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, linux-net

> /proc/net/tcp is an obsolete interface.  It is inherently unreliable
> in that a record may be read using two read(2) calls.  Those two calls
> may end up looking at two different records.
> 
> So please use the netlink interface or ss(8) from the iproute package.

OK, so just out of sheer morbid curiosity, I added an ioctl which will
accept 4 parameters (the address/port pairs), and will return the user id
associated with that socket.  I also changed pidentd to call this ioctl
instead of looking at /proc/net/tcp.  This should theoretically get rid
of all race conditions.

However, the problem still happens.  We see many instances of the kernel 
reporting the wrong user id.  What is even more interesting is that we added
a retry loop, and many times the problem goes away after re-trying.  Sometimes
the first retry gets the correct username, and sometimes it takes 4 or 5 
retries.  So there is definitely some sort of race condition going on here.
We have verified that when this problem occurs the process holding the
socket endpoint in question is still running, so it isn't some problem caused
by doing an identd lookup after the other end has gone away.

Does anybody have any idea why the userid associated with the socket's inode
might be changing mid-stream?  What are the chances of a defect where two
sockets are using the same inode memory or something like that?

Chad


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 19:02   ` Chad N. Tindel
@ 2004-11-18 21:03     ` Herbert Xu
  2004-11-18 21:27       ` Stephen Hemminger
  2004-11-19 21:01       ` Chad N. Tindel
  2004-11-19 14:27     ` Henrik Nordstrom
  1 sibling, 2 replies; 19+ messages in thread
From: Herbert Xu @ 2004-11-18 21:03 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: netdev, linux-net

On Thu, Nov 18, 2004 at 02:02:57PM -0500, Chad N. Tindel wrote:
> 
> OK, so just out of sheer morbid curiosity, I added an ioctl which will
> accept 4 parameters (the address/port pairs), and will return the user id
> associated with that socket.  I also changed pidentd to call this ioctl
> instead of looking at /proc/net/tcp.  This should theoretically get rid
> of all race conditions.

Please show us the code of your ioctl.

Have you tried netlink yet? Does it exhibit the same problem?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 21:03     ` Herbert Xu
@ 2004-11-18 21:27       ` Stephen Hemminger
  2004-11-18 22:07         ` Herbert Xu
  2004-11-19 21:01       ` Chad N. Tindel
  1 sibling, 1 reply; 19+ messages in thread
From: Stephen Hemminger @ 2004-11-18 21:27 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Chad N. Tindel, netdev, linux-net

On Fri, 19 Nov 2004 08:03:07 +1100
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Thu, Nov 18, 2004 at 02:02:57PM -0500, Chad N. Tindel wrote:
> > 
> > OK, so just out of sheer morbid curiosity, I added an ioctl which will
> > accept 4 parameters (the address/port pairs), and will return the user id
> > associated with that socket.  I also changed pidentd to call this ioctl
> > instead of looking at /proc/net/tcp.  This should theoretically get rid
> > of all race conditions.
> 
> Please show us the code of your ioctl.
> 
> Have you tried netlink yet? Does it exhibit the same problem?

It could also be that the sockets are shared between processes with
different uids, or that the real and effective uids are different, or even
that the uid is that of the original creator and the file was inherited
across exec.


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 21:27       ` Stephen Hemminger
@ 2004-11-18 22:07         ` Herbert Xu
  2004-11-18 22:16           ` David Stevens
  0 siblings, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2004-11-18 22:07 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Chad N. Tindel, netdev, linux-net

On Thu, Nov 18, 2004 at 01:27:00PM -0800, Stephen Hemminger wrote:
> 
> It could also be that the sockets are shared between processes with
> different uids, or that the real and effective uids are different, or even
> that the uid is that of the original creator and the file was inherited
> across exec.

You're right.  However, I think the original poster is getting
different results between calls on the same connection.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 22:07         ` Herbert Xu
@ 2004-11-18 22:16           ` David Stevens
  2004-11-18 23:40             ` Herbert Xu
  2004-11-18 23:49             ` Chad N. Tindel
  0 siblings, 2 replies; 19+ messages in thread
From: David Stevens @ 2004-11-18 22:16 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Chad N. Tindel, linux-net, netdev, netdev-bounce, Stephen Hemminger

Isn't the read itself atomic?

Assuming so, another solution would be to stat/fstat the file, add some to it
to account for growth, allocate a buffer that big and read the whole thing in
one shot. Then the results should be self-consistent.

                                                +-DLS


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 22:16           ` David Stevens
@ 2004-11-18 23:40             ` Herbert Xu
  2004-11-18 23:49             ` Chad N. Tindel
  1 sibling, 0 replies; 19+ messages in thread
From: Herbert Xu @ 2004-11-18 23:40 UTC (permalink / raw)
  To: David Stevens
  Cc: Chad N. Tindel, linux-net, netdev, netdev-bounce, Stephen Hemminger

On Thu, Nov 18, 2004 at 02:16:33PM -0800, David Stevens wrote:
> 
> Assuming so, another solution would be to stat/fstat the file, add some to it
> to account for growth, allocate a buffer that big and read the whole thing in
> one shot. Then the results should be self-consistent.

You can read at most one page at a time for a /proc file.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 22:16           ` David Stevens
  2004-11-18 23:40             ` Herbert Xu
@ 2004-11-18 23:49             ` Chad N. Tindel
  2004-11-18 23:56               ` David Stevens
  1 sibling, 1 reply; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-18 23:49 UTC (permalink / raw)
  To: David Stevens; +Cc: linux-net, netdev, netdev-bounce, Stephen Hemminger

> Assuming so, another solution would be to stat/fstat the file, add some to it
> to account for growth, allocate a buffer that big and read the whole thing in
> one shot. Then the results should be self-consistent.

Yeah, the kernel doesn't let you read more than 1 page of data.  That's how
big the block of memory that it passes to the proc handler is.

http://lxr.linux.no/source/fs/proc/generic.c#L61

IIRC, each line in /proc/net/tcp is 115 bytes or something like that, so if
a page is 4k, then you can only read 4096/115 = 35 sockets worth of info per 
read() call.

Chad


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 23:49             ` Chad N. Tindel
@ 2004-11-18 23:56               ` David Stevens
  2004-11-19  1:26                 ` Herbert Xu
  0 siblings, 1 reply; 19+ messages in thread
From: David Stevens @ 2004-11-18 23:56 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: linux-net, netdev, netdev-bounce, Stephen Hemminger

Hmm, well, for fixed-length records, at least you could make the buffer
(pagesize/recordsize) * recordsize and avoid getting partial records,
though you could still miss some or get duplicates. At least they would
be complete records rather than a mix of unrelated ones.
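A rough sketch of that record-aligned read follows. It assumes every line in the file, header included, has the same fixed length, which may not hold on every kernel; the function names are invented for illustration.

```c
#include <stddef.h>
#include <unistd.h>

/* Largest multiple of rec_len that fits in one page-limited read. */
size_t aligned_read_size(size_t page_size, size_t rec_len)
{
    if (rec_len == 0 || rec_len > page_size)
        return 0;
    return (page_size / rec_len) * rec_len;
}

/*
 * Read whole records from fd into buf, which must hold page_size bytes.
 * Returns the number of complete records read (0 at end of file), or -1
 * on error or if the read still ended in the middle of a record.
 */
ssize_t read_whole_records(int fd, char *buf, size_t page_size, size_t rec_len)
{
    size_t want = aligned_read_size(page_size, rec_len);
    ssize_t got;

    if (want == 0)
        return -1;
    got = read(fd, buf, want);
    if (got < 0 || (size_t)got % rec_len != 0)
        return -1;
    return got / (ssize_t)rec_len;
}
```

Using the 115-byte figure recalled earlier in the thread with a 4096-byte page, each read would ask for 4025 bytes, i.e. 35 records. As noted above, this only keeps each record intact; records can still be missed or duplicated between reads.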

Might be more intuitive semantics if the first read resulted in a complete
copy of the data at that point, fed back to the application in whatever
chunks it wants, but only if the buffering didn't eat lots of memory.

                                        +-DLS


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 23:56               ` David Stevens
@ 2004-11-19  1:26                 ` Herbert Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Herbert Xu @ 2004-11-19  1:26 UTC (permalink / raw)
  To: David Stevens; +Cc: chad, linux-net, netdev, netdev-bounce, shemminger

David Stevens <dlstevens@us.ibm.com> wrote:
> 
> Might be more intuitive semantics if the first read resulted in a complete
> copy of the data at that point, fed back to the application in whatever
> chunks it wants, but only if the buffering didn't eat lots of memory.

There is no point in fixing this.  We already have a working replacement
called tcp_diag.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 19:02   ` Chad N. Tindel
  2004-11-18 21:03     ` Herbert Xu
@ 2004-11-19 14:27     ` Henrik Nordstrom
  1 sibling, 0 replies; 19+ messages in thread
From: Henrik Nordstrom @ 2004-11-19 14:27 UTC (permalink / raw)
  To: Chad N. Tindel; +Cc: Herbert Xu, netdev, linux-net

On Thu, 18 Nov 2004, Chad N. Tindel wrote:

> However, the problem still happens.  We see many instances of the kernel
> reporting the wrong user id.

Have you acquired the proper read lock on the tcp connection hash?

Regards
Henrik


* Re: Wrong UIDs reported in /proc/net/tcp
  2004-11-18 21:03     ` Herbert Xu
  2004-11-18 21:27       ` Stephen Hemminger
@ 2004-11-19 21:01       ` Chad N. Tindel
  1 sibling, 0 replies; 19+ messages in thread
From: Chad N. Tindel @ 2004-11-19 21:01 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, linux-net

> > OK, so just out of sheer morbid curiosity, I added an ioctl which will
> > accept 4 parameters (the address/port pairs), and will return the user id
> > associated with that socket.  I also changed pidentd to call this ioctl
> > instead of looking at /proc/net/tcp.  This should theoretically get rid
> > of all race conditions.
> 
> Please show us the code of your ioctl.

Hi-

I found the problem... it was a bug in my pidentd changes where I wasn't 
properly handling an ioctl failure.  So, using an ioctl to do a direct hash
table lookup makes the userid mismatches go away.  We've been running tests for
12 hours without any failures.
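The actual pidentd change is not shown in this thread, so the following is only a sketch of the general pattern: a failed lookup must never be read as a uid. The lookup hook, its signature and the stand-in implementation are all hypothetical.

```c
#include <sys/types.h>

/* Hypothetical lookup hook: returns 0 and sets *uid on success, -1 on failure. */
typedef int (*uid_lookup_fn)(void *ctx, uid_t *uid);

/*
 * Call lookup up to max_tries times. On success returns 0 and stores the
 * uid; on failure returns -1 and leaves *uid untouched, so the caller
 * cannot mistake a failed call for a valid answer.
 */
int lookup_uid_checked(uid_lookup_fn lookup, void *ctx, int max_tries, uid_t *uid)
{
    uid_t tmp;
    int i;

    for (i = 0; i < max_tries; i++) {
        if (lookup(ctx, &tmp) == 0) {
            *uid = tmp;
            return 0;
        }
    }
    return -1;
}

/* Stand-in for a real lookup: fails while *ctx > 0, then reports uid 100. */
int flaky_lookup(void *ctx, uid_t *uid)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return -1;
    }
    *uid = 100;
    return 0;
}
```

With the stand-in set to fail twice, lookup_uid_checked(flaky_lookup, &failures, 5, &uid) succeeds on the third call; if every attempt fails, the caller gets -1 and its uid variable keeps whatever value it had before.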

> Have you tried netlink yet? Does it exhibit the same problem?

Only so many test systems to go around.  ;-)  Will start these tests tonight
and report back.

Chad
