* Problems with kernel-2.2.19-6.2.7 from RH update for 6.2
@ 2001-08-19 18:01 Denis Perchine
  2001-08-20  2:11 ` Alexey Kuznetsov
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Perchine @ 2001-08-19 18:01 UTC
  To: linux-kernel

Hello,

I am seeing quite strange behavior with the kernel named in the subject.

socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 40
fcntl(40, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(40, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
setsockopt(40, SOL_SOCKET, SO_LINGER, [1], 8) = 0
connect(40, {sin_family=AF_INET, sin_port=htons(2030), sin_addr=inet_addr("127.0.0.1")}}, 16) = -1 EINPROGRESS (Operation now in progress)
select(41, NULL, [40], NULL, {180, 0})  = 1 (out [40], left {180, 0})
getsockopt(40, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
select(41, [40], NULL, NULL, {180, 0})  = 1 (in [40], left {175, 550000})
ioctl(4, FIONREAD, [0])                 = 0
select(41, [40], NULL, NULL, {180, 0})  = 1 (in [40], left {180, 0})
recv(4, 0x806aa28, 1, 0x4000)           = -1 EAGAIN (Resource temporarily unavailable)

As you can see, select() says the socket is writable after connect(), which
means the connection has completed. Later, before reading, we select() for
read and it reports the socket as ready, yet recv() fails with EAGAIN. This
repeats constantly: the program gets stuck in a loop, trying to connect and
failing.

Any ideas what this could be?

Maybe a comment from Alan, as an RH employee?

As a side note: the server is highly loaded. The program usually works well,
but once this happens, it repeats forever...

-- 
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------

* Re: Problems with kernel-2.2.19-6.2.7 from RH update for 6.2
  2001-08-19 18:01 Problems with kernel-2.2.19-6.2.7 from RH update for 6.2 Denis Perchine
@ 2001-08-20  2:11 ` Alexey Kuznetsov
  2001-08-20 18:52   ` Denis Perchine
  0 siblings, 1 reply; 6+ messages in thread
From: Alexey Kuznetsov @ 2001-08-20  2:11 UTC
  To: Denis Perchine; +Cc: linux-kernel

Hello!

> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 40
> fcntl(40, F_GETFL)                      = 0x2 (flags O_RDWR)
> fcntl(40, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
> setsockopt(40, SOL_SOCKET, SO_LINGER, [1], 8) = 0
> connect(40, {sin_family=AF_INET, sin_port=htons(2030), 
> sin_addr=inet_addr("127.0.0.1")}}, 16) = -1 EINPROGRESS (Operation now in 
> progress)
> select(41, NULL, [40], NULL, {180, 0})  = 1 (out [40], left {180, 0})
> getsockopt(40, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> select(41, [40], NULL, NULL, {180, 0})  = 1 (in [40], left {175, 550000})
> ioctl(4, FIONREAD, [0])                 = 0
> select(41, [40], NULL, NULL, {180, 0})  = 1 (in [40], left {180, 0})
> recv(4, 0x806aa28, 1, 0x4000)           = -1 EAGAIN (Resource temporarily 
> unavailable)
> 
> As you can see, select() says the socket is writable after connect(), which
> means the connection has completed. Later, before reading, we select() for
> read and it reports the socket as ready, yet recv() fails with EAGAIN. This
> repeats constantly: the program gets stuck in a loop, trying to connect and
> failing.
>
> Any ideas what this could be?

For example, this could be recv() on the wrong descriptor, which can be seen
in the strace above. :-)


BTW, why do you use the funny getsockopt() instead of the canonical
non-blocking connect? Does the standard way have some drawbacks, or is it
just a legitimate desire to "think different"? :-) The question is very
interesting: it is a big puzzle to me what motivates people to invent such
strange combinations of select/ioctl/getsockopt (e.g. qmail did another
bizarre thing: getpeername() in the place where you use getsockopt(), so the
strace looks like a schizophrenic dialogue with itself: "I am Bob!", ...
"Am I really Bob?" ... "Am I still Bob?" and so on for 3 minutes. :-))

And a second note: the whole sequence is equivalent to a plain blocking
connect, only with lots of overhead. In all OSes the standard connect timeout
is on the order of 2-4 minutes. Yes, Linux-2.2 is an unfortunate exception
(13 minutes), but the difference is purely quantitative, and on any
installation this should be changed to a smaller value via sysctl in any case.
There seems to be no reason to worry.

Alexey

* Re: Problems with kernel-2.2.19-6.2.7 from RH update for 6.2
  2001-08-20  2:11 ` Alexey Kuznetsov
@ 2001-08-20 18:52   ` Denis Perchine
  2001-08-22  0:53     ` Alexey Kuznetsov
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Perchine @ 2001-08-20 18:52 UTC
  To: Alexey Kuznetsov; +Cc: linux-kernel

On Monday 20 August 2001 09:11, you wrote:
> Hello!
>
> > socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 40
> > fcntl(40, F_GETFL)                      = 0x2 (flags O_RDWR)
> > fcntl(40, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
> > setsockopt(40, SOL_SOCKET, SO_LINGER, [1], 8) = 0
> > connect(40, {sin_family=AF_INET, sin_port=htons(2030),
> > sin_addr=inet_addr("127.0.0.1")}}, 16) = -1 EINPROGRESS (Operation now in
> > progress)
> > select(41, NULL, [40], NULL, {180, 0})  = 1 (out [40], left {180, 0})
> > getsockopt(40, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> > select(41, [40], NULL, NULL, {180, 0})  = 1 (in [40], left {175, 550000})
> > ioctl(4, FIONREAD, [0])                 = 0
> > select(41, [40], NULL, NULL, {180, 0})  = 1 (in [40], left {180, 0})
> > recv(4, 0x806aa28, 1, 0x4000)           = -1 EAGAIN (Resource temporarily
> > unavailable)
> >
> > As you can see, select() says the socket is writable after connect(),
> > which means the connection has completed. Later, before reading, we
> > select() for read and it reports the socket as ready, yet recv() fails
> > with EAGAIN. This repeats constantly: the program gets stuck in a loop,
> > trying to connect and failing.
> >
> > Any ideas what this could be?

> For example, this could be recv() on the wrong descriptor, which can be seen
> in the strace above. :-)

Oops... I should have slept for at least 9 hours before writing that. :-((
It looks like a bug.

> BTW, why do you use the funny getsockopt() instead of the canonical
> non-blocking connect?

Hmmm... If I knew what the canonical non-blocking connect is, I would use it,
I think...

Also, I use getsockopt() to get the error. I set non-blocking mode with fcntl().
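
For readers following along, here is a minimal C sketch of the approach described here: fcntl() to set O_NONBLOCK, select() for writability, then getsockopt(SO_ERROR) to fetch the result. The function name connect_select() and its parameters are hypothetical, not taken from the poster's program, and the sketch does not retry select() on EINTR. Note that it uses the same descriptor for every call.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: returns a connected
 * non-blocking descriptor, or -1 with errno set. */
int connect_select(const struct sockaddr_in *addr, int timeout_sec)
{
    struct timeval tv;
    fd_set wfds;
    socklen_t len;
    int fd, flags, n, err, saved;

    assert(addr != NULL);
    fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (fd >= FD_SETSIZE) {          /* select() cannot watch this fd */
        errno = EMFILE;
        goto fail;
    }
    flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    /* Start the handshake; EINPROGRESS means "still connecting". */
    if (connect(fd, (const struct sockaddr *)addr, sizeof *addr) == 0)
        return fd;
    if (errno != EINPROGRESS)
        goto fail;

    /* Wait until the socket becomes writable or the timeout expires. */
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    n = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (n == 0)
        errno = ETIMEDOUT;
    if (n <= 0)
        goto fail;

    /* Writable does not imply success: fetch the pending error. */
    len = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        goto fail;
    if (err != 0) {
        errno = err;
        goto fail;
    }
    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

A caller would pass a filled-in struct sockaddr_in and a timeout in seconds, then use the returned descriptor for all later select()/recv() calls.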

> Does the standard way have some drawbacks, or is it just a legitimate desire
> to "think different"? :-)

No. Just the usual lameness. I read a lot of FAQs, played with the code, and
this is the combination that works. Actually, thttpd also uses this (if I am
not mistaken).

> The question is very interesting: it is a big puzzle to me what motivates
> people to invent such strange combinations of select/ioctl/getsockopt (e.g.
> qmail did another bizarre thing: getpeername() in the place where you use
> getsockopt(), so the strace looks like a schizophrenic dialogue with itself:
> "I am Bob!", ... "Am I really Bob?" ... "Am I still Bob?" and so on for
> 3 minutes. :-))

Eeerrhhh... Let's not touch Bernstein, or he will refuse to answer my bug
reports if I ever switch from Postfix to QMail. He is a special guy. QMail is
the only software I cannot read like a book...

> And a second note: the whole sequence is equivalent to a plain blocking
> connect, only with lots of overhead. In all OSes the standard connect timeout
> is on the order of 2-4 minutes. Yes, Linux-2.2 is an unfortunate exception
> (13 minutes), but the difference is purely quantitative, and on any
> installation this should be changed to a smaller value via sysctl in any
> case.

The problem here is that I need to tune the timeout per connection, and
separately for connect and for read/write. If you could give me advice on how
to do this more effectively, I would be really glad.

Actually, the problem was in my app anyway. I have fixed it.

-- 
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------

* Re: Problems with kernel-2.2.19-6.2.7 from RH update for 6.2
  2001-08-20 18:52   ` Denis Perchine
@ 2001-08-22  0:53     ` Alexey Kuznetsov
  2001-08-23  4:32       ` Denis Perchine
  0 siblings, 1 reply; 6+ messages in thread
From: Alexey Kuznetsov @ 2001-08-22  0:53 UTC
  To: Denis Perchine; +Cc: linux-kernel

Hello!

> > BTW, why do you use the funny getsockopt() instead of the canonical
> > non-blocking connect?
>
> Hmmm... If I knew what the canonical non-blocking connect is, I would use it,
> I think...

connect() itself is used to complete an asynchronously started connect.

While the connection is not complete, connect() returns EALREADY.
When the connection is established, it succeeds. If the connection fails,
it returns an error (the same one you get with getsockopt()).
So the right way is to repeat connect() after poll() has returned POLLOUT;
it will either complete the connection or return an error to you.
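
A minimal C sketch of the pattern just described, under stated assumptions: the name connect_poll() and its parameters are hypothetical, and treating EISCONN from the second connect() as success is an addition for systems that report an already established connection that way (the message only says the call "succeeds").

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: returns a connected
 * non-blocking descriptor, or -1 with errno set. */
int connect_poll(const struct sockaddr_in *addr, int timeout_ms)
{
    const struct sockaddr *sa = (const struct sockaddr *)addr;
    struct pollfd pfd;
    int fd, flags, n, saved;

    assert(addr != NULL);
    fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    /* First call only starts the handshake. */
    if (connect(fd, sa, sizeof *addr) == 0)
        return fd;
    if (errno != EINPROGRESS)
        goto fail;

    /* Wait for POLLOUT (or an error condition) with a timeout. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    n = poll(&pfd, 1, timeout_ms);
    if (n == 0)
        errno = ETIMEDOUT;
    if (n <= 0)
        goto fail;

    /* Second call collects the result: success, or the connect error.
     * EISCONN is accepted as success (assumption, see lead-in). */
    if (connect(fd, sa, sizeof *addr) == 0 || errno == EISCONN)
        return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

Compared with the getsockopt(SO_ERROR) variant, the number of syscalls is the same; the difference is only in which call reports the outcome.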

Actually, this classic interface is very ugly. It seems to be the only place
where O_NONBLOCK is used not to do something without blocking, but to start
an asynchronous operation. And all these terrible unique error codes,
EINPROGRESS and EALREADY, suck. Thanks to the BSD people, who preferred ugly
hacks to developing some AIO interface. :-)
It is so ugly (it is the only place where the kernel has to maintain a history
of user syscalls in addition to the TCP state) that it is even offensive
that people do not use it; it means that it simply pollutes the kernel. :-)


> the combination that works. Actually, thttpd also uses this (if I am not
> mistaken).

Where? An httpd does not connect().

If they do this after accept(), it is really silly. A purely useless syscall.


> The problem here is that I need to tune the timeout per connection, and
> separately for connect and for read/write. If you could give me advice on how
> to do this more effectively, I would be really glad.

I see. If tuning is the goal, this is the right way. The number of syscalls
is the same as with alarm(), but the logic is cleaner.

For read/write, though, SO_RCVTIMEO/SO_SNDTIMEO is preferred.
Unfortunately, linux-2.2 seems to be the only OS that does not implement them.
[ I am not sure about Solaris, though. ]

In linux-2.4 they work for connect/accept too: SO_SNDTIMEO for
connect, SO_RCVTIMEO for accept.
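
A minimal C sketch of setting both options on a blocking socket, assuming a kernel that implements them. The helper name set_io_timeouts() and the millisecond parameters are hypothetical. Once set, a blocking recv() or send() that exceeds its timeout fails with EAGAIN/EWOULDBLOCK instead of waiting indefinitely.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper, not from the thread: applies receive and send
 * timeouts (in milliseconds) to a socket. Returns 0 on success, or -1
 * with errno set by setsockopt(). */
int set_io_timeouts(int fd, long recv_ms, long send_ms)
{
    struct timeval rcv, snd;

    assert(fd >= 0 && recv_ms >= 0 && send_ms >= 0);

    /* Split each millisecond value into seconds and microseconds. */
    rcv.tv_sec = recv_ms / 1000;
    rcv.tv_usec = (recv_ms % 1000) * 1000;
    snd.tv_sec = send_ms / 1000;
    snd.tv_usec = (send_ms % 1000) * 1000;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &rcv, sizeof rcv) < 0)
        return -1;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &snd, sizeof snd);
}
```

Per the message above, on linux-2.4 the send timeout would also bound a blocking connect() and the receive timeout a blocking accept(); a zero value means no timeout.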

Alexey


* Re: Problems with kernel-2.2.19-6.2.7 from RH update for 6.2
  2001-08-22  0:53     ` Alexey Kuznetsov
@ 2001-08-23  4:32       ` Denis Perchine
  2001-08-23 16:29         ` kuznet
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Perchine @ 2001-08-23  4:32 UTC
  To: Alexey Kuznetsov; +Cc: linux-kernel

> connect() itself is used to complete an asynchronously started connect.
>
> While the connection is not complete, connect() returns EALREADY.
> When the connection is established, it succeeds. If the connection fails,
> it returns an error (the same one you get with getsockopt()).
> So the right way is to repeat connect() after poll() has returned POLLOUT;
> it will either complete the connection or return an error to you.

:-))) One would never guess it works like this. And you are the first one
from whom I have heard this.

> Actually, this classic interface is very ugly. It seems to be the only place
> where O_NONBLOCK is used not to do something without blocking, but to start
> an asynchronous operation. And all these terrible unique error codes,
> EINPROGRESS and EALREADY, suck. Thanks to the BSD people, who preferred ugly
> hacks to developing some AIO interface. :-)
> It is so ugly (it is the only place where the kernel has to maintain a
> history of user syscalls in addition to the TCP state) that it is even
> offensive that people do not use it; it means that it simply pollutes the
> kernel. :-)

> > the combination that works. Actually, thttpd also uses this (if I am
> > not mistaken).
>
> Where? An httpd does not connect().

For read/write. Although the comparison is not really fair, as thttpd serves
more than one connection.

> If they do this after accept(), it is really silly. A purely useless syscall.
>
> > The problem here is that I need to tune the timeout per connection, and
> > separately for connect and for read/write. If you could give me advice on
> > how to do this more effectively, I would be really glad.
>
> I see. If tuning is the goal, this is the right way. The number of syscalls
> is the same as with alarm(), but the logic is cleaner.

The logic with alarms will not work in the multithreaded case.

> For read/write, though, SO_RCVTIMEO/SO_SNDTIMEO is preferred.
> Unfortunately, linux-2.2 seems to be the only OS that does not implement them.
> [ I am not sure about Solaris, though. ]
>
> In linux-2.4 they work for connect/accept too: SO_SNDTIMEO for
> connect, SO_RCVTIMEO for accept.

I assume that using SO_RCVTIMEO/SO_SNDTIMEO would be better in terms of
performance. Maybe it is worth upgrading to 2.4.x and rewriting the network
layer to use sync IP... I will think about it. How many times better would it
be (approximately), if we assume that I have lots of connections, each
transferring a small amount of data (1-2K)?

-- 
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------

* Re: Problems with kernel-2.2.19-6.2.7 from RH update for 6.2
  2001-08-23  4:32       ` Denis Perchine
@ 2001-08-23 16:29         ` kuznet
  0 siblings, 0 replies; 6+ messages in thread
From: kuznet @ 2001-08-23 16:29 UTC
  To: Denis Perchine; +Cc: linux-kernel

Hello!

> > Where? An httpd does not connect().
>
> For read/write. Although the comparison is not really fair, as thttpd serves
> more than one connection.

It is even worse: a useless operation in the data path. read/write will return
the error in any case if the connection has died.


> > I see. If tuning is the goal, this is the right way. The number of
> > syscalls is the same as with alarm(), but the logic is cleaner.
>
> The logic with alarms will not work in the multithreaded case.

I meant _your_ logic is cleaner. :-)


> I assume that using SO_RCVTIMEO/SO_SNDTIMEO would be better in terms of
> performance.

Not very much. But the code becomes simpler.

select() is sometimes better, e.g. when the program uses signals
(and glibc uses signals _internally_ when multithreaded, breaking lots
of things, did you know this? :-)). In this case you need to recalculate the
remaining time to restart poll/read/write; Linux select() returns it.

> layer to use sync IP... 

What is "sync"?


> be (approximately), if we assume that I have lots of connections, each
> transferring a small amount of data (1-2K)?

It depends. The advantage of read/write with SO_*TIMEO is that in 100% of the
cases where data arrives in time, or where you can send immediately, you are
already in the right place and do not waste cache and cycles exiting from
select() and entering read/write. Also, select() is pretty suboptimal.

Alexey
