* QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
@ 2012-07-18 14:42 Terry Phelps
  2012-07-18 15:06 ` richard -rw- weinberger
  0 siblings, 1 reply; 7+ messages in thread
From: Terry Phelps @ 2012-07-18 14:42 UTC (permalink / raw)
  To: linux-kernel

I have this strange recurring problem with SEVERAL machines, all
running the Oracle "Unbreakable Enterprise Kernel", which is based on
the 3.0.16 kernel.

Here is a quick description, while I still have your attention:

I have server S1, and two desktops, D1 and D2, separated by a router.
The D1 and D2 boxes are side by side, on the same IPv4 subnet,
different from S1's subnet. Maybe once a day, or oftener, I find that
D1 cannot ping S1, but D2 can. There are many possible causes for
that, of course, BUT:

I can SSH to S1 from D2, and S1 can ping both D1 and D2 just fine.
TCPDUMP shows that the ICMP request packets from D1 ARE arriving at
S1. S1 is simply not replying!

Even stranger:
From S1, I can traceroute to D2, but cannot traceroute to D1. A
traceroute to D1 gets an ENETDOWN returned from sendto(). But there is
only one NIC in the S1, and it certainly isn't down!
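The sendto() failure can be watched for without running a full traceroute. The following is a minimal Python sketch, not part of the original report: the function name probe_sendto is hypothetical, and it assumes that a plain UDP datagram to traceroute's default base port (33434) goes through the same route lookup as traceroute's probes. It returns the symbolic errno name (for example "ENETDOWN") or "ok".

```python
import errno
import socket


def probe_sendto(dest, port=33434):
    """Send one UDP datagram to dest and report how sendto() fared.

    Returns "ok" on success, otherwise the symbolic errno name
    (e.g. "ENETDOWN"), falling back to the raw number as a string.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendto(b"probe", (dest, port))
        except OSError as exc:
            # errno.errorcode maps numeric codes to names like "ENETDOWN".
            return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"
```

Run from S1 against both D1 and D2 at intervals, with the result logged alongside a timestamp, this would show when the error first appears for D1 and whether D2 is ever affected.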

One more thing: If I enter "ip route flush cache" on S1, the problem
clears up immediately.
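Because the flush destroys the evidence, it may help to record what the kernel thinks the route to D1 is before flushing. The following is a rough Python sketch under stated assumptions: snapshot_route_state is a hypothetical helper, it assumes the iproute2 "ip" command is on the PATH, and it assumes a kernel that still has an IPv4 routing cache (as 3.0-based kernels do). The "run" parameter exists only so the command runner can be swapped out.

```python
import subprocess


def snapshot_route_state(dest, run=subprocess.run):
    """Capture routing state for dest before the cache is flushed.

    Returns a dict mapping each command line to its combined output.
    """
    commands = [
        ["ip", "route", "get", dest],            # route the kernel would pick now
        ["ip", "-s", "route", "show", "cache"],  # cached entries with statistics
    ]
    snapshot = {}
    for cmd in commands:
        result = run(cmd, capture_output=True, text=True)
        snapshot[" ".join(cmd)] = result.stdout + result.stderr
    return snapshot
```

Saving the returned dict once while the problem is present and once after "ip route flush cache" gives two states to compare for the entry covering D1.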

And another: If I leave D1 pinging S1 every 5 seconds, say, the
problem will NEVER clear up by itself. But if D1 stops pinging S1 for
a few minutes, it works again!

No, there's no firewall or selinux running on any machine involved,
and no firewall between the boxes.

I'm totally confused. Can anyone suggest what to look at?


* Re: QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
  2012-07-18 14:42 QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason? Terry Phelps
@ 2012-07-18 15:06 ` richard -rw- weinberger
  2012-07-18 15:42   ` Terry Phelps
  0 siblings, 1 reply; 7+ messages in thread
From: richard -rw- weinberger @ 2012-07-18 15:06 UTC (permalink / raw)
  To: Terry Phelps; +Cc: linux-kernel

On Wed, Jul 18, 2012 at 4:42 PM, Terry Phelps <tgphelps50@gmail.com> wrote:
> I have this strange recurring problem with SEVERAL machines, all
> running the Oracle "Unbreakable Enterprise Kernel", which is based on
> the 3.0.16 kernel.

Ask Oracle.

-- 
Thanks,
//richard


* Re: QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
  2012-07-18 15:06 ` richard -rw- weinberger
@ 2012-07-18 15:42   ` Terry Phelps
  2012-07-18 16:03     ` Eric Dumazet
  2012-07-18 16:14     ` Alan Cox
  0 siblings, 2 replies; 7+ messages in thread
From: Terry Phelps @ 2012-07-18 15:42 UTC (permalink / raw)
  To: richard -rw- weinberger; +Cc: linux-kernel

On Wed, Jul 18, 2012 at 11:06 AM, richard -rw- weinberger
<richard.weinberger@gmail.com> wrote:
> On Wed, Jul 18, 2012 at 4:42 PM, Terry Phelps <tgphelps50@gmail.com> wrote:
>> I have this strange recurring problem with SEVERAL machines, all
>> running the Oracle "Unbreakable Enterprise Kernel", which is based on
>> the 3.0.16 kernel.
>
> Ask Oracle.

Is there some way to "ask Oracle", in which my question will get to
someone who can possibly help me? I've tried contacting "Oracle Linux
support". They're hopeless. I've posted my question to Oracle forums,
and even general Linux forums, but all I get is boilerplate networking
advice (check ifconfig, check your routes, check your switches, check
autonegotiation, etc.) that doesn't address the specific symptoms I'm
seeing.

If I should be posting my question to somewhere other than the
linux-kernel mailing list, please advise me where that might be. It
sure looks like something in the kernel isn't working properly
somewhere.


* Re: QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
  2012-07-18 15:42   ` Terry Phelps
@ 2012-07-18 16:03     ` Eric Dumazet
  2012-07-18 16:14     ` Alan Cox
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-07-18 16:03 UTC (permalink / raw)
  To: Terry Phelps; +Cc: richard -rw- weinberger, linux-kernel

On Wed, 2012-07-18 at 11:42 -0400, Terry Phelps wrote:
> On Wed, Jul 18, 2012 at 11:06 AM, richard -rw- weinberger
> <richard.weinberger@gmail.com> wrote:
> > On Wed, Jul 18, 2012 at 4:42 PM, Terry Phelps <tgphelps50@gmail.com> wrote:
> >> I have this strange recurring problem with SEVERAL machines, all
> >> running the Oracle "Unbreakable Enterprise Kernel", which is based on
> >> the 3.0.16 kernel.
> >
> > Ask Oracle.
> 
> Is there some way to "ask Oracle", in which my question will get to
> someone who can possibly help me? I've tried contacting "Oracle Linux
> support". They're hopeless. I've posted my question to Oracle forums,
> and even general Linux forums, but all I get is boilerplate networking
> advice (check ifconfig, check your routes, check your switches, check
> autonegotiation, etc.) that doesn't address the specific symptoms I'm
> seeing.
> 
> If I should be posting my question to somewhere other than the
> linux-kernel mailing list, please advise me where that might be. It
> sure looks like something in the kernel isn't working properly
> somewhere.

Before saying that, can you try a pristine Linux kernel?

For example linux-3.4.5, or even better the git tree, so that
it will be easier to add knobs and patches.

Obviously we don't know what Oracle added to their kernels, and
don't want to know.





* Re: QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
  2012-07-18 15:42   ` Terry Phelps
  2012-07-18 16:03     ` Eric Dumazet
@ 2012-07-18 16:14     ` Alan Cox
  2012-07-18 20:30       ` Terry Phelps
  1 sibling, 1 reply; 7+ messages in thread
From: Alan Cox @ 2012-07-18 16:14 UTC (permalink / raw)
  To: Terry Phelps; +Cc: richard -rw- weinberger, linux-kernel

> If I should be posting my question to somewhere other than the
> linux-kernel mailing list, please advise me where that might be. It
> sure looks like something in the kernel isn't working properly
> somewhere.

If you can duplicate the problem with an *upstream* kernel then yes it
is. If you only see it on an Oracle kernel then talk to Oracle.

Alan


* Re: QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
  2012-07-18 16:14     ` Alan Cox
@ 2012-07-18 20:30       ` Terry Phelps
  2012-07-18 20:37         ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Terry Phelps @ 2012-07-18 20:30 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

On Wed, Jul 18, 2012 at 12:14 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> If I should be posting my question to somewhere other than the
>> linux-kernel mailing list, please advise me where that might be. It
>> sure looks like something in the kernel isn't working properly
>> somewhere.
>
> If you can duplicate the problem with an *upstream* kernel then yes it
> is. If you only see it on an Oracle kernel then talk to Oracle.
>
> Alan

Alan,
Thank you for your help. I am not a kernel hacker by any means, but
will try to do what you suggest.
Should I clone the current kernel git tree, and build the latest
kernel? Or build from some other point in time?

Terry


* Re: QUESTION: Why might Linux suddenly stop replying to pings for no apparent reason?
  2012-07-18 20:30       ` Terry Phelps
@ 2012-07-18 20:37         ` Alan Cox
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Cox @ 2012-07-18 20:37 UTC (permalink / raw)
  To: Terry Phelps; +Cc: linux-kernel

On Wed, 18 Jul 2012 16:30:59 -0400
Terry Phelps <tgphelps50@gmail.com> wrote:

> On Wed, Jul 18, 2012 at 12:14 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >> If I should be posting my question to somewhere other than the
> >> linux-kernel mailing list, please advise me where that might be. It
> >> sure looks like something in the kernel isn't working properly
> >> somewhere.
> >
> > If you can duplicate the problem with an *upstream* kernel then yes it
> > is. If you only see it on an Oracle kernel then talk to Oracle.
> >
> > Alan
> 
> Alan,
> Thank you for your help. I am not a kernel hacker by any means, but
> will try to do what you suggest.
> Should I clone the current kernel git tree, and build the latest
> kernel? Or build from some other point in time?

I'd grab a modern release kernel - the current 3.4 version is probably
best. Building off the git tree tends to add other "interesting"
variables you don't need.

Alan
