* Network Issue
@ 2016-09-08  1:49 Hardik A Gohil (WMSC-HW)
  2016-09-08  6:19 ` Uwe Kleine-König
  0 siblings, 1 reply; 6+ messages in thread
From: Hardik A Gohil (WMSC-HW) @ 2016-09-08  1:49 UTC (permalink / raw)
  To: linux-rt-users

Hello,

I am working on Linux 3.2.0.

We need real-time behaviour, so we applied the RT-PREEMPT patch
(patch-3.2-rt10.patch.bz2).

We see the network hang while running a periodic task with a 5 ms period.

After the task has been running for some time, ping no longer works.

I am using the timerfd interface to implement the periodic task.

I would like to know whether this is a known issue with this patch.

Any help will be appreciated.


* Re: Network Issue
  2016-09-08  1:49 Network Issue Hardik A Gohil (WMSC-HW)
@ 2016-09-08  6:19 ` Uwe Kleine-König
  2016-09-14  7:51   ` Hardik A Gohil (WMSC-HW)
  0 siblings, 1 reply; 6+ messages in thread
From: Uwe Kleine-König @ 2016-09-08  6:19 UTC (permalink / raw)
  To: Hardik A Gohil (WMSC-HW); +Cc: linux-rt-users

Hello,

On Thu, Sep 08, 2016 at 09:49:14AM +0800, Hardik A Gohil (WMSC-HW) wrote:
> I am working on Linux 3.2.0.

Wow, you're still developing on a kernel that is 4.5 years old and
you're not even taking the stable updates.

> We need real-time behaviour, so we applied the RT-PREEMPT patch
> (patch-3.2-rt10.patch.bz2).

Even if you stick to 3.2.x, there is 3.2.82-rt118.

> We see the network hang while running a periodic task with a 5 ms period.
> 
> After the task has been running for some time, ping no longer works.
> 
> I am using the timerfd interface to implement the periodic task.
> 
> I would like to know whether this is a known issue with this patch.
> 
> Any help will be appreciated.

I bet you won't find help here until you reproduce the issue with a
recent kernel (currently there is v4.6.7-rt11) and show the code of your
periodic task.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* Re: Network Issue
  2016-09-08  6:19 ` Uwe Kleine-König
@ 2016-09-14  7:51   ` Hardik A Gohil (WMSC-HW)
  2016-09-14 11:26     ` Uwe Kleine-König
  0 siblings, 1 reply; 6+ messages in thread
From: Hardik A Gohil (WMSC-HW) @ 2016-09-14  7:51 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 1727 bytes --]

On Thu, Sep 8, 2016 at 2:19 PM, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
> Hello,
>
> On Thu, Sep 08, 2016 at 09:49:14AM +0800, Hardik A Gohil (WMSC-HW) wrote:
>> I am working on Linux 3.2.0.
>
> Wow, you're still developing on a kernel that is 4.5 years old and
> you're not even taking the stable updates.

We did our development with phyCORE-AM335x-PD13.1.2, which uses the 3.2 kernel.
We have already completed production, and it won't be an easy task to
move to the 4.x series.
>
>> We need real-time behaviour, so we applied the RT-PREEMPT patch
>> (patch-3.2-rt10.patch.bz2).
>
> Even if you stick to 3.2.x, there is 3.2.82-rt118.
>
>> We see the network hang while running a periodic task with a 5 ms period.
>>
>> After the task has been running for some time, ping no longer works.
>>
>> I am using the timerfd interface to implement the periodic task.
>>
>> I would like to know whether this is a known issue with this patch.
>>
>> Any help will be appreciated.
>
> I bet you won't find help here until you reproduce the issue with a
> recent kernel (currently there is v4.6.7-rt11) and show the code of your
> periodic task.

When I use iperf to transfer data over the network, the same network
hang happens and ping no longer works.

The code for the RT task is attached.
>
> Best regards
> Uwe
>
> --
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |



-- 
-- 
Regards,
Hardik A Gohil


WILLOWGLEN MSC BERHAD (462648-V)
 NO 17 JALAN 2/149B, TAMAN SRI ENDAH,
 BANDAR BARU SRI PETALING,
 57000 KUALA LUMPUR, MALAYSIA
 TEL +603 90571228  FAX +603 90571218
  WILLOWGLEN.COM.MY

[-- Attachment #2: rttask.c --]
[-- Type: text/x-csrc, Size: 1835 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <sched.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/timerfd.h>
#include <string.h>
#include <signal.h>

#define MY_PRIORITY (99)	/* highest SCHED_FIFO priority */

struct periodic_info
{
	int timer_fd;
	unsigned long long wakeups_missed;
};

/* Arm a periodic timerfd; "period" is given in microseconds */
static int make_periodic (unsigned int period, struct periodic_info *info)
{
	int ret;
	unsigned int ns;
	unsigned int sec;
	int fd;
	struct itimerspec itval;

	/* Create the timer */
	fd = timerfd_create (CLOCK_MONOTONIC, 0);
	info->wakeups_missed = 0;
	info->timer_fd = fd;
	if (fd == -1)
		return fd;

	/* Make the timer periodic */
	sec = period/1000000;
	ns = (period - (sec * 1000000)) * 1000;
	itval.it_interval.tv_sec = sec;
	itval.it_interval.tv_nsec = ns;
	itval.it_value.tv_sec = sec;
	itval.it_value.tv_nsec = ns;
	ret = timerfd_settime (fd, 0, &itval, NULL);
	return ret;
}

static void wait_period (struct periodic_info *info)
{
	uint64_t missed;
	ssize_t ret;

	/* Wait for the next timer event. The number of expirations since
	   the last read is written to "missed" */
	ret = read (info->timer_fd, &missed, sizeof (missed));
	if (ret == -1)
	{
		perror ("read timer");
		return;
	}

	/* "missed" should always be >= 1, but just to be sure, check it is not 0 anyway */
	if (missed > 0)
		info->wakeups_missed += (missed - 1);
}

int main(void)
{
	struct periodic_info info;
	struct sched_param param;
	struct timeval stop, start;
	long elapsed_us;

	param.sched_priority = MY_PRIORITY;
	if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
		perror("sched_setscheduler failed");
		exit(EXIT_FAILURE);
	}
	/* The period is given in microseconds: 5000 us = 5 ms */
	if (make_periodic(5000, &info) == -1) {
		perror("make_periodic failed");
		exit(EXIT_FAILURE);
	}
	while (1)
	{
		/* Do useful work */
		printf("This is real time task ");
		gettimeofday(&start, NULL);
		wait_period (&info);
		gettimeofday(&stop, NULL);
		/* Include the seconds so the result stays correct when tv_usec wraps */
		elapsed_us = (stop.tv_sec - start.tv_sec) * 1000000L
			   + (stop.tv_usec - start.tv_usec);
		printf("took %ld us\n", elapsed_us);
	}
	return 0;
}
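
The split of the period into seconds and nanoseconds done in make_periodic can be checked in isolation. The following is a minimal sketch, assuming the period is given in microseconds as in the attached code; the helper name period_us_to_timespec is hypothetical and not part of the attachment.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper (not in the attached rttask.c): convert a period
   given in microseconds into the sec/nsec pair that struct itimerspec
   expects, using the same arithmetic as make_periodic. */
static struct timespec period_us_to_timespec (unsigned int period_us)
{
	struct timespec ts;

	ts.tv_sec = period_us / 1000000;
	ts.tv_nsec = (long) (period_us % 1000000) * 1000;
	/* tv_nsec must stay below one second to be a valid timespec */
	assert (ts.tv_nsec < 1000000000L);
	return ts;
}
```

For the value 5000 passed in main, this gives 0 s and 5000000 ns, i.e. a 5 ms period.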


* Re: Network Issue
  2016-09-14  7:51   ` Hardik A Gohil (WMSC-HW)
@ 2016-09-14 11:26     ` Uwe Kleine-König
  2016-09-23  2:55       ` Hardik A Gohil (WMSC-HW)
  0 siblings, 1 reply; 6+ messages in thread
From: Uwe Kleine-König @ 2016-09-14 11:26 UTC (permalink / raw)
  To: Hardik A Gohil (WMSC-HW); +Cc: linux-rt-users

On Wed, Sep 14, 2016 at 03:51:28PM +0800, Hardik A Gohil (WMSC-HW) wrote:
> On Thu, Sep 8, 2016 at 2:19 PM, Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > Hello,
> >
> > On Thu, Sep 08, 2016 at 09:49:14AM +0800, Hardik A Gohil (WMSC-HW) wrote:
> >> I am working on Linux 3.2.0.
> >
> > Wow, you're still developing on a kernel that is 4.5 years old and
> > you're not even taking the stable updates.
> 
> since we did development with phyCORE-AM335x-PD13.1.2 which is using 3.2 kernel.
> we have already completed production and it wont be a easy task to
> shift to 4. series

Consider it a development step to update to a newer kernel. This allows
you to test whether the problem exists there, too, fix it, and then
backport the fix. Note that none of this forces you to upgrade your
machines in the field.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* Re: Network Issue
  2016-09-14 11:26     ` Uwe Kleine-König
@ 2016-09-23  2:55       ` Hardik A Gohil (WMSC-HW)
  0 siblings, 0 replies; 6+ messages in thread
From: Hardik A Gohil (WMSC-HW) @ 2016-09-23  2:55 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: linux-rt-users

Hello,

I have found that the problem is in the driver.

I am debugging by adding print statements.

In normal transmission, dev_queue_xmit (which queues a buffer for
transmission to the network device) is called, followed by
cpsw_tx_handler (the transmit handler).

When the network hangs, the transmit handler is no longer called after a
buffer has been queued.

Please help me to solve the issue.
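
The observation above (a buffer is queued via dev_queue_xmit, but cpsw_tx_handler never runs for it) could also be tracked with two counters instead of reading print output. The following is a userspace sketch of that bookkeeping only; struct tx_stats and its functions are hypothetical names and are not part of the cpsw driver or the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping, not kernel code: one counter is bumped where
   a packet is queued for transmission, the other where the transmit
   handler completes it. */
struct tx_stats {
	unsigned long queued;
	unsigned long completed;
	unsigned long completed_at_last_check;
};

static void tx_note_queued (struct tx_stats *s)
{
	assert (s != NULL);
	s->queued++;
}

static void tx_note_completed (struct tx_stats *s)
{
	assert (s != NULL);
	s->completed++;
}

/* Call periodically. Returns true when packets are outstanding but no
   completion has been seen since the previous call, which matches the
   "queued, but transmit handler never called" symptom. */
static bool tx_looks_stalled (struct tx_stats *s)
{
	bool stalled;

	assert (s != NULL);
	stalled = s->queued != s->completed
		&& s->completed == s->completed_at_last_check;
	s->completed_at_last_check = s->completed;
	return stalled;
}
```

The check interval is assumed to be long compared to a normal transmit completion; otherwise a packet that was queued just before the check would be reported as stalled.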




-- 
-- 
Regards,
Hardik A Gohil




* Network issue
@ 2017-08-04  7:25 Herve Kergourlay
  0 siblings, 0 replies; 6+ messages in thread
From: Herve Kergourlay @ 2017-08-04  7:25 UTC (permalink / raw)
  To: linux-mm

[-- Attachment #1: Type: text/plain, Size: 2012 bytes --]

Hi

We have a customer suffering from network issues on a server that is part of an Oracle RAC cluster (https://en.wikipedia.org/wiki/Oracle_RAC).

The symptom is the following:

A call to getaddrinfo on a server name fails after some time in production, with errno=11 "Resource temporarily unavailable".

The customer has a second cluster node under the same conditions that does not show the issue.

When the issue occurs, we see the following sequence in the system logs:
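
Since getaddrinfo reports failures through its return value, and errno is only meaningful when that value is EAI_SYSTEM, logging both may help narrow down a report like this one. The following is a minimal sketch under that assumption; the helper name resolve_and_report is hypothetical and the host names used with it are placeholders.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve "name" and log why resolution failed.
   Returns the getaddrinfo() result code (0 on success). */
static int resolve_and_report (const char *name, int flags)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	int rc;
	int saved_errno;

	assert (name != NULL);
	memset (&hints, 0, sizeof (hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = flags;

	rc = getaddrinfo (name, NULL, &hints, &res);
	saved_errno = errno;	/* keep it before stdio can overwrite it */
	if (rc == 0) {
		freeaddrinfo (res);
		return 0;
	}

	fprintf (stderr, "getaddrinfo(%s): %s\n", name, gai_strerror (rc));
	if (rc == EAI_SYSTEM)
		fprintf (stderr, "  errno=%d (%s)\n", saved_errno,
			 strerror (saved_errno));
	else if (rc == EAI_AGAIN)
		fprintf (stderr, "  temporary failure, a retry may succeed\n");
	return rc;
}
```

Calling resolve_and_report with the affected server name and flags set to 0 on both cluster nodes would show whether the failing node returns EAI_AGAIN or EAI_SYSTEM.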


Jan 24 14:22:06 raca-srv2 kernel: device eth0 entered promiscuous mode
Jan 24 14:22:06 raca-srv2 kernel: device eth2 entered promiscuous mode
Jan 24 14:22:06 raca-srv2 kernel: device eth9 entered promiscuous mode
Jan 24 14:22:06 raca-srv2 kernel: device eth8 entered promiscuous mode
Jan 24 14:22:06 raca-srv2 kernel: device eth1 entered promiscuous mode
Jan 24 14:22:06 raca-srv2 kernel: device eth1 left promiscuous mode
Jan 24 14:22:08 raca-srv2 kernel: device eth2 left promiscuous mode
Jan 24 14:22:29 raca-srv2 kernel: device eth8 left promiscuous mode
Jan 24 14:22:35 raca-srv2 kernel: device eth9 left promiscuous mode
Jan 24 14:22:39 raca-srv2 kernel: nr_pdflush_threads exported in /proc is scheduled for removal
Jan 24 14:22:39 raca-srv2 kernel: sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case. If you have one, please send an email to linux-mm@kvack.org.
Jan 24 14:22:45 raca-srv2 kernel: device eth0 left promiscuous mode
Jan 24 14:26:46 raca-srv2 kernel: bnx2x 0000:88:00.0 eth8: NIC Link is Down
Jan 24 14:26:46 raca-srv2 kernel: bonding: bond0: link status definitely down for interface eth8, disabling it
Jan 24 14:26:46 raca-srv2 kernel: bonding: bond0: making interface eth0 the new active one.


As your email address is explicitly mentioned in the log, I am sending you this email. I hope you will be able to help us understand what is happening.

If you have questions about the context, I will be pleased to provide any useful information you need.

Regards
Hervé



[-- Attachment #2: Type: text/html, Size: 6641 bytes --]


end of thread, other threads:[~2017-08-04  7:25 UTC | newest]

Thread overview: 6+ messages
2016-09-08  1:49 Network Issue Hardik A Gohil (WMSC-HW)
2016-09-08  6:19 ` Uwe Kleine-König
2016-09-14  7:51   ` Hardik A Gohil (WMSC-HW)
2016-09-14 11:26     ` Uwe Kleine-König
2016-09-23  2:55       ` Hardik A Gohil (WMSC-HW)
2017-08-04  7:25 Network issue Herve Kergourlay
