From: Chris Paterson <Chris.Paterson2@renesas.com>
To: "Bhola, Bikram" <Bikram_Bhola@mentor.com>, Pavel Machek <pavel@denx.de>
Cc: "cip-dev@lists.cip-project.org" <cip-dev@lists.cip-project.org>, Jan Kiszka <jan.kiszka@siemens.com>
Subject: RE: Prompt timeouts on ipc227e board -- randomness related?
Date: Tue, 5 Oct 2021 13:41:08 +0000
Message-ID: <OSZPR01MB68770F80AE64400871800F6EB7AF9@OSZPR01MB6877.jpnprd01.prod.outlook.com>
In-Reply-To: <a34a0fcb84f449ef83e2843bec5a6d02@SVR-IES-MBX-03.mgc.mentorg.com>

Hello Bikram,

> From: Bhola, Bikram <Bikram_Bhola@mentor.com>
> Sent: 30 September 2021 12:19
>
> Hi Chris,
>
> We investigated the failure job and looks like before getting login prompt job
> timeout is happening. In the job definition file - job timeout is mentioned
> 15mins and sometimes due to slow network issue, it takes more time while
> downloading, untar and deploying image. So we are seeing timeout during
> login prompt or in some cases in earlier stages also. The work in progress to
> double up the network bandwidth within a few weeks, which will reduce the
> occurrence of this type of issues.

Thank you for your investigation.
I've increased the timeout as you suggested:
https://gitlab.com/cip-project/cip-testing/linux-cip-ci/-/merge_requests/49

One additional thing I've noticed: the default x86 character delay during
boot is 500ms, which seems a long time in between each character sent to the
platform:
https://lava.ciplatform.org/scheduler/device/x86-simatic-ipc227e-01/devicedict#defline5
Has a lower value for boot_character_delay ever been tried?

Kind regards, Chris

>
> Time being, with an increased job timeout to 20mins, failure is not observed.
> We tested 10 times to be working fine.
> Example :
> https://lava.ciplatform.org/scheduler/device/x86-simatic-ipc227e-01
>
>
> changes in the job definition file
> https://lava.ciplatform.org/scheduler/job/444336/definition
> Current implementation
> ------------------------
> timeouts:
>   job:
>     minutes: 15
>
> Need to Modify
> -----------------------------------
> timeouts:
>   job:
>     minutes: 20
>
>
> Regards,
> Bikram
>
> -----Original Message-----
> From: Chris Paterson <Chris.Paterson2@renesas.com>
> Sent: 28 September 2021 15:38
> To: Pavel Machek <pavel@denx.de>; Bhola, Bikram
> <Bikram_Bhola@mentor.com>
> Cc: cip-dev@lists.cip-project.org; Jan Kiszka <jan.kiszka@siemens.com>
> Subject: RE: Prompt timeouts on ipc227e board -- randomness related?
>
> Hello Pavel,
>
> > From: Pavel Machek <pavel@denx.de>
> > Sent: 25 September 2021 21:06
> >
> > Hi!
> >
> > It is not first time I see this failure:
>
> Thank you for reporting the issue.
>
> Bikram is going to take a look for us (thank you).
>
> Kind regards, Chris
>
> >
> > https://lava.ciplatform.org/scheduler/job/444336
> >
> > [  OK  ] Started Login Service.
> > [* ] (1 of 2) A start job is running for…ate sshd host keys (7s / no limit)
> > [** ] (1 of 2) A start job is running for…ate sshd host keys (8s / no limit)
> > [*** ] (1 of 2) A start job is running for…ate sshd host keys (9s / no limit)
> > [ *** ] (2 of 2) A start job is running for…evices-eth0.device (8s / 1min 30s)
> > [   19.855328] systemd[1]: apt-daily-upgrade.timer: Adding 3min 2.027476s random time.
> > [   19.864207] systemd[1]: apt-daily.timer: Adding 1h 54min 15.794344s random time.
> > [   21.406490] systemd[1]: apt-daily-upgrade.timer: Adding 55min 47.041488s random time.
> > [   21.415357] systemd[1]: apt-daily.timer: Adding 11h 48min 4.457495s random time.
> > [   22.049807] systemd[1]: apt-daily-upgrade.timer: Adding 3min 54.125406s random time.
> > [   22.058500] systemd[1]: apt-daily.timer: Adding 8h 34min 47.388595s random time.
> > [   22.511646] systemd[1]: apt-daily-upgrade.timer: Adding 25min 13.015405s random time.
> > [   22.520510] systemd[1]: apt-daily.timer: Adding 11h 58min 24.212170s random time.
> > [  OK  ] Started Regenerate sshd host keys.
> > wait for prompt timed out
> > end: 2.3.4.1 login-action (duration 00:00:24) [common]
> > case: login-action
> > case_id: 9417066
> > definition: lava
> > duration: 23.98
> >
> > Any idea what is going on there? Is it just a test problem, or do we
> > have kernel regression that only happens sometimes?
> >
> > Best regards,
> > Pavel
> > --
> > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
> > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
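To put a rough number on the boot_character_delay question raised in the reply above, here is a minimal back-of-the-envelope sketch. It assumes the delay is paid once for every character sent to the board, which is how the reply describes it. The function name and the example input strings are hypothetical and are not taken from the job definition or the device dictionary.

```python
def typing_time_seconds(text: str, delay_ms: int) -> float:
    """Total pause time when `text` is sent one character at a time.

    Assumption: one delay of `delay_ms` milliseconds per character sent.
    """
    return len(text) * delay_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    samples = ["root\n", "some-longer-boot-command --with-args\n"]
    # 500 ms is the value quoted in the thread; the others are arbitrary
    # lower values chosen for comparison.
    for delay_ms in (500, 100, 20):
        for text in samples:
            secs = typing_time_seconds(text, delay_ms)
            print(f"{delay_ms:>4} ms/char, {len(text):>3} chars: {secs:.2f} s")
```

Under that assumption the cost grows linearly with the length of whatever is typed, so any long string sent during boot takes a noticeable share of a stage that timed out after about 24 seconds. This only estimates the time spent pausing; it does not show that the delay caused the reported timeout.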