linux-kernel.vger.kernel.org archive mirror
* [PATCH] ksefltest: pidfd: Fix wait_states: Test terminated by timeout
@ 2022-07-18  2:58 lizhijian
  2022-07-18  3:08 ` lizhijian
  2022-07-19  8:32 ` Christian Brauner
  0 siblings, 2 replies; 3+ messages in thread
From: lizhijian @ 2022-07-18  2:58 UTC (permalink / raw)
  To: Christian Brauner, Shuah Khan
  Cc: linux-kernel, linux-kselftest, lizhijian, Philip Li, kernel test robot

0Day/LKP observed that the kselftest run blocks forever because one of the
pidfd_wait tests fails to terminate in 1 of 30 runs. After digging into
the source, we found that it blocks at:
ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0);

wait_states has the following test flow:
  CHILD                 PARENT
  ---------------+--------------
1 STOP itself
2                   WAIT for CHILD STOPPED
3                   SIGNAL CHILD to CONT
4 CONT
5 STOP itself
5'                  WAIT for CHILD CONT
6                   WAIT for CHILD STOPPED

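The problem is that the kernel does not guarantee the order of 5 and 5'. If
5 goes first, the child is already stopped again by the time the parent waits
for WCONTINUED, so the wait blocks and the test times out.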

We can reproduce it with:
$ while true; do make run_tests -C pidfd; done

Introduce a blocking read in the child process so that the parent can
check for WCONTINUED before the child stops itself again.

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
I had almost forgotten this patch, since the previous version was posted over
6 months ago. This time I just rebased it and updated the comments.
V2: rewrite with pipe to avoid usleep
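
For anyone who wants to try the pipe handshake outside the selftest harness,
here is a minimal standalone sketch of the same idea. It is not the selftest
code: it assumes plain fork() and waitid(P_PID, ...) instead of clone3() and
P_PIDFD, and the names wait_for() and demo_wait_states() are made up for
illustration. The step numbers in the comments refer to the flow above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for one state change and check its si_code. */
static int wait_for(pid_t pid, int options, int expected_code)
{
	siginfo_t info = { .si_signo = 0 };

	if (waitid(P_PID, (id_t)pid, &info, options) != 0)
		return -1;
	if (info.si_code != expected_code)
		return -1;
	if (expected_code == CLD_EXITED && info.si_status != EXIT_SUCCESS)
		return -1;
	return 0;
}

/* Returns 0 if the parent saw STOPPED, CONTINUED, STOPPED, EXITED in order. */
int demo_wait_states(void)
{
	int pfd[2];
	int ret = -1;
	pid_t pid;

	if (pipe(pfd) != 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}

	if (pid == 0) {
		char buf;

		close(pfd[1]);
		kill(getpid(), SIGSTOP);		/* step 1 */
		/* Block here until the parent has seen CLD_CONTINUED. */
		if (read(pfd[0], &buf, 1) != 1)
			_exit(EXIT_FAILURE);
		close(pfd[0]);
		kill(getpid(), SIGSTOP);		/* step 5 */
		_exit(EXIT_SUCCESS);
	}

	close(pfd[0]);

	if (wait_for(pid, WSTOPPED, CLD_STOPPED) == 0 &&	/* step 2 */
	    kill(pid, SIGCONT) == 0 &&				/* step 3 */
	    wait_for(pid, WCONTINUED, CLD_CONTINUED) == 0 &&	/* step 5' */
	    write(pfd[1], "C", 1) == 1 &&			/* release the child */
	    wait_for(pid, WSTOPPED, CLD_STOPPED) == 0 &&	/* step 6 */
	    kill(pid, SIGCONT) == 0 &&				/* let the child exit */
	    wait_for(pid, WEXITED, CLD_EXITED) == 0)
		ret = 0;

	close(pfd[1]);
	if (ret != 0) {
		/* Best-effort cleanup of a child that may still be stopped. */
		kill(pid, SIGKILL);
		waitid(P_PID, (id_t)pid, &(siginfo_t){ 0 }, WEXITED);
	}
	return ret;
}
```

Calling demo_wait_states() in a loop from a small main() should return 0 on
every iteration. If the read()/write() pair is dropped, the child can reach
its second SIGSTOP before the parent's WCONTINUED wait, which is the race
described above.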
---
 tools/testing/selftests/pidfd/pidfd_wait.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/testing/selftests/pidfd/pidfd_wait.c b/tools/testing/selftests/pidfd/pidfd_wait.c
index 070c1c876df1..3f7bc6517dea 100644
--- a/tools/testing/selftests/pidfd/pidfd_wait.c
+++ b/tools/testing/selftests/pidfd/pidfd_wait.c
@@ -95,20 +95,27 @@ TEST(wait_states)
 		.flags = CLONE_PIDFD | CLONE_PARENT_SETTID,
 		.exit_signal = SIGCHLD,
 	};
+	int ret, pfd[2];
 	pid_t pid;
 	siginfo_t info = {
 		.si_signo = 0,
 	};
 
+	ASSERT_EQ(pipe(pfd), 0);
 	pid = sys_clone3(&args);
 	ASSERT_GE(pid, 0);
 
 	if (pid == 0) {
+		char buf[2];
+		close(pfd[1]);
 		kill(getpid(), SIGSTOP);
+		ASSERT_EQ(read(pfd[0], buf, 1), 1);
+		close(pfd[0]);
 		kill(getpid(), SIGSTOP);
 		exit(EXIT_SUCCESS);
 	}
 
+	close(pfd[0]);
 	ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WSTOPPED, NULL), 0);
 	ASSERT_EQ(info.si_signo, SIGCHLD);
 	ASSERT_EQ(info.si_code, CLD_STOPPED);
@@ -117,6 +124,8 @@ TEST(wait_states)
 	ASSERT_EQ(sys_pidfd_send_signal(pidfd, SIGCONT, NULL, 0), 0);
 
 	ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0);
+	ASSERT_EQ(write(pfd[1], "C", 1), 1);
+	close(pfd[1]);
 	ASSERT_EQ(info.si_signo, SIGCHLD);
 	ASSERT_EQ(info.si_code, CLD_CONTINUED);
 	ASSERT_EQ(info.si_pid, parent_tid);
-- 
2.36.0


* Re: [PATCH] ksefltest: pidfd: Fix wait_states: Test terminated by timeout
  2022-07-18  2:58 [PATCH] ksefltest: pidfd: Fix wait_states: Test terminated by timeout lizhijian
@ 2022-07-18  3:08 ` lizhijian
  2022-07-19  8:32 ` Christian Brauner
  1 sibling, 0 replies; 3+ messages in thread
From: lizhijian @ 2022-07-18  3:08 UTC (permalink / raw)
  To: Christian Brauner, Shuah Khan
  Cc: linux-kernel, linux-kselftest, Philip Li, kernel test robot



On 18/07/2022 10:58, Li, Zhijian/李 智坚 wrote:
> 0Day/LKP observed that the kselftest blocks forever since one of the
> pidfd_wait doesn't terminate in 1 of 30 runs. After digging into
> the source, we found that it blocks at:
> ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0);
>
> wait_states has below testing flow:
>    CHILD                 PARENT
>    ---------------+--------------
> 1 STOP itself
> 2                   WAIT for CHILD STOPPED
> 3                   SIGNAL CHILD to CONT
> 4 CONT
> 5 STOP itself
> 5'                  WAIT for CHILD CONT
> 6                   WAIT for CHILD STOPPED
>
> The problem is that the kernel cannot ensure the order of 5 and 5', once
> 5's goes first, the test will fail.
Correction:
s/once 5's goes first/once 5 goes first/




* Re: [PATCH] ksefltest: pidfd: Fix wait_states: Test terminated by timeout
  2022-07-18  2:58 [PATCH] ksefltest: pidfd: Fix wait_states: Test terminated by timeout lizhijian
  2022-07-18  3:08 ` lizhijian
@ 2022-07-19  8:32 ` Christian Brauner
  1 sibling, 0 replies; 3+ messages in thread
From: Christian Brauner @ 2022-07-19  8:32 UTC (permalink / raw)
  To: lizhijian
  Cc: Shuah Khan, linux-kernel, linux-kselftest, Philip Li, kernel test robot

On Mon, Jul 18, 2022 at 02:58:39AM +0000, lizhijian@fujitsu.com wrote:
> 0Day/LKP observed that the kselftest blocks forever since one of the
> pidfd_wait doesn't terminate in 1 of 30 runs. After digging into
> the source, we found that it blocks at:
> ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0);
> 
> wait_states has below testing flow:
>   CHILD                 PARENT
>   ---------------+--------------
> 1 STOP itself
> 2                   WAIT for CHILD STOPPED
> 3                   SIGNAL CHILD to CONT
> 4 CONT
> 5 STOP itself
> 5'                  WAIT for CHILD CONT
> 6                   WAIT for CHILD STOPPED
> 
> The problem is that the kernel cannot ensure the order of 5 and 5', once
> 5's goes first, the test will fail.
> 
> we can reproduce it by:
> $ while true; do make run_tests -C pidfd; done
> 
> Introduce a blocking read in child process to make sure the parent can
> check its WCONTINUED.
> 
> CC: Philip Li <philip.li@intel.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> I have almost forgotten this patch since the former version post over 6 months
> ago. This time I just do a rebase and update the comments.
> V2: rewrite with pipe to avoid usleep
> ---

Thanks for sticking with this!
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>

