Message-ID: <1507650370.10046.41.camel@surriel.com>
Subject: Re: [PATCH v4 1/2] pid: Replace pid bitmap implementation with IDR API
From: Rik van Riel
To: Gargi Sharma, Oleg Nesterov
Cc: Andrew Morton, linux-kernel@vger.kernel.org, Julia Lawall,
    mingo@kernel.org, pasha.tatashin@oracle.com, ktkhai@virtuozzo.com,
    "Eric W. Biederman", Christoph Hellwig
Date: Tue, 10 Oct 2017 11:46:10 -0400
References: <1507583624-22146-1-git-send-email-gs051095@gmail.com>
    <1507583624-22146-2-git-send-email-gs051095@gmail.com>
    <20171009161737.ea8c62441cc12dfd909ee0b2@linux-foundation.org>
    <20171010115034.GA28545@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2017-10-10 at 13:35 +0100, Gargi Sharma wrote:
> On Tue, Oct 10, 2017 at 12:50 PM, Oleg Nesterov wrote:
> > On 10/09, Andrew Morton wrote:
> > >
> > > > @@ -240,17 +230,11 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
> > > >      *
> > > >      */
> > > >     read_lock(&tasklist_lock);
> > > > -   nr = next_pidmap(pid_ns, 1);
> > > > -   while (nr > 0) {
> > > > -           rcu_read_lock();
> > > > -
> > > > -           task = pid_task(find_vpid(nr), PIDTYPE_PID);
> > > > +   nr = 2;
> > > > +   idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
> > > > +           task = pid_task(pid, PIDTYPE_PID);
> > > >             if (task && !__fatal_signal_pending(task))
> > > >                     send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
> > > > -
> > > > -           rcu_read_unlock();
> > > > -
> > > > -           nr = next_pidmap(pid_ns, nr);
> > > >     }
> > > >     read_unlock(&tasklist_lock);
> > >
> > > Especially here.  I don't think pidmap_lock is held.  Is that IDR
> > > iteration safe?
> >
> > Yes, this doesn't look right, we need rcu_read_lock() or
> > pidmap_lock.
> >
> > And, we also need rcu_read_lock() for another reason, to protect
> > "struct pid".
>
> Ah, I missed this. From what I understood idr_for_each_entry_continue
> should be safe because calls idr_get_next which in turn calls
> radix_tree_iter_find to find the next populated entry in the idr. If
> the pid that you are looking up the task for is deleted, task will
> get a NULL from pid_task and no signal to kill will be sent.
> >
> > Gargi, I suggested to use idr_for_each_entry_continue(), but now I
> > am wondering if we should use idr_for_each() instead. IIUC this
> > would be a bit faster? Not that I think this is really important...
>
> I can run benchmarks with idr_for_each to see how much speed up is
> achieved and then we can go with whatever we think is better. How
> does that sounds?

I suspect this code will not be a hot path in any conceivable
"kill off hundreds of containers" benchmark, since the overhead
of having all of the tasks in those containers exit will dwarf
any changes in this code.

Simply making it safe for fully preemptible kernels by adding
rcu_read_lock() around the section is what matters the most.

The choice between idr_for_each_entry_continue() and idr_for_each()
is dictated more by which of the two results in easier to read code.

-- 
All rights reversed
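
For illustration only, the shape of the fix discussed above (take the
read-side lock once around the whole walk, start at pid 2, skip tasks
that already have a fatal signal pending) can be modelled in plain
userspace C. This is a rough sketch and not kernel code: every name
prefixed with model_ is hypothetical, the lock is just a counter
standing in for rcu_read_lock()/rcu_read_unlock(), and the fixed array
stands in for the IDR.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
#define MODEL_MAX_PID 16

struct model_task {
	int fatal_signal_pending;	/* models __fatal_signal_pending(task) */
	int killed;			/* set when the model "sends SIGKILL" */
};

struct model_pid_ns {
	struct model_task *slot[MODEL_MAX_PID];	/* NULL means unused pid */
	int read_lock_depth;	/* models rcu_read_lock() nesting */
	int unlocked_lookups;	/* lookups made outside the read section */
};

static void model_read_lock(struct model_pid_ns *ns)
{
	ns->read_lock_depth++;
}

static void model_read_unlock(struct model_pid_ns *ns)
{
	ns->read_lock_depth--;
}

/*
 * Return the first populated entry at or after *id, advancing *id to it,
 * or NULL when the table is exhausted. Any call made without the read
 * lock held is counted so a caller's locking mistake becomes visible.
 */
static struct model_task *model_get_next(struct model_pid_ns *ns, int *id)
{
	if (ns->read_lock_depth == 0)
		ns->unlocked_lookups++;
	for (; *id < MODEL_MAX_PID; (*id)++)
		if (ns->slot[*id])
			return ns->slot[*id];
	return NULL;
}

/*
 * Walk every entry from pid 2 upward inside one read-side section and
 * "kill" each task that is not already dying. Returns the number of
 * signals the model sent.
 */
static int model_zap_pid_ns(struct model_pid_ns *ns)
{
	struct model_task *task;
	int nr = 2;
	int sent = 0;

	model_read_lock(ns);
	for (task = model_get_next(ns, &nr); task;
	     nr++, task = model_get_next(ns, &nr)) {
		if (!task->fatal_signal_pending) {
			task->killed = 1;
			sent++;
		}
	}
	model_read_unlock(ns);

	return sent;
}
```

Holding the lock across the whole walk, rather than taking and dropping
it per entry as the removed next_pidmap() loop did, keeps the model
short; whether that is also the right granularity in the real function
is for the patch discussion to settle.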