From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752764AbbEDLki (ORCPT ); Mon, 4 May 2015 07:40:38 -0400
Received: from ip4-83-240-67-251.cust.nbox.cz ([83.240.67.251]:50919
	"EHLO ip4-83-240-18-248.cust.nbox.cz" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1752323AbbEDLk2 (ORCPT );
	Mon, 4 May 2015 07:40:28 -0400
From: Jiri Slaby
To: live-patching@vger.kernel.org
Cc: jpoimboe@redhat.com, sjenning@redhat.com, jkosina@suse.cz,
	vojtech@suse.cz, mingo@redhat.com, linux-kernel@vger.kernel.org,
	Miroslav Benes, Oleg Nesterov, Peter Zijlstra, Jiri Slaby
Subject: [RFC kgr on klp 9/9] livepatch: send a fake signal to all tasks
Date: Mon, 4 May 2015 13:40:25 +0200
Message-Id: <1430739625-4658-9-git-send-email-jslaby@suse.cz>
X-Mailer: git-send-email 2.3.5
In-Reply-To: <1430739625-4658-1-git-send-email-jslaby@suse.cz>
References: <1430739625-4658-1-git-send-email-jslaby@suse.cz>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Miroslav Benes

The kGraft consistency model is LEAVE_KERNEL and SWITCH_THREAD. This
means that all tasks in the system have to be marked one by one as safe
to call a new patched function. The safe place is the boundary between
kernel and userspace. The patching waits for all tasks to cross this
boundary and finishes the process afterwards.

The problem is that a task can block the finalization of the patching
process for quite a long time, if not forever. The task could sleep
somewhere in the kernel, or it could be running in userspace with no
prospect of entering the kernel and thus going through the safe place.

Luckily we can force the task to do that by sending it a fake signal,
that is, a signal with no data in the signal pending structures (no
handler, no sign of a proper signal delivered). Suspend and the freezer
use this to freeze tasks as well.
The task gets TIF_SIGPENDING set and is woken up (if it was sleeping in
the kernel) or kicked by a rescheduling IPI (if it was running on
another CPU). This causes the task to go to the kernel/userspace
boundary, where the signal would be handled and the task would be
marked as safe in terms of live patching.

There are tasks which are not affected by this technique, though. The
fake signal is not sent to kthreads; they should be handled in a
different way. Also, if a task is in TASK_RUNNING state but not
currently running on any CPU, it doesn't get the IPI, but it would
eventually handle the signal anyway. Last, if the task runs in the
kernel (in TASK_RUNNING state), it gets the IPI, but the signal is not
handled on return from the interrupt. It would be handled on return to
userspace in the future.

If the task was sleeping in a syscall, it would be woken by our fake
signal, check whether TIF_SIGPENDING is set (by calling the
signal_pending() predicate) and return ERESTART* or EINTR. Syscalls
with ERESTART* return values are restarted in case of the fake signal
(see do_signal()). EINTR is propagated back to the userspace program.
This could disturb the program, but...

* each process dealing with signals should react accordingly to EINTR
  return values.
* syscalls returning EINTR are quite common in the system even when no
  fake signal is sent.
* freezer sends the fake signal and does not deal with EINTR in any
  way. Thus EINTR values are returned when the system is resumed.

The actual marking of a task as safe is done in entry_64.S on the
syscall and interrupt/exception exit paths.

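For illustration only (this is not part of the patch): the first bullet
above expects userspace to cope with EINTR. A minimal sketch of what
that usually looks like for read() follows. The helper name
read_retry_eintr() is hypothetical, and the sketch assumes the caller
simply wants the call restarted rather than treating the interruption
as meaningful.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace helper, not part of this patch: read from fd
 * and restart the call whenever it fails with EINTR, which is what a
 * program would see if a syscall were interrupted without being
 * restarted by the kernel.
 */
ssize_t read_retry_eintr(int fd, void *buf, size_t count)
{
	ssize_t n;

	assert(buf != NULL);

	do {
		n = read(fd, buf, count);
	} while (n < 0 && errno == EINTR);

	/* n >= 0: bytes read (0 means EOF); n < 0: a real error in errno */
	return n;
}
```

A program that already wraps its blocking syscalls this way should not
notice the fake signal beyond one extra loop iteration.
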
Signed-off-by: Miroslav Benes
Reviewed-by: Jiri Kosina
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Signed-off-by: Jiri Slaby
---
 kernel/livepatch/cmodel-kgraft.c | 23 +++++++++++++++++++++++
 kernel/signal.c                  |  3 ++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/kernel/livepatch/cmodel-kgraft.c b/kernel/livepatch/cmodel-kgraft.c
index 196b08823f73..fd041ca30161 100644
--- a/kernel/livepatch/cmodel-kgraft.c
+++ b/kernel/livepatch/cmodel-kgraft.c
@@ -107,6 +107,27 @@ static bool klp_kgraft_still_patching(void)
 	return failed;
 }
 
+static void klp_kgraft_send_fake_signal(void)
+{
+	struct task_struct *p;
+	unsigned long flags;
+
+	read_lock(&tasklist_lock);
+	for_each_process(p) {
+		/*
+		 * send fake signal to all non-kthread processes which are still
+		 * not migrated
+		 */
+		if (!(p->flags & PF_KTHREAD) &&
+				klp_kgraft_task_in_progress(p) &&
+				lock_task_sighand(p, &flags)) {
+			signal_wake_up(p, 0);
+			unlock_task_sighand(p, &flags);
+		}
+	}
+	read_unlock(&tasklist_lock);
+}
+
 static void klp_kgraft_work_fn(struct work_struct *work)
 {
 	static bool printed = false;
@@ -117,6 +138,8 @@ static void klp_kgraft_work_fn(struct work_struct *work)
 				KGRAFT_TIMEOUT);
 		printed = true;
 	}
+	/* send fake signal */
+	klp_kgraft_send_fake_signal();
 	/* recheck again later */
 	queue_delayed_work(klp_kgraft_wq, &klp_kgraft_work,
 			KGRAFT_TIMEOUT * HZ);
diff --git a/kernel/signal.c b/kernel/signal.c
index d51c5ddd855c..5a3f56a69122 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -157,7 +157,8 @@ void recalc_sigpending_and_wake(struct task_struct *t)
 
 void recalc_sigpending(void)
 {
-	if (!recalc_sigpending_tsk(current) && !freezing(current))
+	if (!recalc_sigpending_tsk(current) && !freezing(current) &&
+			!klp_kgraft_task_in_progress(current))
 		clear_thread_flag(TIF_SIGPENDING);
 
 }
-- 
2.3.5
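
For illustration only (not part of the patch): the kernel/signal.c hunk
matters because a fake signal leaves nothing in the pending structures,
so without the extra condition recalc_sigpending() would clear
TIF_SIGPENDING again. The userspace toy model below sketches that
logic. struct toy_task, its fields and toy_recalc_sigpending() are all
hypothetical stand-ins, and the model assumes recalc_sigpending_tsk()
sets the flag and returns true only when a real signal is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for per-task state; all names here are hypothetical. */
struct toy_task {
	bool sigpending;      /* models TIF_SIGPENDING */
	bool has_real_signal; /* models recalc_sigpending_tsk() finding work */
	bool freezing;        /* models freezing(current) */
	bool klp_in_progress; /* models klp_kgraft_task_in_progress(current) */
};

/*
 * Model of recalc_sigpending(). with_patch selects between the
 * condition before this patch (false) and after it (true).
 */
void toy_recalc_sigpending(struct toy_task *t, bool with_patch)
{
	assert(t != NULL);

	if (t->has_real_signal) {
		/* a real signal is pending: the flag must be set */
		t->sigpending = true;
		return;
	}
	if (t->freezing)
		return; /* freezer's fake signal: keep the flag */
	if (with_patch && t->klp_in_progress)
		return; /* kGraft's fake signal: keep the flag */

	t->sigpending = false;
}
```

In this model, a task with only the fake signal set keeps its flag when
with_patch is true and loses it when with_patch is false, which is the
case the added condition is meant to cover.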