Subject: Re: [PATCH 2/6] xenbus: implement the xenwatch multithreading framework
To: Dongli Zhang <dongli.zhang@oracle.com>, xen-devel@lists.xenproject.org,
 linux-kernel@vger.kernel.org
Cc: boris.ostrovsky@oracle.com, paul.durrant@citrix.com, wei.liu2@citrix.com,
 konrad.wilk@oracle.com, roger.pau@citrix.com, srinivas.eeda@oracle.com
References: <1536910456-13337-1-git-send-email-dongli.zhang@oracle.com>
 <1536910456-13337-3-git-send-email-dongli.zhang@oracle.com>
From: Juergen Gross <jgross@suse.com>
Message-ID: <2084a3de-29bc-f255-5134-879f1aeb1329@suse.com>
Date: Fri, 14 Sep 2018 10:56:23 +0200
In-Reply-To: <1536910456-13337-3-git-send-email-dongli.zhang@oracle.com>

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 2nd patch of a (6-patch) patch set.
> 
> This patch implements the xenwatch multithreading framework to create or
> destroy the per-domU xenwatch thread. The xenwatch thread is created or
> destroyed during xenbus device probing or removing (that is,
> xenbus_dev_probe() or xenbus_dev_remove()) if the corresponding pv driver
> has the xenwatch multithreading feature enabled. As there is only one
> per-domU xenwatch thread for each domid, probing a xenbus device for the
> same domid again does not create another thread for that domid, but only
> increments the reference count of the thread's mtwatch domain. When a
> xenbus device is removed, the reference count is decremented. The per-domU
> xenwatch thread is destroyed when the reference count of its mtwatch domain
> reaches zero, that is, when all xenbus devices (whose mtwatch feature is
> enabled) of that mtwatch domain have been removed.
> 
> Therefore, a domid has its own per-domU xenwatch thread only when it is
> attached to a dom0 backend xenbus device whose pv driver has the feature
> enabled. The domid does not have its own xenwatch thread when it is not
> running any mtwatch-enabled xenbus device.
> 
> When a watch (with xenwatch multithreading enabled) is unregistered, we
> generally traverse all mtwatch domains to remove all inflight pending
> events fired by that watch. However, one optimization in this patch is
> that we only need to remove pending events from a specific mtwatch domain
> when the watch is registered for a specific domid, that is, when its
> owner_id field is non-zero.
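
Just for context (not part of the quoted patch): as far as I understand the
series, a pv driver opts in simply by setting the new use_mtwatch flag in
its struct xenbus_driver. A made-up example, with placeholder names for the
driver and its callbacks:

static struct xenbus_driver example_backend_driver = {
        .ids = example_backend_ids,
        .probe = example_backend_probe,
        .remove = example_backend_remove,
        .otherend_changed = example_backend_changed,
        .use_mtwatch = true,    /* enable the per-domU xenwatch thread */
};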
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  drivers/xen/xenbus/xenbus_probe.c |   6 +
>  drivers/xen/xenbus/xenbus_xs.c    | 273 ++++++++++++++++++++++++++++++++++++++
>  include/xen/xenbus.h              |   3 +
>  3 files changed, 282 insertions(+)
> 
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index 5b47188..5755596 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -236,6 +236,9 @@ int xenbus_dev_probe(struct device *_dev)
>          if (err)
>                  goto fail;
>  
> +        if (xen_mtwatch && drv->use_mtwatch)
> +                mtwatch_create_domain(dev->otherend_id);
> +
>          err = watch_otherend(dev);
>          if (err) {
>                  dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
> @@ -263,6 +266,9 @@ int xenbus_dev_remove(struct device *_dev)
>          if (drv->remove)
>                  drv->remove(dev);
>  
> +        if (xen_mtwatch && drv->use_mtwatch)
> +                mtwatch_put_domain(dev->otherend_id);
> +
>          free_otherend_details(dev);
>  
>          xenbus_switch_state(dev, XenbusStateClosed);
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index 3f137d2..741dc54 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -108,6 +108,201 @@ static __init int xen_parse_mtwatch(char *arg)
>  }
>  early_param("xen_mtwatch", xen_parse_mtwatch);
>  
> +struct mtwatch_domain *mtwatch_find_domain(domid_t domid)
> +{
> +        struct mtwatch_domain *domain;
> +        int hash = MTWATCH_HASH(domid);
> +        struct hlist_head *hash_head = &mtwatch_info->domain_hash[hash];
> +
> +        hlist_for_each_entry_rcu(domain, hash_head, hash_node) {
> +                if (domain->domid == domid)
> +                        return domain;
> +        }
> +
> +        return NULL;
> +}
> +
> +/* per-domU thread for xenwatch multithreading. */
> +static int mtwatch_thread(void *arg)
> +{
> +        struct mtwatch_domain *domain = (struct mtwatch_domain *) arg;
> +        struct list_head *ent;
> +        struct xs_watch_event *event;
> +
> +        domain->pid = current->pid;
> +
> +        for (;;) {
> +                wait_event_interruptible(domain->events_wq,
> +                                !list_empty(&domain->events) ||
> +                                domain->state == MTWATCH_DOMAIN_DOWN);
> +
> +                if (domain->state == MTWATCH_DOMAIN_DOWN &&
> +                    list_empty(&domain->events))
> +                        break;
> +
> +                mutex_lock(&domain->domain_mutex);
> +
> +                spin_lock(&domain->events_lock);
> +                ent = domain->events.next;
> +                if (ent != &domain->events)
> +                        list_del(ent);
> +                spin_unlock(&domain->events_lock);
> +
> +                if (ent != &domain->events) {
> +                        event = list_entry(ent, struct xs_watch_event, list);
> +                        event->handle->callback(event->handle, event->path,
> +                                                event->token);
> +                        kfree(event);
> +                }
> +
> +                mutex_unlock(&domain->domain_mutex);
> +        }

This is nearly the same coding as xenwatch_thread(). Why don't you define a
new (sub-)structure containing the needed elements and move xenwatch_pid,
watch_events_waitq, watch_events, xenwatch_mutex and watch_events_lock into
such a structure? Then you could use a common function for both purposes
(you'd need to set the state for xenwatch_thread() to DOMAIN_UP and add a
callback for testing the thread end condition).
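
Something along these lines (only a rough, untested sketch to illustrate
the idea; the structure and function names below are made up, and it reuses
struct xs_watch_event from this file):

struct xenwatch_queue {
        struct list_head events;
        spinlock_t events_lock;
        wait_queue_head_t events_wq;
        struct mutex events_mutex;
        pid_t pid;
        /* return true when this thread is allowed to terminate */
        bool (*should_stop)(struct xenwatch_queue *q);
};

static int xenwatch_queue_thread(void *arg)
{
        struct xenwatch_queue *q = arg;
        struct xs_watch_event *event;

        q->pid = current->pid;

        for (;;) {
                wait_event_interruptible(q->events_wq,
                                         !list_empty(&q->events) ||
                                         q->should_stop(q));

                if (q->should_stop(q) && list_empty(&q->events))
                        break;

                mutex_lock(&q->events_mutex);

                /* Dequeue at most one event under the lock. */
                spin_lock(&q->events_lock);
                event = list_first_entry_or_null(&q->events,
                                                 struct xs_watch_event, list);
                if (event)
                        list_del(&event->list);
                spin_unlock(&q->events_lock);

                if (event) {
                        event->handle->callback(event->handle, event->path,
                                                event->token);
                        kfree(event);
                }

                mutex_unlock(&q->events_mutex);
        }

        return 0;
}

The default xenwatch thread would embed such a structure and use a
should_stop() callback that always returns false, while the per-domU
threads would test domain->state == MTWATCH_DOMAIN_DOWN and do their
cleanup after the loop, as in the patch.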
> +
> +        /*
> +         * domain->state is already set to MTWATCH_DOMAIN_DOWN (to avoid
> +         * new event to domain->events) when above for loop breaks, so
> +         * that there is no requirement to cleanup domain->events again.
> +         */
> +
> +        spin_lock(&mtwatch_info->domain_lock);
> +        list_del_rcu(&domain->list_node);
> +        spin_unlock(&mtwatch_info->domain_lock);
> +
> +        spin_lock(&mtwatch_info->purge_lock);
> +        list_add(&domain->purge_node, &mtwatch_info->purge_list);
> +        spin_unlock(&mtwatch_info->purge_lock);
> +
> +        schedule_work(&mtwatch_info->purge_work);
> +
> +        return 0;
> +}
> +
> +static void delayed_destroy_domain(struct rcu_head *head)
> +{
> +        struct mtwatch_domain *domain;
> +
> +        domain = container_of(head, struct mtwatch_domain, rcu);
> +        kfree(domain);
> +}
> +
> +static void xen_mtwatch_purge_domain(struct work_struct *work)
> +{
> +        struct mtwatch_domain *domain;
> +        struct list_head *node;
> +
> +        while (!list_empty(&mtwatch_info->purge_list)) {
> +
> +                spin_lock(&mtwatch_info->purge_lock);
> +                node = mtwatch_info->purge_list.next;
> +                if (node != &mtwatch_info->purge_list)
> +                        list_del(node);
> +                spin_unlock(&mtwatch_info->purge_lock);
> +
> +                if (node != &mtwatch_info->purge_list) {
> +                        domain = list_entry(node, struct mtwatch_domain,
> +                                            purge_node);
> +                        kthread_stop(domain->task);
> +
> +                        call_rcu(&domain->rcu, delayed_destroy_domain);
> +                }
> +        }
> +}
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_create_domain(domid_t domid)
> +{
> +        struct mtwatch_domain *domain;
> +
> +        if (!domid) {
> +                pr_err("Default xenwatch thread is for dom0\n");

Should we really exclude dom0? What if a driver domain wants to support a
dom0 based frontend?

> +                return;
> +        }
> +
> +        spin_lock(&mtwatch_info->domain_lock);
> +
> +        domain = mtwatch_find_domain(domid);
> +        if (domain) {
> +                atomic_inc(&domain->refcnt);
> +                spin_unlock(&mtwatch_info->domain_lock);
> +                return;
> +        }
> +
> +        domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
> +        if (!domain) {
> +                spin_unlock(&mtwatch_info->domain_lock);
> +                pr_err("Failed to allocate memory for mtwatch thread %d\n",
> +                       domid);

No alloc error messages, please!

> +                return;
> +        }
> +
> +        domain->domid = domid;
> +        atomic_set(&domain->refcnt, 1);
> +        mutex_init(&domain->domain_mutex);
> +        INIT_LIST_HEAD(&domain->purge_node);
> +
> +        init_waitqueue_head(&domain->events_wq);
> +        spin_lock_init(&domain->events_lock);
> +        INIT_LIST_HEAD(&domain->events);
> +
> +        list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
> +
> +        hlist_add_head_rcu(&domain->hash_node,
> +                           &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
> +
> +        spin_unlock(&mtwatch_info->domain_lock);
> +
> +        domain->task = kthread_run(mtwatch_thread, domain,
> +                                   "xen-mtwatch-%d", domid);
> +
> +        if (!domain->task) {
> +                pr_err("mtwatch kthread creation is failed\n");
> +                domain->state = MTWATCH_DOMAIN_DOWN;
> +
> +                return;
> +        }
> +
> +        domain->state = MTWATCH_DOMAIN_UP;
> +}
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_put_domain(domid_t domid)
> +{
> +        struct mtwatch_domain *domain;
> +
> +        spin_lock(&mtwatch_info->domain_lock);
> +
> +        domain = mtwatch_find_domain(domid);
> +        if (!domain) {
> +                spin_unlock(&mtwatch_info->domain_lock);
> +                pr_err("mtwatch kthread for domid=%d does not exist\n",
> +                       domid);
> +                return;
> +        }
> +
> +        if (atomic_dec_and_test(&domain->refcnt)) {
> +
> +                hlist_del_rcu(&domain->hash_node);
> +
> +                if (!domain->task) {
> +                        /*
> +                         * As the task failed to initialize during
> +                         * mtwatch_create_domain(), we do not need to wait
> +                         * for the kernel thread to complete.
> +                         */
> +                        list_del_rcu(&domain->list_node);
> +                        call_rcu(&domain->rcu, delayed_destroy_domain);
> +                } else {
> +                        spin_lock(&domain->events_lock);
> +                        domain->state = MTWATCH_DOMAIN_DOWN;
> +                        spin_unlock(&domain->events_lock);
> +
> +                        wake_up(&domain->events_wq);
> +                }
> +        }
> +
> +        spin_unlock(&mtwatch_info->domain_lock);
> +}
> +
>  static void xs_suspend_enter(void)
>  {
>          spin_lock(&xs_state_lock);
> @@ -793,6 +988,80 @@ int register_xenbus_watch(struct xenbus_watch *watch)
>  }
>  EXPORT_SYMBOL_GPL(register_xenbus_watch);
>  
> +static void __unregister_single_mtwatch(struct xenbus_watch *watch,
> +                                        struct mtwatch_domain *domain)
> +{
> +        struct xs_watch_event *event, *tmp;
> +
> +        if (current->pid != domain->pid)
> +                mutex_lock(&domain->domain_mutex);
> +
> +        spin_lock(&domain->events_lock);
> +        list_for_each_entry_safe(event, tmp,
> +                                 &domain->events, list) {
> +                if (event->handle != watch)
> +                        continue;
> +                list_del(&event->list);
> +                kfree(event);
> +        }
> +        spin_unlock(&domain->events_lock);
> +
> +        if (current->pid != domain->pid)
> +                mutex_unlock(&domain->domain_mutex);
> +}
> +
> +static void unregister_single_mtwatch(struct xenbus_watch *watch,
> +                                      domid_t domid)
> +{
> +        struct mtwatch_domain *domain;
> +        bool found = false;
> +
> +        rcu_read_lock();
> +
> +        list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
> +                                list_node) {
> +                if (domain->domid == domid) {
> +                        found = true;
> +                        __unregister_single_mtwatch(watch, domain);
> +                }
> +        }
> +
> +        WARN_ON_ONCE(unlikely(!found));
> +
> +        rcu_read_unlock();
> +}
> +
> +static void unregister_all_mtwatch(struct xenbus_watch *watch)
> +{
> +        struct mtwatch_domain *domain;
> +
> +        rcu_read_lock();
> +
> +        list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
> +                                list_node) {
> +                __unregister_single_mtwatch(watch, domain);
> +        }
> +
> +        rcu_read_unlock();
> +}
> +
> +static void unregister_mtwatch(struct xenbus_watch *watch)
> +{
> +        /*
> +         * Generally, to unregister a watch, we need to traverse all
> +         * mtwatch domains to remove all inflight pending watch events for
> +         * such watch.
> +         *
> +         * One exception is we only need to remove pending watch events
> +         * from a single mtwatch domain when the watch is registered for a
> +         * specific domid.
> +         */
> +        if (watch->owner_id)

Again: 0 as a special value isn't a good idea. Maybe use one of the
reserved DOMIDs, like DOMID_SELF?


Juergen
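
PS: a rough sketch of what I mean (assuming owner_id would be initialized
to DOMID_SELF for watches that are not bound to a specific domain):

        if (watch->owner_id != DOMID_SELF)
                unregister_single_mtwatch(watch, watch->owner_id);
        else
                unregister_all_mtwatch(watch);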