All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: "Zhoujian (jay)" <jianjay.zhou@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "mst@redhat.com" <mst@redhat.com>,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	"marcandre.lureau@redhat.com" <marcandre.lureau@redhat.com>,
	"Huangweidong (C)" <weidong.huang@huawei.com>,
	"wangxin (U)" <wangxinxin.wang@huawei.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH] vhost-user: quit infinite loop while used memslots is more than the backend limit
Date: Wed, 20 Dec 2017 16:56:59 +0800	[thread overview]
Message-ID: <ce633ca0-92c8-de17-fc94-cd9454ca29ad@redhat.com> (raw)
In-Reply-To: <B2D15215269B544CADD246097EACE7473AB0D08F@DGGEMM505-MBS.china.huawei.com>



On 2017年12月19日 18:21, Zhoujian (jay) wrote:
>
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Tuesday, December 19, 2017 5:24 PM
>> To: Zhoujian (jay) <jianjay.zhou@huawei.com>; qemu-devel@nongnu.org
>> Cc: mst@redhat.com; Gonglei (Arei) <arei.gonglei@huawei.com>;
>> marcandre.lureau@redhat.com; Huangweidong (C) <weidong.huang@huawei.com>;
>> wangxin (U) <wangxinxin.wang@huawei.com>
>> Subject: Re: [RFC PATCH] vhost-user: quit infinite loop while used
>> memslots is more than the backend limit
>>
>>
>>
>> On 2017年12月13日 15:04, Jay Zhou wrote:
>>> Steps to 100% reproduce:
>>> (1) start a VM with two VFIO 82599 PCI NICs successfully
>>> (2) hot plug a RAM, which is attached successfully
>>> (3) hot plug another RAM, which is attached successfully
>>> (4) hot plug a vhost-user NIC, which is attached successfully
>>> (5) hot plug another vhost-user NIC, which is hanged
>>>
>>> then the qemu process infinitely and quickly print the logs below:
>>>       [...]
>>>       vhost backend memory slots limit is less than current number of
>> present memory slots
>>>       failed to init vhost_net for queue 0
>>>       vhost backend memory slots limit is less than current number of
>> present memory slots
>>>       failed to init vhost_net for queue 0
>>>       vhost backend memory slots limit is less than current number of
>> present memory slots
>>>       failed to init vhost_net for queue 0
>>>       [...]
>>>
>>> and the cpu utilization of the systemd-journal process on host is
>>> 100%, the reason is clear:
>>>       "used_memslots > hdev->vhost_ops-
>>> vhost_backend_memslots_limit(hdev)"
>>> is true in vhost_dev_init, and the main thread in the infinite do and
>>> while loop in net_vhost_user_init.
>>>
>>> One way is to increase the vhost backend memory slots limit, some
>>> discussions are here:
>>>
>>> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg04572.html
>>>
>>> I think if
>>>       "used_memslots > hdev->vhost_ops-
>>> vhost_backend_memslots_limit(hdev)"
>>> is true, it does not make sense to try to initialize the vhost-user
>>> again and again, because the main thread can not handle other qmps at
>>> all, which means it can not reduce the number of memslots, it must be
>>> failed at last.
>> You should include the reason why we hit infinite loop in the commit log,
>> and what's more important, is this issue specific to memslots only?
>> What if you hit errors other than this in vhost_dev_init() for vhost-user?
> If the new hot plugged vhost-user NIC failed to initialize in vhost_dev_init,
> the s->started in net_vhost_user_init is always false, so we hit infinite loop
> in net_vhost_user_init().
> It is not specific to memslots only, e.g. if the backend lacks some feature
> masks, it will hit infinite loop too.
>
> Is it available to identify the cases that MAY be failed and MUST be failed
> at initialization?
>      - If it must be failed, quit the loop at once.
>      - If it may be failed, then we should do the loop to wait the connection
>        established, i.e. wait the s->started becomes true. And some sleep time
>        may be added to reduce the CPU utilization.

Then I think you probably need another flag e.g s->error and set it on 
fatal error.

>
>> Is this better to disable event source in this case?
> The "event source" you mean is the logs printed?
> If it is disabled, the reason why initialization failed is not explicit,
> this is the drawback.
>

No I mean the thing that triggers the loop which is s->started here.

Thanks

> Regards,
> Jay
>
>> Thanks
>>
>>> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
>>> ---
>>>    hw/virtio/vhost.c         | 3 +++
>>>    include/hw/virtio/vhost.h | 2 ++
>>>    net/vhost-user.c          | 5 +++++
>>>    3 files changed, 10 insertions(+)
>>>
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index
>>> e4290ce..1cc4a18 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -47,6 +47,8 @@ static unsigned int used_memslots;
>>>    static QLIST_HEAD(, vhost_dev) vhost_devices =
>>>        QLIST_HEAD_INITIALIZER(vhost_devices);
>>>
>>> +bool used_memslots_exceeded;
>>> +
>>>    bool vhost_has_free_slot(void)
>>>    {
>>>        unsigned int slots_limit = ~0U;
>>> @@ -1254,6 +1256,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void
>> *opaque,
>>>        if (used_memslots > hdev->vhost_ops-
>>> vhost_backend_memslots_limit(hdev)) {
>>>            error_report("vhost backend memory slots limit is less"
>>>                    " than current number of present memory slots");
>>> +        used_memslots_exceeded = true;
>>>            r = -1;
>>>            goto fail;
>>>        }
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index 467dc77..bcc66d9 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -6,6 +6,8 @@
>>>    #include "hw/virtio/virtio.h"
>>>    #include "exec/memory.h"
>>>
>>> +extern bool used_memslots_exceeded;
>>> +
>>>    /* Generic structures common for any vhost based device. */
>>>    struct vhost_virtqueue {
>>>        int kick;
>>> diff --git a/net/vhost-user.c b/net/vhost-user.c index
>>> c23927c..8b7bfff 100644
>>> --- a/net/vhost-user.c
>>> +++ b/net/vhost-user.c
>>> @@ -17,6 +17,7 @@
>>>    #include "qemu/error-report.h"
>>>    #include "qmp-commands.h"
>>>    #include "trace.h"
>>> +#include "include/hw/virtio/vhost.h"
>>>
>>>    typedef struct VhostUserState {
>>>        NetClientState nc;
>>> @@ -311,6 +312,10 @@ static int net_vhost_user_init(NetClientState *peer,
>> const char *device,
>>>            qemu_chr_fe_set_handlers(&s->chr, NULL, NULL,
>>>                                     net_vhost_user_event, NULL, nc0->name,
>> NULL,
>>>                                     true);
>>> +        if (used_memslots_exceeded) {
>>> +            error_report("used memslots exceeded the backend limit,
>> quit loop");
>>> +            return -1;
>>> +        }
>>>        } while (!s->started);
>>>
>>>        assert(s->vhost_net);

      reply	other threads:[~2017-12-20  8:57 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-13  7:04 [Qemu-devel] [RFC PATCH] vhost-user: quit infinite loop while used memslots is more than the backend limit Jay Zhou
2017-12-13  7:11 ` no-reply
2017-12-13  7:38 ` no-reply
2017-12-19  9:23 ` Jason Wang
2017-12-19 10:21   ` Zhoujian (jay)
2017-12-20  8:56     ` Jason Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ce633ca0-92c8-de17-fc94-cd9454ca29ad@redhat.com \
    --to=jasowang@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=jianjay.zhou@huawei.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wangxinxin.wang@huawei.com \
    --cc=weidong.huang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.