On Sat, Jun 05, 2021 at 06:53:58PM -0500, michael.christie@oracle.com wrote:
> On 6/3/21 9:30 AM, Stefan Hajnoczi wrote:
> >> +        if (info->pid == VHOST_VRING_NEW_WORKER) {
> >> +                worker = vhost_worker_create(dev);
> >
> > The maximum number of kthreads created is limited by
> > vhost_dev_init(nvqs)? For example, VHOST_SCSI_MAX_VQ is 128.
> >
> > IIUC kthread_create is not limited by or accounted against the current
> > task, so I'm a little worried that a process can create a lot of
> > kthreads.
> >
> > I haven't investigated other kthread_create() users reachable from
> > userspace applications to see how they bound the number of threads
> > effectively.
>
> Do we want something like io_uring's use of copy_process? It's what fork
> uses, so we get checks like RLIMIT_NPROC and max_threads.
>
> I know I didn't look at everything, but it looks like for some software
> drivers we just allow the user to run wild. For example, for nbd, when we
> create the device we do alloc_workqueue and use the default max_active
> value (256). We then don't have a limit on connections, so we could end
> up with 256 workqueue threads per device. And then there is no limit on
> the number of devices a user can make.
>
> >
> > Any thoughts?
> >
>
> Is the concern that a bad VM could create N devs each with 128 vqs/threads
> and it would slow down other VMs? How do we handle the case where some VM
> makes M * N devs whose total vq/thread count equals N * 128, so we would
> end up with the same number of threads either way? Is there a limit to the
> number of vhost devices a VM can make, and can I just stick in a similar
> check for workers?
>
> For vhost-scsi specifically, the 128 limit does not make a lot of sense.
> I think we want the max to be the number of vCPUs the VM has so we can
> add checks for that. Then we would assume someone making a VM with lots of
> CPUs is going to have the resources to support them.
>
> Note: It does make sense from the point of view that we don't know the
> number of vCPUs when vhost-scsi calls vhost_dev_init, so I get why we had
> to select an initial limit.

My concern is that threads should probably be accounted against
RLIMIT_NPROC and max_threads rather than something indirect like 128 *
RLIMIT_NOFILE (a userspace process can only have RLIMIT_NOFILE vhost-user
file descriptors open).

> >> +                if (!dev->workers) {
> >> +                        vhost_worker_put(worker);
> >> +                        return -ENOMEM;
> >> +                }
> >> +        }
> >> +
> >> +        vq->worker = worker;
> >> +
> >> +        dev->workers[dev->num_workers] = worker;
> >> +        dev->num_workers++;
> >
> > Hmm... should we really append to workers[] in the vhost_worker_find()
> > case?
>
> As it's coded now, yes. Every successful vhost_worker_find call does a
> get on the worker's refcount. Later, when we delete the device, we loop
> over the workers array and for every entry we do a put.
>
> I can add in some code to first check if the worker is already in the
> dev's worker list. If so, skip the refcount and skip adding to the
> workers array. If not in the dev's worker list, then do a
> vhost_worker_find.
>
> I thought it might be nicer to keep it how it is now, with the single
> path. It's less code at least. Later, if we add support to change a vq's
> worker, then we also don't have to worry about refcounts as much. We just
> always drop the count taken from when it was added.

Thanks for explaining.

Stefan
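
P.S. To make the gap between the two bounds concrete, here is a rough,
untested userspace sketch. It is not from the patch and has nothing
vhost-specific in it; the 128 is just VHOST_SCSI_MAX_VQ used as a
stand-in for "worker kthreads a single fd could spawn":

/*
 * Illustrative only: compare the indirect bound on worker kthreads
 * (RLIMIT_NOFILE open vhost fds, up to 128 vqs/threads each, none of
 * them charged to the task) with the direct RLIMIT_NPROC bound that
 * fork()/copy_process() enforces.
 */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
        struct rlimit nofile, nproc;

        if (getrlimit(RLIMIT_NOFILE, &nofile) ||
            getrlimit(RLIMIT_NPROC, &nproc)) {
                perror("getrlimit");
                return 1;
        }

        if (nofile.rlim_cur == RLIM_INFINITY ||
            nproc.rlim_cur == RLIM_INFINITY) {
                printf("a limit is unlimited; the comparison is moot\n");
                return 0;
        }

        /* worst case if each vhost fd can spawn 128 unaccounted kthreads */
        unsigned long long indirect =
                (unsigned long long)nofile.rlim_cur * 128;

        printf("indirect bound (RLIMIT_NOFILE * 128): %llu threads\n",
               indirect);
        printf("direct bound   (RLIMIT_NPROC):        %llu threads\n",
               (unsigned long long)nproc.rlim_cur);
        return 0;
}

With, for example, a common 1024 fd soft limit the indirect bound is
already 131072 kthreads, which is the gap that accounting against
RLIMIT_NPROC/max_threads would close.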
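
And for the workers[] discussion, a minimal userspace model of the
"single path" refcounting scheme described above. The names are
simplified stand-ins, not the actual vhost_worker_find/vhost_worker_put
code: every lookup takes a reference and appends to the device's array,
and teardown does one put per entry, so a shared worker is freed only on
the last put:

/*
 * Simplified model (made-up names, not the kernel code): every attach
 * takes a reference on the worker and records it in the device's
 * workers array; device teardown does exactly one put per entry.
 */
#include <stdio.h>
#include <stdlib.h>

struct worker {
        int id;         /* stand-in for the kthread */
        int refcount;
};

struct dev {
        struct worker **workers;
        int num_workers;
};

static struct worker *worker_get(struct worker *w)
{
        w->refcount++;
        return w;
}

static void worker_put(struct worker *w)
{
        if (--w->refcount == 0) {
                printf("freeing worker %d\n", w->id);
                free(w);
        }
}

/*
 * "find or create" path: always returns a referenced worker and always
 * appends it to dev->workers[], even if it is already in the array.
 */
static struct worker *dev_attach_worker(struct dev *d, struct worker *existing)
{
        struct worker *w;

        if (existing) {
                w = worker_get(existing);       /* the "find" case */
        } else {
                w = calloc(1, sizeof(*w));      /* the "create" case */
                if (!w)
                        return NULL;
                w->id = d->num_workers;
                w->refcount = 1;
        }

        d->workers[d->num_workers++] = w;
        return w;
}

static void dev_cleanup(struct dev *d)
{
        /* one put per entry balances the ref taken when it was added */
        for (int i = 0; i < d->num_workers; i++)
                worker_put(d->workers[i]);
        d->num_workers = 0;
}

int main(void)
{
        struct worker *slots[3];
        struct dev d = { .workers = slots, .num_workers = 0 };
        struct worker *w;

        w = dev_attach_worker(&d, NULL);        /* vq0 creates the worker */
        dev_attach_worker(&d, w);               /* vq1 shares it (find)   */
        dev_attach_worker(&d, w);               /* vq2 shares it (find)   */

        dev_cleanup(&d);        /* three puts; the worker is freed once */
        return 0;
}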