* [Virtio-fs] One virtiofs daemon per exported dir requirement
@ 2020-02-14 19:27 Vivek Goyal
  2020-02-14 19:41 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 11+ messages in thread
From: Vivek Goyal @ 2020-02-14 19:27 UTC (permalink / raw)
  To: virtio-fs-list; +Cc: Mrunal Patel, Miklos Szeredi

Hi,

Dan Walsh and Mrunal mentioned that the requirement of one virtiofsd daemon
per exported directory sounds excessive. For the container use case, they
have at least 2-3 more directories they need to export (secrets and
/etc/host), and that means 3-4 virtiofsd instances running for each Kata
container.

One option seems to be to bind mount all exports into one directory and
export that directory using one virtiofsd. I am aware of at least one
problem with that configuration, and that is the possibility of inode
number collisions if bind mounts are coming from different devices. Not
sure how many applications care, though. Sergio is looking into solving
this issue. It might take a while though.
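For illustration, the collision amounts to one inode number mapping to more
than one host device under the export root. A minimal sketch of a check for
that condition (the `find_collisions` helper is hypothetical, not part of
virtiofsd):

```python
import os
from collections import defaultdict

def find_collisions(root):
    """Report inode numbers that appear on more than one device under
    `root`.  Applications that assume (st_dev, st_ino) uniquely names a
    file can be confused when such a tree is exported through a single
    virtiofs device, which presents only one device to the guest."""
    devs_by_ino = defaultdict(set)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            st = os.lstat(os.path.join(dirpath, name))
            devs_by_ino[st.st_ino].add(st.st_dev)
    return {ino for ino, devs in devs_by_ino.items() if len(devs) > 1}
```

On a tree that lives on a single filesystem the result is empty; bind mounts
coming from different block devices are what can populate it.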

Any other thoughts?

Thanks
Vivek


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-14 19:27 [Virtio-fs] One virtiofs daemon per exported dir requirement Vivek Goyal
@ 2020-02-14 19:41 ` Dr. David Alan Gilbert
  2020-02-18 13:38   ` Stefan Hajnoczi
  0 siblings, 1 reply; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2020-02-14 19:41 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs-list, Mrunal Patel, Miklos Szeredi

* Vivek Goyal (vgoyal@redhat.com) wrote:
> Hi,
> 
> Dan Walsh and Mrunal mentioned that the requirement of one virtiofsd
> daemon per exported directory sounds excessive. For the container use
> case, they have at least 2-3 more directories they need to export
> (secrets and /etc/host), and that means 3-4 virtiofsd instances running
> for each Kata container.
> 
> One option seems to be to bind mount all exports into one directory and
> export that directory using one virtiofsd. I am aware of at least one
> problem with that configuration, and that is the possibility of inode
> number collisions if bind mounts are coming from different devices. Not
> sure how many applications care, though. Sergio is looking into solving
> this issue. It might take a while though.

I thought the bind mount setup was the normal setup seen under both Kata
and k8s?

Dave

> Any other thoughts?
> 
> Thanks
> Vivek
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-14 19:41 ` Dr. David Alan Gilbert
@ 2020-02-18 13:38   ` Stefan Hajnoczi
  2020-02-18 18:28     ` Daniel Walsh
  2020-02-18 18:29     ` Daniel Walsh
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2020-02-18 13:38 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: virtio-fs-list, Mrunal Patel, Vivek Goyal, Miklos Szeredi


On Fri, Feb 14, 2020 at 07:41:30PM +0000, Dr. David Alan Gilbert wrote:
> [..]
> 
> I thought the bind mount setup was the normal setup seen under both Kata
> and k8s?

Kata Containers works as follows:

kata-runtime manages a bind mount directory for each sandbox VM (k8s
pod) in /run/kata-containers/shared/sandboxes/$VM_ID.

That directory contains the bind-mounted rootfs as well as resolv.conf
and other per-container files.

When volumes (podman run --volume) are present, they are also
bind-mounted alongside the rootfs.

So kata-runtime ends up with something like this:

  /run/kata-containers/shared/sandboxes/
  ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385/
      ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385/
          ... rootfs/
      ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-04b134d40c6255cf-hostname
      ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-62cff51b641310e5-resolv.conf
      ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-b8dedcdf0c623c40-hosts
      ... 61c192ae0e7154b6c8ffce6b13c4c5108d6dfe419a508f99ed381d9310268385-d181eeeb4171c3c5-myvolume/

Only one virtio-fs device is used per sandbox VM.

Stefan



* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-18 13:38   ` Stefan Hajnoczi
@ 2020-02-18 18:28     ` Daniel Walsh
  2020-02-18 18:29     ` Daniel Walsh
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Walsh @ 2020-02-18 18:28 UTC (permalink / raw)
  To: virtio-fs



On 2/18/20 8:38 AM, Stefan Hajnoczi wrote:
> [..]
>
> Only one virtio-fs device is used per sandbox VM.

Great, other than the potential for duplicate inodes, that is perfect.




* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-18 13:38   ` Stefan Hajnoczi
  2020-02-18 18:28     ` Daniel Walsh
@ 2020-02-18 18:29     ` Daniel Walsh
  2020-02-18 18:32       ` Daniel Walsh
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Walsh @ 2020-02-18 18:29 UTC (permalink / raw)
  To: virtio-fs



On 2/18/20 8:38 AM, Stefan Hajnoczi wrote:
> [..]
>
> Only one virtio-fs device is used per sandbox VM.

Also, what happens if some of the volumes are mounted read-only? What
kind of error does the container process get when it attempts to write
to the volume?




* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-18 18:29     ` Daniel Walsh
@ 2020-02-18 18:32       ` Daniel Walsh
  2020-02-18 18:58         ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Walsh @ 2020-02-18 18:32 UTC (permalink / raw)
  To: virtio-fs



On 2/18/20 1:29 PM, Daniel Walsh wrote:
> [..]
>
> Also what happens if some of the volumes are mounted read-only?
> What kind of error does the container process get when it attempts to
> write to the volume?

Also, we need to think about the volume being mounted as noexec, nodev, or
nosuid. Does the kernel inside of the container handle this correctly?




* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-18 18:32       ` Daniel Walsh
@ 2020-02-18 18:58         ` Dr. David Alan Gilbert
  2020-02-18 21:39           ` Daniel Walsh
  2020-02-19 15:24           ` [Virtio-fs] One virtiofs daemon per exported dir requirement Stefan Hajnoczi
  0 siblings, 2 replies; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2020-02-18 18:58 UTC (permalink / raw)
  To: Daniel Walsh; +Cc: virtio-fs

* Daniel Walsh (dwalsh@redhat.com) wrote:
> [..]
> >
> > Also what happens if some of the volumes are mounted read-only?
> > What kind of error does the container process get when it attempts to
> > write to the volume?
> 
> Also, we need to think about the volume being mounted as noexec, nodev, or
> nosuid. Does the kernel inside of the container handle this correctly?

I'd need to check, but I *think* you'll get the error propagated
directly from the errno that the daemon sees; i.e. you'll probably
get the read-only-filesystem error when trying to write a file on the
mount that's ro.

I'm expecting it to behave like FUSE, since it's mostly the transport
level that's changed.
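If so, the guest application would just see an ordinary errno. A small
sketch of what that looks like from the guest side (the `attempt_write`
helper is mine, for illustration only):

```python
import errno
import os

def attempt_write(path, data):
    """Try to write `data` to `path` and report how a failure surfaces.
    If virtiofsd forwards the host errno through FUSE, a write through a
    read-only host mount should show up in the guest as plain EROFS."""
    try:
        with open(path, "w") as f:
            f.write(data)
        return "ok"
    except OSError as e:
        # e.errno would be errno.EROFS for a read-only filesystem
        return os.strerror(e.errno)
```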

Dave




> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-18 18:58         ` Dr. David Alan Gilbert
@ 2020-02-18 21:39           ` Daniel Walsh
  2020-02-19 14:06             ` [Virtio-fs] Effect of nodev, noexec, nosuid mount options (Was: Re: One virtiofs daemon per exported dir requirement) Vivek Goyal
  2020-02-19 15:24           ` [Virtio-fs] One virtiofs daemon per exported dir requirement Stefan Hajnoczi
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Walsh @ 2020-02-18 21:39 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: virtio-fs

On 2/18/20 1:58 PM, Dr. David Alan Gilbert wrote:
> [..]
> >> Also need to think about the volume being mounted as noexec, nodev,
> >> nosuid. Does the kernel inside of the container handle this correctly?
> I'd need to check, but I *think* you'll get the error propagated
> directly from the errno that the daemon sees; i.e. you'll probably
> get the ro-fs error when trying to write the file on the mount that's
> ro.
>
> I'm expecting it to behave like FUSE, since it's mostly the transport
> level that's changed.
>
> Dave
>
What about noexec? nodev? nosuid?




* [Virtio-fs] Effect of nodev, noexec, nosuid mount options (Was: Re: One virtiofs daemon per exported dir requirement)
  2020-02-18 21:39           ` Daniel Walsh
@ 2020-02-19 14:06             ` Vivek Goyal
  2020-02-21 15:38               ` Daniel Walsh
  0 siblings, 1 reply; 11+ messages in thread
From: Vivek Goyal @ 2020-02-19 14:06 UTC (permalink / raw)
  To: Daniel Walsh; +Cc: virtio-fs

On Tue, Feb 18, 2020 at 04:39:07PM -0500, Daniel Walsh wrote:

[..]
> >
> What about noexec? nodev? nosuid?

These flags take effect on the host (but not inside the guest). They
protect the host from the guest (in case a guest process drops some file
in the shared dir, somehow escapes from the guest, and takes over the
host system).

nodev makes sure that even if the guest drops a device file in the shared
dir, it can't be opened when running on the host directly. Inside the
guest, opening it will fail if the device is not present in the guest to
begin with; if a matching device is available in the guest, a guest
process should be able to open it.

nosuid also takes effect on the host. So if a guest process drops a setuid
root binary and tries to execute it on the host, setuid will not take
effect. But running the same setuid binary inside the guest will continue
to work.

noexec also takes effect on the host. If the guest drops an executable in
the shared directory and some process on the host tries to execute it, it
will fail. But if the guest tries to execute that file inside the guest,
it works.

IOW, all these flags work on the host but have no effect inside the
guest, as of now.
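One way to observe this asymmetry is to compare the statvfs flags on the
host export and on the mount inside the guest. A rough sketch (the
`mount_restrictions` helper name is mine):

```python
import os

# statvfs() flag bits; ST_NODEV/ST_NOEXEC are GNU/Linux extensions,
# hence the getattr guards for portability.
_FLAGS = {
    "ro": os.ST_RDONLY,
    "nosuid": os.ST_NOSUID,
    "nodev": getattr(os, "ST_NODEV", 0),
    "noexec": getattr(os, "ST_NOEXEC", 0),
}

def mount_restrictions(path):
    """Return which of ro/nosuid/nodev/noexec the kernel enforces for
    the filesystem backing `path`.  Run on the host export and on the
    virtiofs mount in the guest, the results should differ: the host
    mount reports the flags, the guest-side mount does not."""
    f_flag = os.statvfs(path).f_flag
    return {name for name, bit in _FLAGS.items() if bit and f_flag & bit}
```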

Thanks
Vivek



* Re: [Virtio-fs] One virtiofs daemon per exported dir requirement
  2020-02-18 18:58         ` Dr. David Alan Gilbert
  2020-02-18 21:39           ` Daniel Walsh
@ 2020-02-19 15:24           ` Stefan Hajnoczi
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2020-02-19 15:24 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: virtio-fs


On Tue, Feb 18, 2020 at 06:58:31PM +0000, Dr. David Alan Gilbert wrote:
> [..]
> > > Also what happens if some of the volumes are mounted read-only?
> > > What kind of error does the container process get when it attempts to
> > > write to the volume?
> > 
> > Also need to think about the volume being mounted as noexec, nodev,
> > nosuid,  Does the kernel inside of the container handle this correctly?
> 
> I'd need to check, but I *think* you'll get the error propagated
> directly from the errno that the daemon sees; i.e. you'll probably
> get the ro-fs error when trying to write the file on the mount that's
> ro.
> 
> I'm expecting it to behave like FUSE, since it's mostly the transport
> level that's changed.

There are two levels here: 1) kata-agent sets up per-container bind
mounts and 2) virtiofsd performs file system operations on behalf of the
guest.

kata-agent can apply the 'ro' mount option to a per-container bind
mount, so even if the kataShared virtio-fs mount as a whole is
read/write the container will have read-only access.

What I hope happens (but I haven't checked) is that kata-agent sets
mount options on per-container bind mounts.

If a file system operation does make its way through to the host, then
virtiofsd should fail because the bind mount on the host should also be
'ro'.

But does anyone want to check? :-)

Stefan



* Re: [Virtio-fs] Effect of nodev, noexec, nosuid mount options (Was: Re: One virtiofs daemon per exported dir requirement)
  2020-02-19 14:06             ` [Virtio-fs] Effect of nodev, noexec, nosuid mount options (Was: Re: One virtiofs daemon per exported dir requirement) Vivek Goyal
@ 2020-02-21 15:38               ` Daniel Walsh
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Walsh @ 2020-02-21 15:38 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs

On 2/19/20 9:06 AM, Vivek Goyal wrote:
> On Tue, Feb 18, 2020 at 04:39:07PM -0500, Daniel Walsh wrote:
>
> [..]
>> What about noexec? nodev? nosuid?
> These flags take effect on the host (but not inside the guest). They
> protect the host from the guest (in case a guest process drops some file
> in the shared dir, somehow escapes from the guest, and takes over the
> host system).
>
> nodev makes sure that even if the guest drops a device file in the shared
> dir, it can't be opened when running on the host directly. Inside the
> guest, opening it will fail if the device is not present in the guest to
> begin with; if a matching device is available in the guest, a guest
> process should be able to open it.

As we discussed in my office, this is unexpected behavior from the
user's point of view.

I actually believe Kata might be able to take care of some of these
inside of the VM, but in the long run we really need to think about
propagating this information to the kernel inside of the VM, so it will
act properly.

> nosuid also takes effect on the host. So if a guest process drops a
> setuid root binary and tries to execute it on the host, setuid will not
> take effect. But running the same setuid binary inside the guest will
> continue to work.
Which as I pointed out above is a problem.
> noexec also takes effect on the host. If the guest drops an executable
> in the shared directory and some process on the host tries to execute
> it, it will fail. But if the guest tries to execute that file inside the
> guest, it works.
Same issue.
> IOW, all these flags work on the host but have no effect inside the
> guest, as of now.

Users will not view virtiofs as a network protocol, so at the least we
need to document these shortcomings in Kata when using virtiofs.

> Thanks
> Vivek




end of thread, other threads:[~2020-02-21 15:38 UTC | newest]

Thread overview: 11+ messages
2020-02-14 19:27 [Virtio-fs] One virtiofs daemon per exported dir requirement Vivek Goyal
2020-02-14 19:41 ` Dr. David Alan Gilbert
2020-02-18 13:38   ` Stefan Hajnoczi
2020-02-18 18:28     ` Daniel Walsh
2020-02-18 18:29     ` Daniel Walsh
2020-02-18 18:32       ` Daniel Walsh
2020-02-18 18:58         ` Dr. David Alan Gilbert
2020-02-18 21:39           ` Daniel Walsh
2020-02-19 14:06             ` [Virtio-fs] Effect of nodev, noexec, nosuid mount options (Was: Re: One virtiofs daemon per exported dir requirement) Vivek Goyal
2020-02-21 15:38               ` Daniel Walsh
2020-02-19 15:24           ` [Virtio-fs] One virtiofs daemon per exported dir requirement Stefan Hajnoczi
