* Re: [fuse-devel] Reconnect to FUSE session
From: Miklos Szeredi @ 2021-12-14 14:04 UTC
To: Robert Vasek
Cc: fuse-devel, Hao Peng, linux-fsdevel
On Tue, 14 Dec 2021 at 13:58, Robert Vasek <rvasek01@gmail.com> wrote:
>
> Hello fuse-devel,
>
> I'd like to ask about the feasibility of having a reconnect feature added into the FUSE kernel module.
>
> The idea is that when a FUSE driver disconnects (process exited due to a bug, signal, etc.), all pending and future ops for that session would wait for that driver to appear again, and then continue as normal. Waiting would be on a timer, with ENOTCONN returned in case it times out. Obviously, "continue as normal" isn't possible for all FUSE drivers, as it depends on what they do and how they implement things -- they would have to opt-in for this feature.
>
> Use cases cover basically anything where the lifecycle of a FUSE driver is managed by some external component (e.g. systemd, container orchestrators). This is especially true in containerized environments: FUSE drivers that run in containers and provide volume mounts may get killed / rescheduled by the orchestrator, or they may crash due to bugs, memory pressure, ..., quite possibly leading to data corruption and severed mounts. Having the ability to recover from such situations would greatly improve the reliability of these systems.
>
> I haven't looked at how this would be implemented yet though. I'm just wondering if this makes sense at all and if you folks would be interested in such a feature?
A kernel patch[1] as well as example userspace code[2] has already
been proposed.
[1] https://lore.kernel.org/linux-fsdevel/CAPm50a+j8UL9g3UwpRsye5e+a=M0Hy7Tf1FdfwOrUUBWMyosNg@mail.gmail.com/
[2] https://lore.kernel.org/linux-fsdevel/CAPm50aLuK8Smy4NzdytUPmGM1vpzokKJdRuwxawUDA4jnJg=Fg@mail.gmail.com/
The example recovery is not very practical, but I can see how it would
be possible to extend to a read-only fs.
Is this what you had in mind?
Thanks,
Miklos
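
The behaviour proposed in the original message (operations block while the driver is gone, resume once it reconnects, fail with ENOTCONN if a timer expires, and only for drivers that opt in) can be modelled in userspace roughly as follows. This is only an illustrative Python sketch under stated assumptions, not kernel code and not the patch referenced in [1]: the class `ReconnectableSession`, its methods, and the `reconnect_timeout` / `allow_reconnect` parameters are hypothetical names invented for this example.

```python
import errno
import threading


class ReconnectableSession:
    """Toy userspace model of a session whose ops wait for a driver to return.

    All names here are hypothetical; nothing below is a real FUSE API.
    """

    def __init__(self, reconnect_timeout, allow_reconnect=True):
        self.reconnect_timeout = reconnect_timeout  # seconds to wait for a driver
        self.allow_reconnect = allow_reconnect      # models the opt-in from the proposal
        self._cond = threading.Condition()
        self._handler = None                        # callable serving ops, or None

    def connect(self, handler):
        """A (new) driver attaches; wake every operation that is waiting."""
        with self._cond:
            self._handler = handler
            self._cond.notify_all()

    def disconnect(self):
        """The driver went away (crash, signal, reschedule)."""
        with self._cond:
            self._handler = None

    def submit(self, op):
        """Run op on the driver, waiting for a reconnect if none is attached."""
        with self._cond:
            if self._handler is None:
                if not self.allow_reconnect:
                    # Driver did not opt in: fail immediately.
                    raise OSError(errno.ENOTCONN, "driver is not connected")
                # Block until a driver is attached or the timer expires.
                reconnected = self._cond.wait_for(
                    lambda: self._handler is not None,
                    timeout=self.reconnect_timeout,
                )
                if not reconnected:
                    raise OSError(errno.ENOTCONN, "timed out waiting for driver")
            handler = self._handler
        # Call the driver outside the lock so other ops are not serialized.
        return handler(op)
```

In this model, a caller of `submit()` made while no driver is attached simply sleeps until another thread calls `connect()`, or gets `OSError` with `errno.ENOTCONN` after `reconnect_timeout` seconds. The sketch deliberately ignores the hard part raised in this thread: whether the restarted driver can actually "continue as normal" depends on how much state it can rebuild.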
* Re: [fuse-devel] Reconnect to FUSE session
From: Andreas Gnau @ 2021-12-16 12:59 UTC
To: Miklos Szeredi, Robert Vasek
Cc: fuse-devel, linux-fsdevel, Hao Peng, swami, laxmanv, dusseau, remzi
On 14/12/2021 15:04, Miklos Szeredi wrote:
> On Tue, 14 Dec 2021 at 13:58, Robert Vasek <rvasek01@gmail.com> wrote:
>>
>> Hello fuse-devel,
>>
>> I'd like to ask about the feasibility of having a reconnect feature added into the FUSE kernel module.
>>
>> The idea is that when a FUSE driver disconnects (process exited due to a bug, signal, etc.), all pending and future ops for that session would wait for that driver to appear again, and then continue as normal. Waiting would be on a timer, with ENOTCONN returned in case it times out. Obviously, "continue as normal" isn't possible for all FUSE drivers, as it depends on what they do and how they implement things -- they would have to opt-in for this feature.
>
> A kernel patch[1] as well as example userspace code[2] has already
> been proposed.
>
> [1] https://lore.kernel.org/linux-fsdevel/CAPm50a+j8UL9g3UwpRsye5e+a=M0Hy7Tf1FdfwOrUUBWMyosNg@mail.gmail.com/
>
> [2] https://lore.kernel.org/linux-fsdevel/CAPm50aLuK8Smy4NzdytUPmGM1vpzokKJdRuwxawUDA4jnJg=Fg@mail.gmail.com/
>
> The example recovery is not very practical, but I can see how it would
> be possible to extend to a read-only fs.
>
There has also been some related work in the paper
"Refuse to Crash with Re-FUSE"
https://research.cs.wisc.edu/wind/Publications/refuse-eurosys11.pdf
https://eurosys2011.cs.uni-salzburg.at/pdf/eurosys2011-sundararaman.pdf
The paper gives some insight into the challenges associated with
restarting and it seems like it worked better for them than I would have
thought. Not sure if any source-code for their work is available to
reproduce their findings, though.