kernelnewbies.kernelnewbies.org archive mirror
* pipe writes, ERESTARTSYS and SA_RESTART
       [not found] <CAPKsgQzW2uuQrL0amv+hBhVi612mJdQGgpWP33biyf7T3YMKQA@mail.gmail.com>
@ 2022-08-04 18:01 ` Viacheslav Biriukov
  2022-08-04 18:38   ` Yann Droneaud
  0 siblings, 1 reply; 3+ messages in thread
From: Viacheslav Biriukov @ 2022-08-04 18:01 UTC (permalink / raw)
  To: kernelnewbies



Hello team,

It would be great if someone could help me with a question about blocking
write calls to a pipe and the syscall restart logic.

From my experiments I can see that if the SA_RESTART flag is set, the
kernel (?) restarts the write call when the process gets a signal.
The logic lives in the pipe_write function in fs/pipe.c:
https://elixir.bootlin.com/linux/v5.19/source/fs/pipe.c#L555

But what I can't understand is how and where the kernel modifies the
arguments of the write system call and where it collects the return values
of all these restarts, so that the userspace caller ultimately sees the
correct number of written bytes.

With strace I can see all those retries, for example:

write(1, ""..., 33554431)               = 65536
write(1, ""..., 33488895)               = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, ""..., 33488895)               = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, ""..., 33488895)               = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, ""..., 33488895)               = 33488895

Here there were 4 restarts (I sent 4 signals): 3 of them returned
ERESTARTSYS and 2 were able to write to the pipe. For the restarts, strace
also shows the correct third argument, which is decreasing.

In the end, the userspace caller sees that it was able to write
65536+33488895 bytes, which is correct and matches what pipe(7) describes.

My question is how and where it does that. I tried to dig in the kernel
source code but can't find the place where this tracking occurs.

Thank you for reading this far and for your willingness to help.

Have a great day,
BR,
Viacheslav

-- 
Sent from Gmail Mobile


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


* Re: pipe writes, ERESTARTSYS and SA_RESTART
  2022-08-04 18:01 ` pipe writes, ERESTARTSYS and SA_RESTART Viacheslav Biriukov
@ 2022-08-04 18:38   ` Yann Droneaud
  2022-08-04 20:30     ` Viacheslav Biriukov
  0 siblings, 1 reply; 3+ messages in thread
From: Yann Droneaud @ 2022-08-04 18:38 UTC (permalink / raw)
  To: Viacheslav Biriukov, kernelnewbies

Hi,

On 04/08/2022 at 20:01, Viacheslav Biriukov wrote:
>
> But what I can't understand is how and where the kernel modifies the
> arguments of the write system call and where it collects the return
> values of all these restarts, so that the userspace caller ultimately
> sees the correct number of written bytes.
>
> With strace I can see all those retries, for example:
>
> write(1, ""..., 33554431)               = 65536
> write(1, ""..., 33488895)               = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> write(1, ""..., 33488895)               = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> write(1, ""..., 33488895)               = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> write(1, ""..., 33488895)               = 33488895
>
> Here there were 4 restarts (I sent 4 signals): 3 of them returned
> ERESTARTSYS and 2 were able to write to the pipe. For the restarts,
> strace also shows the correct third argument, which is decreasing.
>
> In the end, the userspace caller sees that it was able to write
> 65536+33488895 bytes, which is correct and matches what pipe(7) describes.
>
> My question is how and where it does that. I tried to dig in the
> kernel source code but can't find the place where this tracking occurs.
>

It doesn't. SA_RESTART is only meant to retry a syscall that would
otherwise have returned EINTR.

In that case there is no tracking to do: nothing was actually written,
so the syscall can be restarted with the same parameters.
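
As an illustration only, the rule can be modeled in a few lines of Python.
This is a toy model, not kernel code: the names pipe_write_result,
syscall_exit and RESTART are hypothetical, and the model assumes the write
stopped waiting either because it finished or because a signal arrived.
The value 512 for ERESTARTSYS is the one defined in include/linux/errno.h.

```python
import errno

# Kernel-internal return code; userspace is never supposed to see it.
ERESTARTSYS = 512

# Hypothetical marker for "re-run the syscall with unchanged arguments".
RESTART = "restart with the same arguments"


def pipe_write_result(bytes_written, signal_pending):
    """Model of what a blocking pipe write returns when it stops waiting.

    If a signal is pending and nothing was copied yet, the write asks to be
    restarted. If some bytes were already copied, it reports that count
    (a partial write) and no restart is requested.
    """
    if signal_pending and bytes_written == 0:
        return -ERESTARTSYS
    return bytes_written


def syscall_exit(ret, sa_restart):
    """Model of what happens to that return value on the way to userspace."""
    if ret == -ERESTARTSYS:
        # Only this case is affected by SA_RESTART.
        return RESTART if sa_restart else -errno.EINTR
    return ret


if __name__ == "__main__":
    # Signal arrives after 65536 bytes were copied: plain partial write.
    print(syscall_exit(pipe_write_result(65536, True), sa_restart=True))
    # Signal arrives before any byte was copied: restart or EINTR.
    print(syscall_exit(pipe_write_result(0, True), sa_restart=True))
    print(syscall_exit(pipe_write_result(0, True), sa_restart=False))
```

In this model the arguments never need adjusting: a restart is only
requested when zero bytes were written, so there is nothing to subtract.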


Regards.

-- 

Yann Droneaud

OPTEYA




* Re: pipe writes, ERESTARTSYS and SA_RESTART
  2022-08-04 18:38   ` Yann Droneaud
@ 2022-08-04 20:30     ` Viacheslav Biriukov
  0 siblings, 0 replies; 3+ messages in thread
From: Viacheslav Biriukov @ 2022-08-04 20:30 UTC (permalink / raw)
  To: Yann Droneaud; +Cc: kernelnewbies



Hi Yann,

Thank you for your time and quick response.

You are right. I found what misled me: a buffered writer on stdout was
reissuing the write calls, and I mistook those for kernel restarts.
The kernel does what it should: if some of the data was written, the call
returns a partial write; otherwise it is restarted if SA_RESTART is set.
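
For illustration, here is a minimal sketch of the kind of loop a userspace
buffered writer runs, assuming Python's os.write as a thin wrapper over
write(2). The names write_all and demo are hypothetical and are not taken
from the writer used in the experiment. Each iteration asks only for the
bytes that are still unwritten, which is why the third argument in a trace
shrinks after a partial write (33554431 - 65536 = 33488895 above).

```python
import os
import threading


def write_all(fd, data):
    """Write all of `data` to `fd`, continuing after partial writes.

    Returns the list of byte counts returned by the individual
    os.write() calls, in order.
    """
    view = memoryview(data)
    counts = []
    while len(view) > 0:
        n = os.write(fd, view)  # may write fewer bytes than requested
        counts.append(n)
        view = view[n:]         # next call covers only the remainder
    return counts


def demo(total=1 << 20):
    """Push `total` zero bytes through a pipe drained by a reader thread."""
    r, w = os.pipe()
    received = []

    def drain():
        # Read until the write end is closed (os.read returns b"").
        while True:
            chunk = os.read(r, 65536)
            if not chunk:
                break
            received.append(len(chunk))

    t = threading.Thread(target=drain)
    t.start()
    try:
        counts = write_all(w, bytes(total))
    finally:
        os.close(w)  # lets the reader see end-of-file
    t.join()
    os.close(r)
    return counts, sum(received)


if __name__ == "__main__":
    counts, got = demo()
    print("write() return values:", counts)
    print("bytes received by reader:", got)
```

The sum of the recorded return values is what the caller of such a
writer ends up seeing; the bookkeeping lives entirely in userspace.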

Thank you for confirming that nothing has changed in the kernel regarding
syscall restarts.

Have a good day!



-- 
Viacheslav Biriukov
BR
