All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Sascha Hauer <s.hauer@pengutronix.de>
Cc: netdev@vger.kernel.org, kernel@pengutronix.de,
	linux-kernel@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org
Subject: Re: [PATCH] net: Do not break out of sk_stream_wait_memory() with TIF_NOTIFY_SIGNAL
Date: Mon, 18 Mar 2024 13:19:19 +0000	[thread overview]
Message-ID: <0a556650-9627-48ee-9707-05d7cab33f0f@gmail.com> (raw)
In-Reply-To: <ZfgvNjWP8OYMIa3Y@pengutronix.de>

On 3/18/24 12:10, Sascha Hauer wrote:
> On Fri, Mar 15, 2024 at 05:02:05PM +0000, Pavel Begunkov wrote:
>> On 3/15/24 10:01, Sascha Hauer wrote:
>>> It can happen that a socket sends the remaining data at close() time.
>>> With io_uring and KTLS it can happen that sk_stream_wait_memory() bails
>>> out with -512 (-ERESTARTSYS) because TIF_NOTIFY_SIGNAL is set for the
>>> current task. This flag has been set in io_req_normal_work_add() by
>>> calling task_work_add().
>>
>> The entire idea of task_work is to interrupt syscalls and let io_uring
>> do its job, otherwise it wouldn't free resources it might be holding,
>> and even potentially forever block the syscall.
>>
>> I'm not that sure about connect / close (are they not restartable?),
>> but it doesn't seem to be a good idea for sk_stream_wait_memory(),
>> which is the normal TCP blocking send path. I'm thinking of some kinds
>> of cases with a local TCP socket pair, the tx queue is full as well
>> and the rx queue of the other end, and io_uring has to run to receive
>> the data.

There is another case, let's say the IO is done via io-wq
(io_uring's worker thread pool) and hits the waiting. Now the
request can't get cancelled, which is done by interrupting the
task with TIF_NOTIFY_SIGNAL. User requested request cancellations
is one thing, but we'd need to check if io_uring can ever be closed
in this case.


>> If interruptions are not welcome you can use different io_uring flags,
>> see IORING_SETUP_COOP_TASKRUN and/or IORING_SETUP_DEFER_TASKRUN.
> 
> I tried with different combinations of these flags. For example
> IORING_SETUP_TASKRUN_FLAG | IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN
> makes the issue less likely, but nevertheless it still happens.
> 
> However, reading the documentation of these flags, they shall provide
> hints to the kernel for optimizations, but it should work without these
> flags, right?

That's true, and I guess there are other cases as well, like
io-wq and perhaps even a stray fput.


>> Maybe I'm missing something, why not restart your syscall?
> 
> The problem comes with TLS. Normally with synchronous encryption all
> data on a socket is written during write(). When asynchronous
> encryption comes into play, then not all data is written during write(),
> but instead the remaining data is written at close() time.

Was it considered to do the final cleanup in workqueue
and only then finalising the release?


> Here is my call stack when things go awry:
> 
> [  325.560946] tls_push_sg: tcp_sendmsg_locked returned -512
> [  325.566371] CPU: 1 PID: 305 Comm: webserver_libur Not tainted 6.8.0-rc6-00022-g932acd9c444b-dirty #248
> [  325.575684] Hardware name: NXP i.MX8MPlus EVK board (DT)
> [  325.580997] Call trace:
> [  325.583444]  dump_backtrace+0x90/0xe8
> [  325.587122]  show_stack+0x18/0x24
> [  325.590444]  dump_stack_lvl+0x48/0x60
> [  325.594114]  dump_stack+0x18/0x24
> [  325.597432]  tls_push_sg+0xfc/0x22c
> [  325.600930]  tls_tx_records+0x114/0x1cc
> [  325.604772]  tls_sw_release_resources_tx+0x3c/0x140
> [  325.609658]  tls_sk_proto_close+0x2b0/0x3ac
> [  325.613846]  inet_release+0x4c/0x9c
> [  325.617341]  __sock_release+0x40/0xb4
> [  325.621007]  sock_close+0x18/0x28
> [  325.624328]  __fput+0x70/0x2bc
> [  325.627386]  ____fput+0x10/0x1c
> [  325.630531]  task_work_run+0x74/0xcc
> [  325.634113]  do_notify_resume+0x22c/0x1310
> [  325.638220]  el0_svc+0xa4/0xb4
> [  325.641279]  el0t_64_sync_handler+0x120/0x12c
> [  325.645643]  el0t_64_sync+0x190/0x194
> 
> As said, TLS is sending remaining data at close() time in tls_push_sg().
> Here sk_stream_wait_memory() gets interrupted and returns -ERESTARTSYS.
> There's no way to restart this operation, the socket is about to be
> closed and won't accept data anymore.

-- 
Pavel Begunkov

  reply	other threads:[~2024-03-18 13:20 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-15 10:01 [PATCH] net: Do not break out of sk_stream_wait_memory() with TIF_NOTIFY_SIGNAL Sascha Hauer
2024-03-15 16:43 ` Jens Axboe
2024-03-15 17:02 ` Pavel Begunkov
2024-03-18 12:10   ` Sascha Hauer
2024-03-18 13:19     ` Pavel Begunkov [this message]
2024-03-19 10:50       ` Sascha Hauer
2024-03-19 13:55         ` Pavel Begunkov
2024-03-19 15:08           ` Sascha Hauer
2024-03-18 12:03 ` Sascha Hauer
2024-03-19 12:30   ` Paolo Abeni
  -- strict thread matches above, loose matches on Subject: below --
2023-10-23 12:13 Sascha Hauer
2023-10-24 13:56 ` Paolo Abeni
2023-10-26  7:03   ` Sascha Hauer
2023-10-26  8:49     ` Paolo Abeni
2023-10-27 12:04       ` Sascha Hauer
2023-11-17 10:43         ` Marc Kleine-Budde
2023-12-19 11:00           ` Steffen Trumtrar
2023-12-19 13:13             ` Paolo Abeni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0a556650-9627-48ee-9707-05d7cab33f0f@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=kernel@pengutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=s.hauer@pengutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.