IO-Uring Archive on lore.kernel.org
* [GIT PULL] io_uring updates for 5.10-rc1
@ 2020-10-12 13:46 Jens Axboe
  2020-10-13 19:46 ` Linus Torvalds
  2020-10-13 19:47 ` pr-tracker-bot
  0 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-12 13:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring, linux-kernel

Hi Linus,

Here are the io_uring updates for 5.10. This pull request contains:

- Add blkcg accounting for io-wq offload (Dennis)

- A use-after-free fix for io-wq (Hillf)

- Cancelation fixes and improvements

- Use proper files_struct references for offload

- Cleanup of io_uring_get_socket() since that can now go into our own
  header

- SQPOLL fixes and cleanups, and support for sharing the thread

- Improvement to how page accounting is done for registered buffers and
  huge pages, accounting the real pinned state

- Series cleaning up the xarray code (Willy)

- Various cleanups, refactoring, and improvements (Pavel)

- Use raw spinlock for io-wq (Sebastian)

- Add support for ring restrictions (Stefano)

Please pull!


The following changes since commit c8d317aa1887b40b188ec3aaa6e9e524333caed1:

  io_uring: fix async buffered reads when readahead is disabled (2020-09-29 07:54:00 -0600)

are available in the Git repository at:

  git://git.kernel.dk/linux-block.git tags/io_uring-5.10-2020-10-12

for you to fetch changes up to b2e9685283127f30e7f2b466af0046ff9bd27a86:

  io_uring: keep a pointer ref_node in file_data (2020-10-10 12:49:25 -0600)

----------------------------------------------------------------
io_uring-5.10-2020-10-12

----------------------------------------------------------------
Dennis Zhou (1):
      io_uring: add blkcg accounting to offloaded operations

Hillf Danton (1):
      io-wq: fix use-after-free in io_wq_worker_running

Jens Axboe (29):
      Merge branch 'io_uring-5.9' into for-5.10/io_uring
      io_uring: allow timeout/poll/files killing to take task into account
      io_uring: move dropping of files into separate helper
      io_uring: stash ctx task reference for SQPOLL
      io_uring: unconditionally grab req->task
      io_uring: return cancelation status from poll/timeout/files handlers
      io_uring: enable task/files specific overflow flushing
      io_uring: don't rely on weak ->files references
      io_uring: reference ->nsproxy for file table commands
      io_uring: move io_uring_get_socket() into io_uring.h
      io_uring: io_sq_thread() doesn't need to flush signals
      fs: align IOCB_* flags with RWF_* flags
      io_uring: use private ctx wait queue entries for SQPOLL
      io_uring: move SQPOLL post-wakeup ring need wakeup flag into wake handler
      io_uring: split work handling part of SQPOLL into helper
      io_uring: split SQPOLL data into separate structure
      io_uring: base SQPOLL handling off io_sq_data
      io_uring: enable IORING_SETUP_ATTACH_WQ to attach to SQPOLL thread too
      io_uring: mark io_uring_fops/io_op_defs as __read_mostly
      io_uring: provide IORING_ENTER_SQ_WAIT for SQPOLL SQ ring waits
      io_uring: get rid of req->io/io_async_ctx union
      io_uring: cap SQ submit size for SQPOLL with multiple rings
      io_uring: improve registered buffer accounting for huge pages
      io_uring: process task work in io_uring_register()
      io-wq: kill unused IO_WORKER_F_EXITING
      io_uring: kill callback_head argument for io_req_task_work_add()
      io_uring: batch account ->req_issue and task struct references
      io_uring: no need to call xa_destroy() on empty xarray
      io_uring: fix break condition for __io_uring_register() waiting

Joseph Qi (1):
      io_uring: show sqthread pid and cpu in fdinfo

Matthew Wilcox (Oracle) (3):
      io_uring: Fix use of XArray in __io_uring_files_cancel
      io_uring: Fix XArray usage in io_uring_add_task_file
      io_uring: Convert advanced XArray uses to the normal API

Pavel Begunkov (23):
      io_uring: simplify io_rw_prep_async()
      io_uring: refactor io_req_map_rw()
      io_uring: fix overlapped memcpy in io_req_map_rw()
      io_uring: kill extra user_bufs check
      io_uring: simplify io_alloc_req()
      io_uring: io_kiocb_ppos() style change
      io_uring: remove F_NEED_CLEANUP check in *prep()
      io_uring: set/clear IOCB_NOWAIT into io_read/write
      io_uring: remove nonblock arg from io_{rw}_prep()
      io_uring: decouple issuing and req preparation
      io_uring: move req preps out of io_issue_sqe()
      io_uring: don't io_prep_async_work() linked reqs
      io_uring: clean up ->files grabbing
      io_uring: kill extra check in fixed io_file_get()
      io_uring: simplify io_file_get()
      io_uring: improve submit_state.ios_left accounting
      io_uring: use a separate struct for timeout_remove
      io_uring: remove timeout.list after hrtimer cancel
      io_uring: clean leftovers after splitting issue
      io_uring: don't delay io_init_req() error check
      io_uring: clean file_data access in files_register
      io_uring: refactor *files_register()'s error paths
      io_uring: keep a pointer ref_node in file_data

Sebastian Andrzej Siewior (1):
      io_wq: Make io_wqe::lock a raw_spinlock_t

Stefano Garzarella (3):
      io_uring: use an enumeration for io_uring_register(2) opcodes
      io_uring: add IOURING_REGISTER_RESTRICTIONS opcode
      io_uring: allow disabling rings during the creation

Zheng Bin (1):
      io_uring: remove unneeded semicolon

 fs/exec.c                     |    6 +
 fs/file.c                     |    2 +
 fs/io-wq.c                    |  200 +++---
 fs/io-wq.h                    |    4 +
 fs/io_uring.c                 | 2181 ++++++++++++++++++++++++++++++++++++--------------------
 include/linux/fs.h            |   46 +-
 include/linux/io_uring.h      |   58 ++
 include/linux/sched.h         |    5 +
 include/uapi/linux/io_uring.h |   61 +-
 init/init_task.c              |    3 +
 kernel/fork.c                 |    6 +
 net/unix/scm.c                |    1 +
 12 files changed, 1662 insertions(+), 911 deletions(-)
 create mode 100644 include/linux/io_uring.h

-- 
Jens Axboe



* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-12 13:46 [GIT PULL] io_uring updates for 5.10-rc1 Jens Axboe
@ 2020-10-13 19:46 ` Linus Torvalds
  2020-10-13 19:49   ` Jens Axboe
  2020-10-14 17:43   ` Nick Desaulniers
  2020-10-13 19:47 ` pr-tracker-bot
  1 sibling, 2 replies; 9+ messages in thread
From: Linus Torvalds @ 2020-10-13 19:46 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-kernel

On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Here are the io_uring updates for 5.10.

Very strange. My clang build gives a warning I've never seen before:

   /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section attributes for .data..read_mostly

and looking at what clang generates for the *.s file, it seems to be
the "section" line in:

        .type   io_op_defs,@object      # @io_op_defs
        .section        .data..read_mostly,"a",@progbits
        .p2align        4

I think it's the combination of "const" and "__read_mostly".

I think the warning is sensible: how can a piece of data be both
"const" and "__read_mostly"? If it's "const", then it's not "mostly"
read - it had better be _always_ read.

I'm letting it go, and I've pulled this (gcc doesn't complain), but
please have a look.

                 Linus
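
Illustration (not from the original mail): a minimal user-space sketch of
the distinction at issue. It assumes an ELF target and GCC or clang, and it
uses a simplified stand-in for the kernel's __read_mostly macro. The struct
op_def type, the active_defs pointer, and the helper functions are
hypothetical names chosen for the example; they are not the real
fs/io_uring.c definitions.

```c
#include <stddef.h>

/* Assumption: simplified stand-in for the kernel's __read_mostly, which
 * places an object in the .data..read_mostly section. */
#define __read_mostly __attribute__((__section__(".data..read_mostly")))

/* Hypothetical table entry, loosely modelled on the idea of io_op_defs. */
struct op_def {
	unsigned int needs_file;
	unsigned int async_size;
};

/*
 * Form questioned in the thread: the object itself is const, yet it is
 * placed in a section meant for writable data:
 *
 *     static const struct op_def op_defs[] __read_mostly = { ... };
 *
 * Form used here: plain const, so the compiler picks a read-only section.
 */
static const struct op_def op_defs[] = {
	{ .needs_file = 1, .async_size = 64 },
	{ .needs_file = 0, .async_size = 0 },
};

/* A writable pointer to const data: the pointer may change (rarely), the
 * data it points at may not. Annotating the pointer is unproblematic. */
static const struct op_def *active_defs __read_mostly = op_defs;

size_t op_def_count(void)
{
	return sizeof(op_defs) / sizeof(op_defs[0]);
}

/* Returns the entry for opcode, or NULL when opcode is out of range. */
const struct op_def *op_def_lookup(size_t opcode)
{
	return opcode < op_def_count() ? &active_defs[opcode] : NULL;
}
```

Only the pointer carries the annotation in this sketch; the table is
always read, never written, so const alone describes it.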


* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-12 13:46 [GIT PULL] io_uring updates for 5.10-rc1 Jens Axboe
  2020-10-13 19:46 ` Linus Torvalds
@ 2020-10-13 19:47 ` pr-tracker-bot
  1 sibling, 0 replies; 9+ messages in thread
From: pr-tracker-bot @ 2020-10-13 19:47 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, io-uring, linux-kernel

The pull request you sent on Mon, 12 Oct 2020 07:46:45 -0600:

> git://git.kernel.dk/linux-block.git tags/io_uring-5.10-2020-10-12

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/6ad4bf6ea1609fb539a62f10fca87ddbd53a0315

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-13 19:46 ` Linus Torvalds
@ 2020-10-13 19:49   ` Jens Axboe
  2020-10-13 19:50     ` Linus Torvalds
                       ` (2 more replies)
  2020-10-14 17:43   ` Nick Desaulniers
  1 sibling, 3 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-13 19:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring, linux-kernel

On 10/13/20 1:46 PM, Linus Torvalds wrote:
> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Here are the io_uring updates for 5.10.
> 
> Very strange. My clang build gives a warning I've never seen before:
> 
>    /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
> attributes for .data..read_mostly
> 
> and looking at what clang generates for the *.s file, it seems to be
> the "section" line in:
> 
>         .type   io_op_defs,@object      # @io_op_defs
>         .section        .data..read_mostly,"a",@progbits
>         .p2align        4
> 
> I think it's the combination of "const" and "__read_mostly".
> 
> I think the warning is sensible: how can a piece of data be both
> "const" and "__read_mostly"? If it's "const", then it's not "mostly"
> read - it had better be _always_ read.
> 
> I'm letting it go, and I've pulled this (gcc doesn't complain), but
> please have a look.

Huh weird, I'll take a look. FWIW, the construct isn't unique across
the kernel.

What clang are you using?

-- 
Jens Axboe



* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-13 19:49   ` Jens Axboe
@ 2020-10-13 19:50     ` Linus Torvalds
  2020-10-13 20:49     ` Rasmus Villemoes
  2020-10-13 21:06     ` Arvind Sankar
  2 siblings, 0 replies; 9+ messages in thread
From: Linus Torvalds @ 2020-10-13 19:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-kernel

On Tue, Oct 13, 2020 at 12:49 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> What clang are you using?

I have a self-built clang version from their development tree, since
I've been using it for the "asm goto with outputs" testing.

But I bet this happens with just regular reasonably up-to-date clang too.

            Linus


* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-13 19:49   ` Jens Axboe
  2020-10-13 19:50     ` Linus Torvalds
@ 2020-10-13 20:49     ` Rasmus Villemoes
  2020-10-13 21:00       ` Jens Axboe
  2020-10-13 21:06     ` Arvind Sankar
  2 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2020-10-13 20:49 UTC (permalink / raw)
  To: Jens Axboe, Linus Torvalds; +Cc: io-uring, linux-kernel

On 13/10/2020 21.49, Jens Axboe wrote:
> On 10/13/20 1:46 PM, Linus Torvalds wrote:
>> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> Here are the io_uring updates for 5.10.
>>
>> Very strange. My clang build gives a warning I've never seen before:
>>
>>    /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
>> attributes for .data..read_mostly
>>
>> and looking at what clang generates for the *.s file, it seems to be
>> the "section" line in:
>>
>>         .type   io_op_defs,@object      # @io_op_defs
>>         .section        .data..read_mostly,"a",@progbits
>>         .p2align        4
>>
>> I think it's the combination of "const" and "__read_mostly".
>>
>> I think the warning is sensible: how can a piece of data be both
>> "const" and "__read_mostly"? If it's "const", then it's not "mostly"
>> read - it had better be _always_ read.
>>
>> I'm letting it go, and I've pulled this (gcc doesn't complain), but
>> please have a look.
> 
> Huh weird, I'll take a look. FWIW, the construct isn't unique across
> the kernel.

Citation needed. There's lots of "pointer to const foo" stuff declared
as __read_mostly, but I can't find any objects that are themselves both
const and __read_mostly. Other than that io_op_defs and io_uring_fops now.

But... there's something a little weird:

$ grep read_most -- fs/io_uring.s
        .section        .data..read_mostly,"a",@progbits
$ readelf --wide -S fs/io_uring.o | grep read_most
  [32] .data..read_mostly PROGBITS        0000000000000000 01b4e0 000188 00  WA  0   0 32

(this is with gcc/gas). So despite that .section directive not saying
"aw", the section got the W flag anyway. There are lots of

        .section "__tracepoints_ptrs", "a"
      .pushsection .smp_locks,"a"

in the .s file, and those sections do end up with just the A bit in the
.o file. Does gas maybe somehow special-case a section name starting
with .data?

Rasmus


* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-13 20:49     ` Rasmus Villemoes
@ 2020-10-13 21:00       ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-13 21:00 UTC (permalink / raw)
  To: Rasmus Villemoes, Linus Torvalds; +Cc: io-uring, linux-kernel

On 10/13/20 2:49 PM, Rasmus Villemoes wrote:
> On 13/10/2020 21.49, Jens Axboe wrote:
>> On 10/13/20 1:46 PM, Linus Torvalds wrote:
>>> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> Here are the io_uring updates for 5.10.
>>>
>>> Very strange. My clang build gives a warning I've never seen before:
>>>
>>>    /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
>>> attributes for .data..read_mostly
>>>
>>> and looking at what clang generates for the *.s file, it seems to be
>>> the "section" line in:
>>>
>>>         .type   io_op_defs,@object      # @io_op_defs
>>>         .section        .data..read_mostly,"a",@progbits
>>>         .p2align        4
>>>
>>> I think it's the combination of "const" and "__read_mostly".
>>>
>>> I think the warning is sensible: how can a piece of data be both
>>> "const" and "__read_mostly"? If it's "const", then it's not "mostly"
>>> read - it had better be _always_ read.
>>>
>>> I'm letting it go, and I've pulled this (gcc doesn't complain), but
>>> please have a look.
>>
>> Huh weird, I'll take a look. FWIW, the construct isn't unique across
>> the kernel.
> 
> Citation needed. There's lots of "pointer to const foo" stuff declared
> as __read_mostly, but I can't find any objects that are themselves both
> const and __read_mostly. Other than that io_op_defs and io_uring_fops now.

You are right, they are all pointers, so not the same. I'll just revert
the patch.

-- 
Jens Axboe



* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-13 19:49   ` Jens Axboe
  2020-10-13 19:50     ` Linus Torvalds
  2020-10-13 20:49     ` Rasmus Villemoes
@ 2020-10-13 21:06     ` Arvind Sankar
  2 siblings, 0 replies; 9+ messages in thread
From: Arvind Sankar @ 2020-10-13 21:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, io-uring, linux-kernel

On Tue, Oct 13, 2020 at 01:49:01PM -0600, Jens Axboe wrote:
> On 10/13/20 1:46 PM, Linus Torvalds wrote:
> > On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> Here are the io_uring updates for 5.10.
> > 
> > Very strange. My clang build gives a warning I've never seen before:
> > 
> >    /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
> > attributes for .data..read_mostly
> > 
> > and looking at what clang generates for the *.s file, it seems to be
> > the "section" line in:
> > 
> >         .type   io_op_defs,@object      # @io_op_defs
> >         .section        .data..read_mostly,"a",@progbits
> >         .p2align        4
> > 
> > I think it's the combination of "const" and "__read_mostly".
> > 
> > I think the warning is sensible: how can a piece of data be both
> > "const" and "__read_mostly"? If it's "const", then it's not "mostly"
> > read - it had better be _always_ read.
> > 
> > I'm letting it go, and I've pulled this (gcc doesn't complain), but
> > please have a look.
> 
> Huh weird, I'll take a look. FWIW, the construct isn't unique across
> the kernel.
> 
> What clang are you using?
> 
> -- 
> Jens Axboe
> 

If const and non-const __read_mostly appeared in the same file, both gcc
and clang would give errors.
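
Illustration (not from the original mail): a hedged sketch of the case
described here, assuming an ELF target and GCC or clang. The macro
SHOW_SECTION_CONFLICT and the variables tunable and fixed are hypothetical.
With the macro left undefined the file builds; defining it puts a const
and a non-const object into the same named section in one translation
unit, which is the situation expected to be rejected with a section type
conflict.

```c
/* Assumption: simplified stand-in for the kernel's __read_mostly. */
#define __read_mostly __attribute__((__section__(".data..read_mostly")))

/* Writable object in .data..read_mostly: the intended use. */
static int tunable __read_mostly = 3;

#ifdef SHOW_SECTION_CONFLICT
/* const object in the same section, in the same file as the writable one
 * above: the two disagree on whether the section is writable. */
static const int fixed __read_mostly = 7;
#else
/* Plain const: no section annotation, so nothing to conflict with. */
static const int fixed = 7;
#endif

void set_tunable(int value)
{
	tunable = value;
}

int tunable_plus_fixed(void)
{
	return tunable + fixed;
}
```

fs/io_uring.c had no writable __read_mostly object next to the const
ones, which would explain why the compilers themselves stayed quiet and
only the assembler warned.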


* Re: [GIT PULL] io_uring updates for 5.10-rc1
  2020-10-13 19:46 ` Linus Torvalds
  2020-10-13 19:49   ` Jens Axboe
@ 2020-10-14 17:43   ` Nick Desaulniers
  1 sibling, 0 replies; 9+ messages in thread
From: Nick Desaulniers @ 2020-10-14 17:43 UTC (permalink / raw)
  To: torvalds; +Cc: axboe, io-uring, linux-kernel, kernel-tooling, clang-built-linux

Sorry for not reporting it sooner.  It looks to me like a GNU `as` bug:
https://github.com/ClangBuiltLinux/linux/issues/1153#issuecomment-692265433
When I'm done with the three build breakages that popped up overnight I'll try
to report it to GNU binutils folks.

(We run an issue tracker out of
https://github.com/ClangBuiltLinux/linux/issues, if you're interested to see what
the outstanding known issues are, or recently solved ones.  We try to
aggressively track when and where patches land for the inevitable backports.
We have 118 people in our github group!)

