* [Bug 105121] New: lseek() hangs for a long time on allocated files
From: bugzilla-daemon @ 2015-09-28 12:27 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

            Bug ID: 105121
           Summary: lseek() hangs for a long time on allocated files
           Product: File System
           Version: 2.5
    Kernel Version: 4.2.1
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: iam@valdikss.org.ru
        Regression: No

Created attachment 188751
  --> https://bugzilla.kernel.org/attachment.cgi?id=188751&action=edit
lseek test program

The lseek() call hangs for a long time on fallocate-allocated files on ext4.

Steps to reproduce:
1. gcc -o prog lseek-fallocate.c
2. fallocate -l 1G test
3. cat test > /dev/null
4. ./prog

Actual result:
prog hangs for 2 minutes on my system while doing lseek().

Expected result:
lseek() returns ENXIO instantly.

lseek() is usually much faster if the file wasn't cat'ed beforehand.
Works properly on btrfs.
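
The attached test program is not reproduced in this thread. As a minimal
sketch of the kind of call it presumably makes (an assumption, based on the
report and on the bug summary being changed to lseek(SEEK_DATA)), the C helper
below times a single lseek(fd, 0, SEEK_DATA) and records errno. The names
seek_result and time_seek_data are hypothetical and do not come from the
attachment.

```c
#define _GNU_SOURCE          /* makes glibc expose SEEK_DATA */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result record for one timed lseek(SEEK_DATA) call. */
struct seek_result {
    off_t  offset;   /* value returned by lseek(), (off_t)-1 on failure */
    int    err;      /* errno captured right after a failure, else 0 */
    double seconds;  /* wall-clock time spent in the lseek() call alone */
};

/* Open path read-only and time lseek(fd, 0, SEEK_DATA). */
struct seek_result time_seek_data(const char *path)
{
    struct seek_result r = { (off_t)-1, 0, 0.0 };
    struct timespec t0, t1;

    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        r.err = errno;           /* open failed; nothing was timed */
        return r;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    r.offset = lseek(fd, 0, SEEK_DATA);
    if (r.offset == (off_t)-1)
        r.err = errno;           /* ENXIO means no data at or after offset 0 */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    r.seconds = (double)(t1.tv_sec - t0.tv_sec)
              + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    close(fd);
    return r;
}
```

Called on the "test" file from the steps above, the expected outcome per this
report is offset == -1 with err == ENXIO; the interesting number is seconds.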

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

* [Bug 105121] lseek(SEEK_DATA) hangs for a long time for sparse files in the page cache
From: bugzilla-daemon @ 2015-09-28 14:37 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu
            Summary|lseek() hangs for a long    |lseek(SEEK_DATA) hangs for
                   |time on allocated files     |a long time for sparse
                   |                            |files in the page cache


* [Bug 105121] lseek(SEEK_DATA) hangs for a long time for sparse files in the page cache
From: bugzilla-daemon @ 2015-09-28 14:47 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

--- Comment #1 from Theodore Tso <tytso@mit.edu> ---
Thanks for reporting this!

What's going on is that the "cat test > /dev/null" is instantiating 1GB of zero
pages in the page cache.

Currently, we use the extent status cache to determine whether a logical block
is mapped to a physical block.  However, the SEEK_DATA code dates back to a time
when we were not storing the status of delayed allocation blocks in the extent
status cache.  So if the extent status cache indicates that the blocks are
unwritten, the code which handles lseek(SEEK_DATA) scans all of the
pages to determine whether any of the unwritten blocks have been modified in
memory but not yet pushed out to disk.

We should be able to optimize this (and simplify the code as a bonus) by using
the EXTENT_STATUS_DELAYED flag instead of trying to scan through all of the
page structs (with all of the locking requirements this entails).

Out of curiosity, what was the use case that caused you to notice this?
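
The optimization described above can be illustrated with a toy user-space
model. This is not ext4 code: the types and names (toy_status, toy_extent,
toy_seek_data) are hypothetical, and the sketch assumes that any data not yet
written out is always reflected by a "delayed" status on its extent. Under
that assumption, finding the next data block costs one check per extent rather
than one check per cached page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the status values kept per cached extent. */
enum toy_status { TOY_WRITTEN, TOY_UNWRITTEN, TOY_DELAYED, TOY_HOLE };

/* One cached extent: a run of logical blocks sharing a single status. */
struct toy_extent {
    uint64_t        start;   /* first logical block of the run */
    uint64_t        len;     /* number of blocks in the run */
    enum toy_status status;
};

/*
 * Return the first block >= from that holds data, or -1 if there is none
 * (the case where lseek(SEEK_DATA) would fail with ENXIO).
 * Extents must be sorted by start and must not overlap.
 * Unwritten extents are skipped without looking at any pages; only the
 * written and delayed statuses count as data in this model.
 */
int64_t toy_seek_data(const struct toy_extent *ext, size_t n, uint64_t from)
{
    assert(ext != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t end = ext[i].start + ext[i].len;
        if (end <= from)
            continue;                        /* entirely before the offset */
        if (ext[i].status == TOY_WRITTEN || ext[i].status == TOY_DELAYED)
            return (int64_t)(ext[i].start > from ? ext[i].start : from);
    }
    return -1;
}
```

In this model, a 1 GiB preallocated file described by a single unwritten
extent is answered after one comparison, no matter how many of its pages are
sitting in the page cache.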


* [Bug 105121] lseek(SEEK_DATA) hangs for a long time for sparse files in the page cache
From: bugzilla-daemon @ 2015-09-28 14:52 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

--- Comment #2 from ValdikSS <iam@valdikss.org.ru> ---
>Out of curiosity, what was the use case that caused you to notice this?

A person was wondering why cat|grep works faster than 'grep file' on ordinary
files, and his test case involved creating a file with fallocate and running
cat on it before grepping. He was apparently using btrfs and saw a 2-second
difference, while for me grep hung, since grep does literally what's in the
attached source.


* [Bug 105121] lseek(SEEK_DATA) hangs for a long time for sparse files in the page cache
From: bugzilla-daemon @ 2015-09-28 15:15 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

--- Comment #3 from Theodore Tso <tytso@mit.edu> ---
This implies that grep is using lseek(SEEK_DATA) as an optimization when users
use grep on sparse files.   So I'm guessing this is a Thing, but I'm at a loss
why people are interested in running grep on a sparse file (with or without
blocks preallocated using fallocate).   Can you enlighten me as to why people
(or at least you and your colleague) find it useful to run grep on such files?

Not that it matters since this is a pretty clear optimization we should add to
ext4; I'm just curious what the use case is.

Thanks!!
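
For readers unfamiliar with the optimization mentioned above, here is a
minimal sketch of how a program can use lseek(SEEK_DATA) and lseek(SEEK_HOLE)
to visit only the data regions of a file. It is not taken from grep's source;
the function name count_data_bytes is hypothetical, and a real tool would read
each region instead of just adding up its length.

```c
#define _GNU_SOURCE          /* makes glibc expose SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Add up the sizes of all data regions of fd, skipping holes.
 * Returns the byte count, or (off_t)-1 on an unexpected error.
 */
off_t count_data_bytes(int fd)
{
    off_t total = 0;
    off_t pos = 0;

    assert(fd >= 0);
    for (;;) {
        /* Jump to the next region that contains data. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data == (off_t)-1)
            return errno == ENXIO ? total : (off_t)-1;  /* ENXIO: no more data */

        /* Find where that region ends; end of file counts as a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole == (off_t)-1)
            return (off_t)-1;

        total += hole - data;
        pos = hole;
    }
}
```

On a filesystem that does not track holes, the whole file is reported as a
single data region, so the loop still terminates and returns the file size.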


* [Bug 105121] lseek(SEEK_DATA) hangs for a long time for sparse files in the page cache
From: bugzilla-daemon @ 2015-09-28 15:23 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

--- Comment #4 from ValdikSS <iam@valdikss.org.ru> ---
Nobody really uses grep on sparse files. It was just a discussion about why
many people use cat|grep instead of just 'grep file', as pipes are usually
slower. Somebody said that he uses cat|grep because it's actually faster than
'grep file' and made this test case. fallocate was used just to make a file big
enough to show a difference in seconds. His result was 1 second with cat|grep
and 3 seconds with 'grep file', but he was running btrfs. When I tried his test
case, it hung grep so hard I couldn't kill it with SIGKILL even though it was in
the 'running' state. I looked at strace to see what was going on and why it was
stalled, and filed this bug.

Anyway, thanks for confirming and acknowledging this problem!


* [Bug 105121] lseek(SEEK_DATA) hangs for a long time for sparse files in the page cache
From: bugzilla-daemon @ 2019-01-18  7:44 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=105121

ValdikSS (iam@valdikss.org.ru) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |CODE_FIX

--- Comment #5 from ValdikSS (iam@valdikss.org.ru) ---
Seems fixed as of 4.19.13.
