From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ua0-f199.google.com (mail-ua0-f199.google.com [209.85.217.199])
        by kanga.kvack.org (Postfix) with ESMTP id E19DB440874
        for ; Wed, 12 Jul 2017 16:32:02 -0400 (EDT)
Received: by mail-ua0-f199.google.com with SMTP id j1so12450721uah.3
        for ; Wed, 12 Jul 2017 13:32:02 -0700 (PDT)
Received: from mail-ua0-x235.google.com (mail-ua0-x235.google.com. [2607:f8b0:400c:c08::235])
        by mx.google.com with ESMTPS id a6si1896543uac.220.2017.07.12.13.32.02
        for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Wed, 12 Jul 2017 13:32:02 -0700 (PDT)
Received: by mail-ua0-x235.google.com with SMTP id g13so2922202uaj.0
        for ; Wed, 12 Jul 2017 13:32:02 -0700 (PDT)
MIME-Version: 1.0
From: Vasilis Dimitsas
Date: Wed, 12 Jul 2017 23:31:21 +0300
Message-ID:
Subject: asynchronous readahead prefetcher operation
Content-Type: multipart/alternative; boundary="94eb2c1915045a1f7e055424b5be"
Sender: owner-linux-mm@kvack.org
List-ID:
To: linux-mm@kvack.org

--94eb2c1915045a1f7e055424b5be
Content-Type: text/plain; charset="UTF-8"

Good evening,

I am currently working on a project related to the operation of the
Linux readahead prefetcher, so I am trying to understand how it works.
Having thoroughly read the relevant part of the kernel code, I realize,
from the comments, that part of the prefetching occurs asynchronously.
The problem is that I cannot verify this from the code.

Whether you call page_cache_sync_readahead() or
page_cache_async_readahead(), both end up in ra_submit(), where the
operation is common to both cases.

So, could you please tell me at which point the prefetching operation
occurs asynchronously?

Thank you in advance,

Vasilis Dimitsas

--94eb2c1915045a1f7e055424b5be
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--94eb2c1915045a1f7e055424b5be--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg0-f70.google.com (mail-pg0-f70.google.com [74.125.83.70])
        by kanga.kvack.org (Postfix) with ESMTP id B222F440874
        for ; Thu, 13 Jul 2017 12:34:39 -0400 (EDT)
Received: by mail-pg0-f70.google.com with SMTP id u5so63505373pgq.14
        for ; Thu, 13 Jul 2017 09:34:39 -0700 (PDT)
Received: from bombadil.infradead.org (bombadil.infradead.org. [65.50.211.133])
        by mx.google.com with ESMTPS id h187si4444055pgc.180.2017.07.13.09.34.38
        for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Thu, 13 Jul 2017 09:34:38 -0700 (PDT)
Date: Thu, 13 Jul 2017 09:34:37 -0700
From: Matthew Wilcox
Subject: Re: asynchronous readahead prefetcher operation
Message-ID: <20170713163437.GA4469@bombadil.infradead.org>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Vasilis Dimitsas
Cc: linux-mm@kvack.org

On Wed, Jul 12, 2017 at 11:31:21PM +0300, Vasilis Dimitsas wrote:
> I am currently working on a project related to the operation of the
> Linux readahead prefetcher, so I am trying to understand how it works.
> Having thoroughly read the relevant part of the kernel code, I realize,
> from the comments, that part of the prefetching occurs asynchronously.
> The problem is that I cannot verify this from the code.
>
> Whether you call page_cache_sync_readahead() or
> page_cache_async_readahead(), both end up in ra_submit(), where the
> operation is common to both cases.
>
> So, could you please tell me at which point the prefetching operation
> occurs asynchronously?

The prefetching operation always occurs asynchronously; the
I/O is submitted and then both page_cache_sync_readahead() and
page_cache_async_readahead() return to the caller.  They use slightly
different algorithms, which is why they're different functions, but the
I/O is not waited for.  It's up to the caller to do that.

I imagine you're looking at filemap_fault(), and it happens like this:

        page = find_get_page(mapping, offset);
(returns NULL because there's no page in the cache)
                do_sync_mmap_readahead(vmf->vma, ra, file, offset);
(will create pages and put them in the page cache, taking PageLock on each page)
                page = find_get_page(mapping, offset);
(finds the page that was just created)
        if (!lock_page_or_retry(page, vmf->vma->vm_mm, vmf->flags)) {
(will attempt to lock the page ... if it's locked and the fault lets us retry,
fails so we can handle retries at the higher level.  If it's locked and the
fault says we can't retry, then sleeps until unlocked.  If/once it's unlocked,
will return success)

When the I/O completes, the page will be unlocked, usually by calling
page_endio().

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-vk0-f69.google.com (mail-vk0-f69.google.com [209.85.213.69])
        by kanga.kvack.org (Postfix) with ESMTP id 5F926440874
        for ; Thu, 13 Jul 2017 16:37:04 -0400 (EDT)
Received: by mail-vk0-f69.google.com with SMTP id r126so24091548vkg.9
        for ; Thu, 13 Jul 2017 13:37:04 -0700 (PDT)
Received: from mail-ua0-x230.google.com (mail-ua0-x230.google.com.
        [2607:f8b0:400c:c08::230])
        by mx.google.com with ESMTPS id e68si2521311vkg.272.2017.07.13.13.37.03
        for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Thu, 13 Jul 2017 13:37:03 -0700 (PDT)
Received: by mail-ua0-x230.google.com with SMTP id z22so41003997uah.1
        for ; Thu, 13 Jul 2017 13:37:03 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20170713163437.GA4469@bombadil.infradead.org>
References: <20170713163437.GA4469@bombadil.infradead.org>
From: Vasilis Dimitsas
Date: Thu, 13 Jul 2017 23:36:22 +0300
Message-ID:
Subject: Re: asynchronous readahead prefetcher operation
Content-Type: multipart/alternative; boundary="f403045dcb962abf5d055438e5df"
Sender: owner-linux-mm@kvack.org
List-ID:
To: Matthew Wilcox
Cc: linux-mm@kvack.org

--f403045dcb962abf5d055438e5df
Content-Type: text/plain; charset="UTF-8"

Hello Matthew,

Thank you for your response. Since I am using the pread() function at
user level, at kernel level, unless I am mistaken,
do_generic_file_read() is called. Inside it, find_get_page() is called,
and if the page is not in the page cache then
page_cache_sync_readahead() is called, or page_cache_async_readahead()
if the page is marked with the PG_readahead flag. So, I would like to
find the exact part of the code that shows the I/O is not waited for.

Thank you again,

Vasilis

--f403045dcb962abf5d055438e5df
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--f403045dcb962abf5d055438e5df--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg0-f71.google.com (mail-pg0-f71.google.com [74.125.83.71])
        by kanga.kvack.org (Postfix) with ESMTP id B355F4408E5
        for ; Thu, 13 Jul 2017 21:55:16 -0400 (EDT)
Received: by mail-pg0-f71.google.com with SMTP id u5so77787064pgq.14
        for ; Thu, 13 Jul 2017 18:55:16 -0700 (PDT)
Received: from bombadil.infradead.org (bombadil.infradead.org. [65.50.211.133])
        by mx.google.com with ESMTPS id p14si5661574pli.440.2017.07.13.18.55.15
        for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Thu, 13 Jul 2017 18:55:15 -0700 (PDT)
Date: Thu, 13 Jul 2017 18:55:15 -0700
From: Matthew Wilcox
Subject: Re: asynchronous readahead prefetcher operation
Message-ID: <20170714015515.GC4469@bombadil.infradead.org>
References: <20170713163437.GA4469@bombadil.infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Vasilis Dimitsas
Cc: linux-mm@kvack.org

On Thu, Jul 13, 2017 at 11:36:22PM +0300, Vasilis Dimitsas wrote:
> Hello Matthew,
>
> Thank you for your response. Since I am using the pread() function at
> user level, at kernel level, unless I am mistaken,
> do_generic_file_read() is called. Inside it, find_get_page() is called,
> and if the page is not in the page cache then
> page_cache_sync_readahead() is called, or page_cache_async_readahead()
> if the page is marked with the PG_readahead flag. So, I would like to
> find the exact part of the code that shows the I/O is not waited for.

As I said, the I/O is never waited for by either of these functions.
The _sync_readahead vs _async_readahead calls just use a slightly
different algorithm for choosing which pages to bring in.

In the case of do_generic_file_read(), if the I/O has not finished by
the time the call returns, then the page will not be marked Uptodate,
so we follow this path:

                        if (!PageUptodate(page)) {
                                error = wait_on_page_locked_killable(page);

and that is where we wait for the I/O to complete, no matter whether
the I/O was triggered by the sync or async calls.
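
The pattern described in this thread (readahead submits the I/O and
returns; the caller waits on the page lock only if the page is not yet
up to date) can be modelled in a few lines of user-space C. This is a
minimal sketch, not kernel code: every toy_* name is hypothetical, the
"device" is a plain FIFO, and waiting is simulated by completing queued
requests instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: all names here are hypothetical, not kernel APIs. */
struct toy_page {
    bool locked;   /* stands in for PG_locked: set while I/O is in flight */
    bool uptodate; /* stands in for PG_uptodate: set once data has arrived */
};

#define TOY_QUEUE_MAX 8

struct toy_device {
    struct toy_page *queue[TOY_QUEUE_MAX]; /* submitted, not yet completed */
    int head, tail;
};

/* Models readahead: lock the page, queue the read, return immediately. */
static bool toy_readahead_submit(struct toy_device *dev, struct toy_page *page)
{
    if (dev->tail == TOY_QUEUE_MAX)
        return false;
    page->locked = true;
    page->uptodate = false;
    dev->queue[dev->tail++] = page;
    return true; /* nothing here waits for the I/O */
}

/* Models I/O completion: mark the oldest request up to date, then unlock. */
static bool toy_device_complete_one(struct toy_device *dev)
{
    if (dev->head == dev->tail)
        return false;
    struct toy_page *page = dev->queue[dev->head++];
    page->uptodate = true;
    page->locked = false;
    return true;
}

/* Models the caller's wait: returns once the page is no longer locked. */
static int toy_wait_on_page_locked(struct toy_device *dev, struct toy_page *page)
{
    while (page->locked) {
        if (!toy_device_complete_one(dev))
            return -1; /* no request in flight could unlock this page */
    }
    return 0;
}

/* Models the read path: submit if needed, then wait only if not up to date. */
static int toy_read_page(struct toy_device *dev, struct toy_page *page)
{
    if (!page->uptodate && !page->locked && !toy_readahead_submit(dev, page))
        return -1;
    if (!page->uptodate)
        return toy_wait_on_page_locked(dev, page);
    return 0;
}

/* Walks through the sequence: submit two pages, then read the second. */
int demo(void)
{
    struct toy_device dev = {0};
    struct toy_page a = {0}, b = {0};

    toy_readahead_submit(&dev, &a);
    toy_readahead_submit(&dev, &b);
    /* Submission has returned, yet the I/O is still in flight. */
    assert(a.locked && !a.uptodate);
    printf("after submit: locked=%d uptodate=%d\n", b.locked, b.uptodate);

    if (toy_read_page(&dev, &b) != 0)
        return 1;
    printf("after wait:   locked=%d uptodate=%d\n", b.locked, b.uptodate);
    return (a.uptodate && b.uptodate) ? 0 : 1;
}
```

Calling demo() from a main() function exercises the whole sequence. The
only place the model waits is toy_wait_on_page_locked(), which mirrors
the point made above: the submit step is the same whichever readahead
variant triggered it, and the waiting belongs to the caller.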