All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	djwong@kernel.org, Chris Mason <clm@fb.com>,
	David Sterba <dsterba@suse.cz>,
	ceph-devel@vger.kernel.org, cluster-devel@redhat.com,
	linux-nfs@vger.kernel.org, logfs@logfs.org,
	linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-mtd@lists.infradead.org,
	reiserfs-devel@vger.kernel.org,
	linux-ntfs-dev@lists.sourceforge.net,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-afs@lists.infradead.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"
Date: Thu, 19 Jan 2017 09:39:56 +0100	[thread overview]
Message-ID: <20170119083956.GE30786@dhcp22.suse.cz> (raw)
In-Reply-To: <20170117172925.GA2486@quack2.suse.cz>

On Tue 17-01-17 18:29:25, Jan Kara wrote:
> On Tue 17-01-17 17:16:19, Michal Hocko wrote:
> > > > But before going to play with that I am really wondering whether we need
> > > > all this with no journal at all. AFAIU what Jack told me it is the
> > > > journal lock(s) which is the biggest problem from the reclaim recursion
> > > > point of view. What would cause a deadlock in no journal mode?
> > > 
> > > We still have the original problem for why we need GFP_NOFS even in
> > > ext2.  If we are in a writeback path, and we need to allocate memory,
> > > we don't want to recurse back into the file system's writeback path.
> > 
> > But we do not enter the writeback path from the direct reclaim. Or do
> > you mean something other than pageout()'s mapping->a_ops->writepage?
> > There is only try_to_release_page where we get back to the filesystems
> > but I do not see any NOFS protection in ext4_releasepage.
> 
> Maybe to expand a bit: These days, direct reclaim can call ->releasepage()
> callback, ->evict_inode() callback (and only for inodes with i_nlink > 0),
> shrinkers. That's it. So the recursion possibilities are rather more limited
> than they used to be several years ago and we likely do not need as much
> GFP_NOFS protection as we used to.

Thanks for making my remark more clear Jack! I would just want to add
that I was playing with the patch below (it is basically
GFP_NOFS->GFP_KERNEL for all allocations which trigger warning from the
debugging patch which means they are called from within transaction) and
it didn't hit the lockdep when running xfstests both with or without the
enabled journal.

So am I still missing something or the nojournal mode is safe and the
current series is OK wrt. ext*?

The following patch in its current form is WIP and needs a proper review
before I post it.
---
 fs/ext4/inode.c       |  4 ++--
 fs/ext4/mballoc.c     | 14 +++++++-------
 fs/ext4/xattr.c       |  2 +-
 fs/jbd2/journal.c     |  4 ++--
 fs/jbd2/revoke.c      |  2 +-
 fs/jbd2/transaction.c |  2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b7d141c3b810..841cb8c4cb5e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2085,7 +2085,7 @@ static int ext4_writepage(struct page *page,
 		return __ext4_journalled_writepage(page, len);
 
 	ext4_io_submit_init(&io_submit, wbc);
-	io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS);
+	io_submit.io_end = ext4_init_io_end(inode, GFP_KERNEL);
 	if (!io_submit.io_end) {
 		redirty_page_for_writepage(wbc, page);
 		unlock_page(page);
@@ -3794,7 +3794,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	int err = 0;
 
 	page = find_or_create_page(mapping, from >> PAGE_SHIFT,
-				   mapping_gfp_constraint(mapping, ~__GFP_FS));
+				   mapping_gfp_mask(mapping));
 	if (!page)
 		return -ENOMEM;
 
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index d9fd184b049e..67b97cd6e3d6 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1251,7 +1251,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 static int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
 			      struct ext4_buddy *e4b)
 {
-	return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
+	return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_KERNEL);
 }
 
 static void ext4_mb_unload_buddy(struct ext4_buddy *e4b)
@@ -2054,7 +2054,7 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac,
 
 	/* We only do this if the grp has never been initialized */
 	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
-		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
+		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_KERNEL);
 		if (ret)
 			return ret;
 	}
@@ -3600,7 +3600,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	BUG_ON(ac->ac_status != AC_STATUS_FOUND);
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
-	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_KERNEL);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -3694,7 +3694,7 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
 	BUG_ON(ext4_pspace_cachep == NULL);
-	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_KERNEL);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -4479,7 +4479,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
 		}
 	}
 
-	ac = kmem_cache_zalloc(ext4_ac_cachep, GFP_NOFS);
+	ac = kmem_cache_zalloc(ext4_ac_cachep, GFP_KERNEL);
 	if (!ac) {
 		ar->len = 0;
 		*errp = -ENOMEM;
@@ -4813,7 +4813,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 
 	/* __GFP_NOFAIL: retry infinitely, ignore TIF_MEMDIE and memcg limit. */
 	err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b,
-				     GFP_NOFS|__GFP_NOFAIL);
+				     GFP_KERNEL|__GFP_NOFAIL);
 	if (err)
 		goto error_return;
 
@@ -4832,7 +4832,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 		 * to fail.
 		 */
 		new_entry = kmem_cache_alloc(ext4_free_data_cachep,
-				GFP_NOFS|__GFP_NOFAIL);
+				GFP_KERNEL|__GFP_NOFAIL);
 		new_entry->efd_start_cluster = bit;
 		new_entry->efd_group = block_group;
 		new_entry->efd_count = count_clusters;
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 172317462238..f68e8c87f9f2 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1650,7 +1650,7 @@ ext4_xattr_cache_insert(struct mb_cache *ext4_mb_cache, struct buffer_head *bh)
 		       EXT4_XATTR_REFCOUNT_MAX;
 	int error;
 
-	error = mb_cache_entry_create(ext4_mb_cache, GFP_NOFS, hash,
+	error = mb_cache_entry_create(ext4_mb_cache, GFP_KERNEL, hash,
 				      bh->b_blocknr, reusable);
 	if (error) {
 		if (error == -EBUSY)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 3a449150f834..bd29daa975a5 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -379,7 +379,7 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
 	 */
 	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
 
-	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
+	new_bh = alloc_buffer_head(GFP_KERNEL|__GFP_NOFAIL);
 
 	/* keep subsequent assertions sane */
 	atomic_set(&new_bh->b_count, 1);
@@ -2375,7 +2375,7 @@ static struct journal_head *journal_alloc_journal_head(void)
 #ifdef CONFIG_JBD2_DEBUG
 	atomic_inc(&nr_journal_heads);
 #endif
-	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_NOFS);
+	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_KERNEL);
 	if (!ret) {
 		jbd_debug(1, "out of memory for journal_head\n");
 		pr_notice_ratelimited("ENOMEM in %s, retrying.\n", __func__);
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index cfc38b552118..c9c347468c5b 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -141,7 +141,7 @@ static int insert_revoke_hash(journal_t *journal, unsigned long long blocknr,
 {
 	struct list_head *hash_list;
 	struct jbd2_revoke_record_s *record;
-	gfp_t gfp_mask = GFP_NOFS;
+	gfp_t gfp_mask = GFP_KERNEL;
 
 	if (journal_oom_retry)
 		gfp_mask |= __GFP_NOFAIL;
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 35a5d3d76182..a7e50eb330a8 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -974,7 +974,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
 			JBUFFER_TRACE(jh, "allocate memory for buffer");
 			jbd_unlock_bh_state(bh);
 			frozen_buffer = jbd2_alloc(jh2bh(jh)->b_size,
-						   GFP_NOFS | __GFP_NOFAIL);
+						   GFP_KERNEL | __GFP_NOFAIL);
 			goto repeat;
 		}
 		jh->b_frozen_data = frozen_buffer;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	djwong@kernel.org, Chris Mason <clm@fb.com>,
	David Sterba <dsterba@suse.cz>,
	ceph-devel@vger.kernel.org, cluster-devel@redhat.com,
	linux-nfs@vger.kernel.org, logfs@logfs.org,
	linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-mtd@lists.infradead.org,
	reiserfs-devel@vger.kernel.org,
	linux-ntfs-dev@lists.sourceforge.net,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-afs@lists.infradead.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"
Date: Thu, 19 Jan 2017 09:39:56 +0100	[thread overview]
Message-ID: <20170119083956.GE30786@dhcp22.suse.cz> (raw)
In-Reply-To: <20170117172925.GA2486@quack2.suse.cz>

On Tue 17-01-17 18:29:25, Jan Kara wrote:
> On Tue 17-01-17 17:16:19, Michal Hocko wrote:
> > > > But before going to play with that I am really wondering whether we need
> > > > all this with no journal at all. AFAIU what Jack told me it is the
> > > > journal lock(s) which is the biggest problem from the reclaim recursion
> > > > point of view. What would cause a deadlock in no journal mode?
> > > 
> > > We still have the original problem for why we need GFP_NOFS even in
> > > ext2.  If we are in a writeback path, and we need to allocate memory,
> > > we don't want to recurse back into the file system's writeback path.
> > 
> > But we do not enter the writeback path from the direct reclaim. Or do
> > you mean something other than pageout()'s mapping->a_ops->writepage?
> > There is only try_to_release_page where we get back to the filesystems
> > but I do not see any NOFS protection in ext4_releasepage.
> 
> Maybe to expand a bit: These days, direct reclaim can call ->releasepage()
> callback, ->evict_inode() callback (and only for inodes with i_nlink > 0),
> shrinkers. That's it. So the recursion possibilities are rather more limited
> than they used to be several years ago and we likely do not need as much
> GFP_NOFS protection as we used to.

Thanks for making my remark more clear Jack! I would just want to add
that I was playing with the patch below (it is basically
GFP_NOFS->GFP_KERNEL for all allocations which trigger warning from the
debugging patch which means they are called from within transaction) and
it didn't hit the lockdep when running xfstests both with or without the
enabled journal.

So am I still missing something or the nojournal mode is safe and the
current series is OK wrt. ext*?

The following patch in its current form is WIP and needs a proper review
before I post it.
---
 fs/ext4/inode.c       |  4 ++--
 fs/ext4/mballoc.c     | 14 +++++++-------
 fs/ext4/xattr.c       |  2 +-
 fs/jbd2/journal.c     |  4 ++--
 fs/jbd2/revoke.c      |  2 +-
 fs/jbd2/transaction.c |  2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b7d141c3b810..841cb8c4cb5e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2085,7 +2085,7 @@ static int ext4_writepage(struct page *page,
 		return __ext4_journalled_writepage(page, len);
 
 	ext4_io_submit_init(&io_submit, wbc);
-	io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS);
+	io_submit.io_end = ext4_init_io_end(inode, GFP_KERNEL);
 	if (!io_submit.io_end) {
 		redirty_page_for_writepage(wbc, page);
 		unlock_page(page);
@@ -3794,7 +3794,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	int err = 0;
 
 	page = find_or_create_page(mapping, from >> PAGE_SHIFT,
-				   mapping_gfp_constraint(mapping, ~__GFP_FS));
+				   mapping_gfp_mask(mapping));
 	if (!page)
 		return -ENOMEM;
 
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index d9fd184b049e..67b97cd6e3d6 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1251,7 +1251,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 static int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
 			      struct ext4_buddy *e4b)
 {
-	return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
+	return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_KERNEL);
 }
 
 static void ext4_mb_unload_buddy(struct ext4_buddy *e4b)
@@ -2054,7 +2054,7 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac,
 
 	/* We only do this if the grp has never been initialized */
 	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
-		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
+		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_KERNEL);
 		if (ret)
 			return ret;
 	}
@@ -3600,7 +3600,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	BUG_ON(ac->ac_status != AC_STATUS_FOUND);
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
-	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_KERNEL);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -3694,7 +3694,7 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
 	BUG_ON(ext4_pspace_cachep == NULL);
-	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_KERNEL);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -4479,7 +4479,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
 		}
 	}
 
-	ac = kmem_cache_zalloc(ext4_ac_cachep, GFP_NOFS);
+	ac = kmem_cache_zalloc(ext4_ac_cachep, GFP_KERNEL);
 	if (!ac) {
 		ar->len = 0;
 		*errp = -ENOMEM;
@@ -4813,7 +4813,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 
 	/* __GFP_NOFAIL: retry infinitely, ignore TIF_MEMDIE and memcg limit. */
 	err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b,
-				     GFP_NOFS|__GFP_NOFAIL);
+				     GFP_KERNEL|__GFP_NOFAIL);
 	if (err)
 		goto error_return;
 
@@ -4832,7 +4832,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 		 * to fail.
 		 */
 		new_entry = kmem_cache_alloc(ext4_free_data_cachep,
-				GFP_NOFS|__GFP_NOFAIL);
+				GFP_KERNEL|__GFP_NOFAIL);
 		new_entry->efd_start_cluster = bit;
 		new_entry->efd_group = block_group;
 		new_entry->efd_count = count_clusters;
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 172317462238..f68e8c87f9f2 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1650,7 +1650,7 @@ ext4_xattr_cache_insert(struct mb_cache *ext4_mb_cache, struct buffer_head *bh)
 		       EXT4_XATTR_REFCOUNT_MAX;
 	int error;
 
-	error = mb_cache_entry_create(ext4_mb_cache, GFP_NOFS, hash,
+	error = mb_cache_entry_create(ext4_mb_cache, GFP_KERNEL, hash,
 				      bh->b_blocknr, reusable);
 	if (error) {
 		if (error == -EBUSY)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 3a449150f834..bd29daa975a5 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -379,7 +379,7 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
 	 */
 	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
 
-	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
+	new_bh = alloc_buffer_head(GFP_KERNEL|__GFP_NOFAIL);
 
 	/* keep subsequent assertions sane */
 	atomic_set(&new_bh->b_count, 1);
@@ -2375,7 +2375,7 @@ static struct journal_head *journal_alloc_journal_head(void)
 #ifdef CONFIG_JBD2_DEBUG
 	atomic_inc(&nr_journal_heads);
 #endif
-	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_NOFS);
+	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_KERNEL);
 	if (!ret) {
 		jbd_debug(1, "out of memory for journal_head\n");
 		pr_notice_ratelimited("ENOMEM in %s, retrying.\n", __func__);
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index cfc38b552118..c9c347468c5b 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -141,7 +141,7 @@ static int insert_revoke_hash(journal_t *journal, unsigned long long blocknr,
 {
 	struct list_head *hash_list;
 	struct jbd2_revoke_record_s *record;
-	gfp_t gfp_mask = GFP_NOFS;
+	gfp_t gfp_mask = GFP_KERNEL;
 
 	if (journal_oom_retry)
 		gfp_mask |= __GFP_NOFAIL;
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 35a5d3d76182..a7e50eb330a8 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -974,7 +974,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
 			JBUFFER_TRACE(jh, "allocate memory for buffer");
 			jbd_unlock_bh_state(bh);
 			frozen_buffer = jbd2_alloc(jh2bh(jh)->b_size,
-						   GFP_NOFS | __GFP_NOFAIL);
+						   GFP_KERNEL | __GFP_NOFAIL);
 			goto repeat;
 		}
 		jh->b_frozen_data = frozen_buffer;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"
Date: Thu, 19 Jan 2017 09:39:56 +0100	[thread overview]
Message-ID: <20170119083956.GE30786@dhcp22.suse.cz> (raw)
In-Reply-To: <20170117172925.GA2486@quack2.suse.cz>

On Tue 17-01-17 18:29:25, Jan Kara wrote:
> On Tue 17-01-17 17:16:19, Michal Hocko wrote:
> > > > But before going to play with that I am really wondering whether we need
> > > > all this with no journal at all. AFAIU what Jack told me it is the
> > > > journal lock(s) which is the biggest problem from the reclaim recursion
> > > > point of view. What would cause a deadlock in no journal mode?
> > > 
> > > We still have the original problem for why we need GFP_NOFS even in
> > > ext2.  If we are in a writeback path, and we need to allocate memory,
> > > we don't want to recurse back into the file system's writeback path.
> > 
> > But we do not enter the writeback path from the direct reclaim. Or do
> > you mean something other than pageout()'s mapping->a_ops->writepage?
> > There is only try_to_release_page where we get back to the filesystems
> > but I do not see any NOFS protection in ext4_releasepage.
> 
> Maybe to expand a bit: These days, direct reclaim can call ->releasepage()
> callback, ->evict_inode() callback (and only for inodes with i_nlink > 0),
> shrinkers. That's it. So the recursion possibilities are rather more limited
> than they used to be several years ago and we likely do not need as much
> GFP_NOFS protection as we used to.

Thanks for making my remark more clear Jack! I would just want to add
that I was playing with the patch below (it is basically
GFP_NOFS->GFP_KERNEL for all allocations which trigger warning from the
debugging patch which means they are called from within transaction) and
it didn't hit the lockdep when running xfstests both with or without the
enabled journal.

So am I still missing something or the nojournal mode is safe and the
current series is OK wrt. ext*?

The following patch in its current form is WIP and needs a proper review
before I post it.
---
 fs/ext4/inode.c       |  4 ++--
 fs/ext4/mballoc.c     | 14 +++++++-------
 fs/ext4/xattr.c       |  2 +-
 fs/jbd2/journal.c     |  4 ++--
 fs/jbd2/revoke.c      |  2 +-
 fs/jbd2/transaction.c |  2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b7d141c3b810..841cb8c4cb5e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2085,7 +2085,7 @@ static int ext4_writepage(struct page *page,
 		return __ext4_journalled_writepage(page, len);
 
 	ext4_io_submit_init(&io_submit, wbc);
-	io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS);
+	io_submit.io_end = ext4_init_io_end(inode, GFP_KERNEL);
 	if (!io_submit.io_end) {
 		redirty_page_for_writepage(wbc, page);
 		unlock_page(page);
@@ -3794,7 +3794,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	int err = 0;
 
 	page = find_or_create_page(mapping, from >> PAGE_SHIFT,
-				   mapping_gfp_constraint(mapping, ~__GFP_FS));
+				   mapping_gfp_mask(mapping));
 	if (!page)
 		return -ENOMEM;
 
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index d9fd184b049e..67b97cd6e3d6 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1251,7 +1251,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 static int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
 			      struct ext4_buddy *e4b)
 {
-	return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
+	return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_KERNEL);
 }
 
 static void ext4_mb_unload_buddy(struct ext4_buddy *e4b)
@@ -2054,7 +2054,7 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac,
 
 	/* We only do this if the grp has never been initialized */
 	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
-		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
+		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_KERNEL);
 		if (ret)
 			return ret;
 	}
@@ -3600,7 +3600,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	BUG_ON(ac->ac_status != AC_STATUS_FOUND);
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
-	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_KERNEL);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -3694,7 +3694,7 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 	BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
 
 	BUG_ON(ext4_pspace_cachep == NULL);
-	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
+	pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_KERNEL);
 	if (pa == NULL)
 		return -ENOMEM;
 
@@ -4479,7 +4479,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
 		}
 	}
 
-	ac = kmem_cache_zalloc(ext4_ac_cachep, GFP_NOFS);
+	ac = kmem_cache_zalloc(ext4_ac_cachep, GFP_KERNEL);
 	if (!ac) {
 		ar->len = 0;
 		*errp = -ENOMEM;
@@ -4813,7 +4813,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 
 	/* __GFP_NOFAIL: retry infinitely, ignore TIF_MEMDIE and memcg limit. */
 	err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b,
-				     GFP_NOFS|__GFP_NOFAIL);
+				     GFP_KERNEL|__GFP_NOFAIL);
 	if (err)
 		goto error_return;
 
@@ -4832,7 +4832,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 		 * to fail.
 		 */
 		new_entry = kmem_cache_alloc(ext4_free_data_cachep,
-				GFP_NOFS|__GFP_NOFAIL);
+				GFP_KERNEL|__GFP_NOFAIL);
 		new_entry->efd_start_cluster = bit;
 		new_entry->efd_group = block_group;
 		new_entry->efd_count = count_clusters;
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 172317462238..f68e8c87f9f2 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1650,7 +1650,7 @@ ext4_xattr_cache_insert(struct mb_cache *ext4_mb_cache, struct buffer_head *bh)
 		       EXT4_XATTR_REFCOUNT_MAX;
 	int error;
 
-	error = mb_cache_entry_create(ext4_mb_cache, GFP_NOFS, hash,
+	error = mb_cache_entry_create(ext4_mb_cache, GFP_KERNEL, hash,
 				      bh->b_blocknr, reusable);
 	if (error) {
 		if (error == -EBUSY)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 3a449150f834..bd29daa975a5 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -379,7 +379,7 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
 	 */
 	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
 
-	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
+	new_bh = alloc_buffer_head(GFP_KERNEL|__GFP_NOFAIL);
 
 	/* keep subsequent assertions sane */
 	atomic_set(&new_bh->b_count, 1);
@@ -2375,7 +2375,7 @@ static struct journal_head *journal_alloc_journal_head(void)
 #ifdef CONFIG_JBD2_DEBUG
 	atomic_inc(&nr_journal_heads);
 #endif
-	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_NOFS);
+	ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_KERNEL);
 	if (!ret) {
 		jbd_debug(1, "out of memory for journal_head\n");
 		pr_notice_ratelimited("ENOMEM in %s, retrying.\n", __func__);
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index cfc38b552118..c9c347468c5b 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -141,7 +141,7 @@ static int insert_revoke_hash(journal_t *journal, unsigned long long blocknr,
 {
 	struct list_head *hash_list;
 	struct jbd2_revoke_record_s *record;
-	gfp_t gfp_mask = GFP_NOFS;
+	gfp_t gfp_mask = GFP_KERNEL;
 
 	if (journal_oom_retry)
 		gfp_mask |= __GFP_NOFAIL;
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 35a5d3d76182..a7e50eb330a8 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -974,7 +974,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
 			JBUFFER_TRACE(jh, "allocate memory for buffer");
 			jbd_unlock_bh_state(bh);
 			frozen_buffer = jbd2_alloc(jh2bh(jh)->b_size,
-						   GFP_NOFS | __GFP_NOFAIL);
+						   GFP_KERNEL | __GFP_NOFAIL);
 			goto repeat;
 		}
 		jh->b_frozen_data = frozen_buffer;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs



  reply	other threads:[~2017-01-19  8:40 UTC|newest]

Thread overview: 167+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-06 14:10 [PATCH 0/8 v3] scope GFP_NOFS api Michal Hocko
2017-01-06 14:10 ` [Cluster-devel] " Michal Hocko
2017-01-06 14:10 ` Michal Hocko
2017-01-06 14:10 ` Michal Hocko
2017-01-06 14:10 ` Michal Hocko
2017-01-06 14:11 ` [PATCH 1/8] lockdep: allow to disable reclaim lockup detection Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-09 12:56   ` Vlastimil Babka
2017-01-09 12:56     ` [Cluster-devel] " Vlastimil Babka
2017-01-09 12:56     ` Vlastimil Babka
2017-01-06 14:11 ` [PATCH 2/8] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-09 12:59   ` Vlastimil Babka
2017-01-09 12:59     ` [Cluster-devel] " Vlastimil Babka
2017-01-09 12:59     ` Vlastimil Babka
2017-01-09 14:29     ` Michal Hocko
2017-01-09 14:29       ` [Cluster-devel] " Michal Hocko
2017-01-09 14:29       ` Michal Hocko
2017-01-09 20:58   ` Darrick J. Wong
2017-01-09 20:58     ` [Cluster-devel] " Darrick J. Wong
2017-01-09 20:58     ` Darrick J. Wong
2017-01-06 14:11 ` [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] [PATCH 3/8] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2017-01-06 14:11   ` [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-09 13:04   ` Vlastimil Babka
2017-01-09 13:04     ` [Cluster-devel] [PATCH 3/8] mm: introduce memalloc_nofs_{save, restore} API Vlastimil Babka
2017-01-09 13:04     ` [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API Vlastimil Babka
2017-01-09 13:42     ` Michal Hocko
2017-01-09 13:42       ` [Cluster-devel] [PATCH 3/8] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2017-01-09 13:42       ` [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2017-01-09 13:59       ` Michal Hocko
2017-01-09 13:59         ` [Cluster-devel] [PATCH 3/8] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2017-01-09 13:59         ` [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2017-01-09 14:04       ` Vlastimil Babka
2017-01-09 14:04         ` [Cluster-devel] [PATCH 3/8] mm: introduce memalloc_nofs_{save, restore} API Vlastimil Babka
2017-01-09 14:04         ` [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API Vlastimil Babka
2017-01-06 14:11 ` [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio* Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] [PATCH 4/8] xfs: use memalloc_nofs_{save, restore} " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-09 14:08   ` Vlastimil Babka
2017-01-09 14:08     ` [Cluster-devel] [PATCH 4/8] xfs: use memalloc_nofs_{save, restore} " Vlastimil Babka
2017-01-09 14:08     ` [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} " Vlastimil Babka
2017-01-09 14:25     ` Michal Hocko
2017-01-09 14:25       ` [Cluster-devel] [PATCH 4/8] xfs: use memalloc_nofs_{save, restore} " Michal Hocko
2017-01-09 14:25       ` [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} " Michal Hocko
2017-01-09 15:56   ` Brian Foster
2017-01-09 15:56     ` [Cluster-devel] [PATCH 4/8] xfs: use memalloc_nofs_{save, restore} " Brian Foster
2017-01-09 15:56     ` [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} " Brian Foster
2017-01-09 20:59   ` Darrick J. Wong
2017-01-09 20:59     ` [Cluster-devel] [PATCH 4/8] xfs: use memalloc_nofs_{save, restore} " Darrick J. Wong
2017-01-09 20:59     ` [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} " Darrick J. Wong
2017-01-06 14:11 ` [PATCH 5/8] jbd2: mark the transaction context with the scope GFP_NOFS context Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11 ` [PATCH 6/8] jbd2: make the whole kjournald2 kthread NOFS safe Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11 ` [PATCH 7/8] Revert "ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp" Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-17  3:01   ` Theodore Ts'o
2017-01-17  3:01     ` [Cluster-devel] " Theodore Ts'o
2017-01-17  3:01     ` Theodore Ts'o
2017-01-17  7:54     ` Michal Hocko
2017-01-17  7:54       ` [Cluster-devel] " Michal Hocko
2017-01-17  7:54       ` Michal Hocko
2017-03-06 11:59       ` Michal Hocko
2017-03-06 11:59         ` [Cluster-devel] " Michal Hocko
2017-03-06 11:59         ` Michal Hocko
2017-01-06 14:11 ` [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" Michal Hocko
2017-01-06 14:11   ` [Cluster-devel] " Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-06 14:11   ` Michal Hocko
2017-01-17  2:56   ` Theodore Ts'o
2017-01-17  2:56     ` [Cluster-devel] " Theodore Ts'o
2017-01-17  2:56     ` Theodore Ts'o
2017-01-17  8:24     ` Michal Hocko
2017-01-17  8:24       ` [Cluster-devel] " Michal Hocko
2017-01-17  8:24       ` Michal Hocko
2017-01-17 15:18       ` Michal Hocko
2017-01-17 15:18         ` [Cluster-devel] " Michal Hocko
2017-01-17 15:18         ` Michal Hocko
2017-01-17 15:18         ` Michal Hocko
2017-01-17 15:59         ` Theodore Ts'o
2017-01-17 15:59           ` [Cluster-devel] " Theodore Ts'o
2017-01-17 15:59           ` Theodore Ts'o
2017-01-17 16:16           ` Michal Hocko
2017-01-17 16:16             ` [Cluster-devel] " Michal Hocko
2017-01-17 16:16             ` Michal Hocko
2017-01-17 17:29             ` Jan Kara
2017-01-17 17:29               ` [Cluster-devel] " Jan Kara
2017-01-17 17:29               ` Jan Kara
2017-01-19  8:39               ` Michal Hocko [this message]
2017-01-19  8:39                 ` [Cluster-devel] " Michal Hocko
2017-01-19  8:39                 ` Michal Hocko
2017-01-19  9:22                 ` Jan Kara
2017-01-19  9:22                   ` [Cluster-devel] " Jan Kara
2017-01-19  9:22                   ` Jan Kara
2017-01-19  9:44                   ` Michal Hocko
2017-01-19  9:44                     ` [Cluster-devel] " Michal Hocko
2017-01-19  9:44                     ` Michal Hocko
2017-01-26  7:44                     ` Michal Hocko
2017-01-26  7:44                       ` [Cluster-devel] " Michal Hocko
2017-01-26  7:44                       ` Michal Hocko
2017-01-27  6:13                       ` Theodore Ts'o
2017-01-27  6:13                         ` [Cluster-devel] " Theodore Ts'o
2017-01-27  6:13                         ` Theodore Ts'o
2017-01-27  6:13                         ` Theodore Ts'o
2017-01-27  9:37                         ` Michal Hocko
2017-01-27  9:37                           ` [Cluster-devel] " Michal Hocko
2017-01-27  9:37                           ` Michal Hocko
2017-01-27 16:40                           ` Theodore Ts'o
2017-01-27 16:40                             ` [Cluster-devel] " Theodore Ts'o
2017-01-27 16:40                             ` Theodore Ts'o
2017-01-28  7:32                             ` [Cluster-devel] " Christoph Hellwig
2017-01-28  7:32                               ` Christoph Hellwig
2017-01-28  7:32                               ` Christoph Hellwig
2017-01-28  8:17                               ` David Lang
2017-01-28  8:17                                 ` David Lang
2017-01-28  8:17                                 ` David Lang
2017-01-30  8:12                             ` Michal Hocko
2017-01-30  8:12                               ` [Cluster-devel] " Michal Hocko
2017-01-30  8:12                               ` Michal Hocko
2017-02-03 15:32                               ` Michal Hocko
2017-02-03 15:32                                 ` [Cluster-devel] " Michal Hocko
2017-02-03 15:32                                 ` Michal Hocko
2017-01-17 21:04           ` Andreas Dilger
2017-01-17 21:04             ` [Cluster-devel] " Andreas Dilger
2017-01-17 21:04             ` Andreas Dilger
2017-01-18  8:29             ` Michal Hocko
2017-01-18  8:29               ` [Cluster-devel] " Michal Hocko
2017-01-18  8:29               ` Michal Hocko
2017-01-06 14:18 ` [DEBUG PATCH 0/2] debug explicit GFP_NO{FS,IO} usage from the scope context Michal Hocko
2017-01-06 14:18   ` [Cluster-devel] [DEBUG PATCH 0/2] debug explicit GFP_NO{FS, IO} " Michal Hocko
2017-01-06 14:18   ` Michal Hocko
2017-01-06 14:18   ` [DEBUG PATCH 0/2] debug explicit GFP_NO{FS,IO} " Michal Hocko
2017-01-06 14:18   ` Michal Hocko
2017-01-06 14:18   ` Michal Hocko
2017-01-06 14:18   ` [DEBUG PATCH 1/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2017-01-06 14:18     ` [Cluster-devel] [DEBUG PATCH 1/2] mm, debug: report when GFP_NO{FS, IO} is used explicitly from memalloc_no{fs, io}_{save, restore} context Michal Hocko
2017-01-06 14:18     ` Michal Hocko
2017-01-06 14:18     ` [DEBUG PATCH 1/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2017-01-06 14:18     ` Michal Hocko
2017-01-06 14:18     ` Michal Hocko
2017-01-06 14:18   ` [DEBUG PATCH 2/2] silent warnings which we cannot do anything about Michal Hocko
2017-01-06 14:18     ` [Cluster-devel] " Michal Hocko
2017-01-06 14:18     ` Michal Hocko
2017-01-06 14:18     ` Michal Hocko
2017-01-06 14:18     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170119083956.GE30786@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=clm@fb.com \
    --cc=cluster-devel@redhat.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=dsterba@suse.cz \
    --cc=jack@suse.cz \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-ntfs-dev@lists.sourceforge.net \
    --cc=linux-xfs@vger.kernel.org \
    --cc=logfs@logfs.org \
    --cc=reiserfs-devel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.