[Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection
@ 2020-07-24 18:32 Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 01/11] gfs2: inline gfs2_write_jdata_pagevec into gfs2_write_cache_jdata Bob Peterson
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

These patches are for the jdata problems I've recently found. They allow
xfstests to pass in its entirety using jdata, and unlike before, 269
can be run multiple times without deadlocking. The last two patches are
just improvements to the log_blocks kernel trace point, which can be
pushed (or not) separately.

Bob Peterson (11):
  gfs2: inline gfs2_write_jdata_pagevec into gfs2_write_cache_jdata
  gfs2: don't break integrity writeback on __gfs2_jdata_writepage error
  gfs2: Fix inaccurate comment
  gfs2: don't try to add buffers to transactions a second time for jdata
  gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly
    gfs2_meta_wipe
  gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove
    parm
  gfs2: Add a new jdata-specific version of gfs2_get_block_noalloc
  gfs2: Add caller info to log_blocks trace point
  gfs2: enhance log_blocks trace point to show log blocks free
  gfs2: print details on transactions that aren't properly ended
  gfs2: Never call gfs2_block_zero_range with an open transaction

 fs/gfs2/aops.c       | 267 +++++++++++++++++++++++--------------------
 fs/gfs2/bmap.c       |  70 +++++++-----
 fs/gfs2/incore.h     |   3 +
 fs/gfs2/log.c        |  26 +++--
 fs/gfs2/lops.c       |   1 +
 fs/gfs2/meta_io.c    |  82 ++++++++++++-
 fs/gfs2/meta_io.h    |   5 +-
 fs/gfs2/ops_fstype.c |   2 +-
 fs/gfs2/rgrp.c       |   6 +-
 fs/gfs2/trace_gfs2.h |  23 +++-
 fs/gfs2/trans.c      |  31 ++---
 11 files changed, 327 insertions(+), 189 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 01/11] gfs2: inline gfs2_write_jdata_pagevec into gfs2_write_cache_jdata
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
@ 2020-07-24 18:32 ` Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 02/11] gfs2: don't break integrity writeback on __gfs2_jdata_writepage error Bob Peterson
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch inlines function gfs2_write_jdata_pagevec() into function
gfs2_write_cache_jdata(). Since gfs2_write_cache_jdata() is a gfs2-specific
copy of write_cache_pages(), we should preserve the structure and logic
of write_cache_pages() so upstream changes can be more easily ported
without all the hokey "ret = 1...if (ret > 0) ret = 0" kludge and
obfuscated logic.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/aops.c | 185 ++++++++++++++++++++++---------------------------
 1 file changed, 84 insertions(+), 101 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 68cd700a2719..d6f471cbd62b 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -222,102 +222,6 @@ static int gfs2_writepages(struct address_space *mapping,
 	return ret;
 }
 
-/**
- * gfs2_write_jdata_pagevec - Write back a pagevec's worth of pages
- * @mapping: The mapping
- * @wbc: The writeback control
- * @pvec: The vector of pages
- * @nr_pages: The number of pages to write
- * @done_index: Page index
- *
- * Returns: non-zero if loop should terminate, zero otherwise
- */
-
-static int gfs2_write_jdata_pagevec(struct address_space *mapping,
-				    struct writeback_control *wbc,
-				    struct pagevec *pvec,
-				    int nr_pages,
-				    pgoff_t *done_index)
-{
-	struct inode *inode = mapping->host;
-	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	unsigned nrblocks = nr_pages * (PAGE_SIZE >> inode->i_blkbits);
-	int i;
-	int ret;
-
-	ret = gfs2_trans_begin(sdp, nrblocks, nrblocks);
-	if (ret < 0)
-		return ret;
-
-	for(i = 0; i < nr_pages; i++) {
-		struct page *page = pvec->pages[i];
-
-		*done_index = page->index;
-
-		lock_page(page);
-
-		if (unlikely(page->mapping != mapping)) {
-continue_unlock:
-			unlock_page(page);
-			continue;
-		}
-
-		if (!PageDirty(page)) {
-			/* someone wrote it for us */
-			goto continue_unlock;
-		}
-
-		if (PageWriteback(page)) {
-			if (wbc->sync_mode != WB_SYNC_NONE)
-				wait_on_page_writeback(page);
-			else
-				goto continue_unlock;
-		}
-
-		BUG_ON(PageWriteback(page));
-		if (!clear_page_dirty_for_io(page))
-			goto continue_unlock;
-
-		trace_wbc_writepage(wbc, inode_to_bdi(inode));
-
-		ret = __gfs2_jdata_writepage(page, wbc);
-		if (unlikely(ret)) {
-			if (ret == AOP_WRITEPAGE_ACTIVATE) {
-				unlock_page(page);
-				ret = 0;
-			} else {
-
-				/*
-				 * done_index is set past this page,
-				 * so media errors will not choke
-				 * background writeout for the entire
-				 * file. This has consequences for
-				 * range_cyclic semantics (ie. it may
-				 * not be suitable for data integrity
-				 * writeout).
-				 */
-				*done_index = page->index + 1;
-				ret = 1;
-				break;
-			}
-		}
-
-		/*
-		 * We stop writing back only if we are not doing
-		 * integrity sync. In case of integrity sync we have to
-		 * keep going until we have written all the pages
-		 * we tagged for writeback prior to entering this loop.
-		 */
-		if (--wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE) {
-			ret = 1;
-			break;
-		}
-
-	}
-	gfs2_trans_end(sdp);
-	return ret;
-}
-
 /**
  * gfs2_write_cache_jdata - Like write_cache_pages but different
  * @mapping: The mapping to write
@@ -331,6 +235,8 @@ static int gfs2_write_jdata_pagevec(struct address_space *mapping,
 static int gfs2_write_cache_jdata(struct address_space *mapping,
 				  struct writeback_control *wbc)
 {
+	struct inode *inode = mapping->host;
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	int ret = 0;
 	int done = 0;
 	struct pagevec pvec;
@@ -369,16 +275,93 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
 		tag_pages_for_writeback(mapping, index, end);
 	done_index = index;
 	while (!done && (index <= end)) {
+		int i;
+		unsigned nrblocks;
+
 		nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
 				tag);
 		if (nr_pages == 0)
 			break;
 
-		ret = gfs2_write_jdata_pagevec(mapping, wbc, &pvec, nr_pages, &done_index);
-		if (ret)
-			done = 1;
-		if (ret > 0)
-			ret = 0;
+		nrblocks = nr_pages * (PAGE_SIZE >> inode->i_blkbits);
+		ret = gfs2_trans_begin(sdp, nrblocks, nrblocks);
+		if (ret < 0)
+			break;
+
+		for (i = 0; i < nr_pages; i++) {
+			struct page *page = pvec.pages[i];
+
+			done_index = page->index;
+
+			lock_page(page);
+
+			/*
+			 * Page truncated or invalidated. We can freely skip it
+			 * then, even for data integrity operations: the page
+			 * has disappeared concurrently, so there could be no
+			 * real expectation of this data interity operation
+			 * even if there is now a new, dirty page at the same
+			 * pagecache address.
+			 */
+			if (unlikely(page->mapping != mapping)) {
+continue_unlock:
+				unlock_page(page);
+				continue;
+			}
+
+			if (!PageDirty(page)) {
+				/* someone wrote it for us */
+				goto continue_unlock;
+			}
+
+			if (PageWriteback(page)) {
+				if (wbc->sync_mode != WB_SYNC_NONE)
+					wait_on_page_writeback(page);
+				else
+					goto continue_unlock;
+			}
+
+			BUG_ON(PageWriteback(page));
+			if (!clear_page_dirty_for_io(page))
+				goto continue_unlock;
+
+			trace_wbc_writepage(wbc, inode_to_bdi(inode));
+			ret = __gfs2_jdata_writepage(page, wbc);
+			if (unlikely(ret)) {
+				if (ret == AOP_WRITEPAGE_ACTIVATE) {
+					unlock_page(page);
+					ret = 0;
+				} else {
+					/*
+					 * done_index is set past this page,
+					 * so media errors will not choke
+					 * background writeout for the entire
+					 * file. This has consequences for
+					 * range_cyclic semantics (ie. it may
+					 * not be suitable for data integrity
+					 * writeout).
+					 */
+					done_index = page->index + 1;
+					done = 1;
+					break;
+				}
+			}
+
+			/*
+			 * We stop writing back only if we are not doing
+			 * integrity sync. In case of integrity sync we have to
+			 * keep going until we have written all the pages
+			 * we tagged for writeback prior to entering this loop.
+			 */
+			if (--wbc->nr_to_write <= 0 &&
+			    wbc->sync_mode == WB_SYNC_NONE) {
+				done = 1;
+				break;
+			}
+
+		}
+		gfs2_trans_end(sdp);
+
 		pagevec_release(&pvec);
 		cond_resched();
 	}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 02/11] gfs2: don't break integrity writeback on __gfs2_jdata_writepage error
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 01/11] gfs2: inline gfs2_write_jdata_pagevec into gfs2_write_cache_jdata Bob Peterson
@ 2020-07-24 18:32 ` Bob Peterson
  2020-08-03 17:54   ` Andreas Gruenbacher
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 03/11] gfs2: Fix inaccurate comment Bob Peterson
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Port Brian Foster's patch 3fa750dcf29e8606e3969d13d8e188cc1c0f511d to
gfs2's sister function gfs2_write_cache_jdata. Brian's original description:

write_cache_pages() is used in both background and integrity writeback
scenarios by various filesystems.  Background writeback is mostly
concerned with cleaning a certain number of dirty pages based on various
mm heuristics.  It may not write the full set of dirty pages or wait for
I/O to complete.  Integrity writeback is responsible for persisting a set
of dirty pages before the writeback job completes.  For example, an
fsync() call must perform integrity writeback to ensure data is on disk
before the call returns.

write_cache_pages() unconditionally breaks out of its processing loop in
the event of a ->writepage() error.  This is fine for background
writeback, which had no strict requirements and will eventually come
around again.  This can cause problems for integrity writeback on
filesystems that might need to clean up state associated with failed page
writeouts.  For example, XFS performs internal delayed allocation
accounting before returning a ->writepage() error, where applicable.  If
the current writeback happens to be associated with an unmount and
write_cache_pages() completes the writeback prematurely due to error, the
filesystem is unmounted in an inconsistent state if dirty+delalloc pages
still exist.

To handle this problem, update write_cache_pages() to always process the
full set of pages for integrity writeback regardless of ->writepage()
errors.  Save the first encountered error and return it to the caller once
complete.  This facilitates XFS (or any other fs that expects integrity
writeback to process the entire set of dirty pages) to clean up its
internal state completely in the event of persistent mapping errors.
Background writeback continues to exit on the first error encountered.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/aops.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index d6f471cbd62b..ff2ec416f106 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -239,6 +239,7 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	int ret = 0;
 	int done = 0;
+	int error;
 	struct pagevec pvec;
 	int nr_pages;
 	pgoff_t uninitialized_var(writeback_index);
@@ -326,25 +327,31 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
 				goto continue_unlock;

 			trace_wbc_writepage(wbc, inode_to_bdi(inode));
-			ret = __gfs2_jdata_writepage(page, wbc);
-			if (unlikely(ret)) {
-				if (ret == AOP_WRITEPAGE_ACTIVATE) {
+			error = __gfs2_jdata_writepage(page, wbc);
+			if (unlikely(error)) {
+				/*
+				 * Handle errors according to the type of
+				 * writeback. There's no need to continue for
+				 * background writeback. Just push done_index
+				 * past this page so media errors won't choke
+				 * writeout for the entire file. For integrity
+				 * writeback, we must process the entire dirty
+				 * set regardless of errors because the fs may
+				 * still have state to clear for each page. In
+				 * that case we continue processing and return
+				 * the first error.
+				 */
+				if (error == AOP_WRITEPAGE_ACTIVATE) {
 					unlock_page(page);
-					ret = 0;
-				} else {
-					/*
-					 * done_index is set past this page,
-					 * so media errors will not choke
-					 * background writeout for the entire
-					 * file. This has consequences for
-					 * range_cyclic semantics (ie. it may
-					 * not be suitable for data integrity
-					 * writeout).
-					 */
+					error = 0;
+				} else if (wbc->sync_mode != WB_SYNC_ALL) {
+					ret = error;
 					done_index = page->index + 1;
 					done = 1;
 					break;
 				}
+				if (!ret)
+					ret = error;
 			}

 			/*
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 03/11] gfs2: Fix inaccurate comment
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 01/11] gfs2: inline gfs2_write_jdata_pagevec into gfs2_write_cache_jdata Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 02/11] gfs2: don't break integrity writeback on __gfs2_jdata_writepage error Bob Peterson
@ 2020-07-24 18:32 ` Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 04/11] gfs2: don't try to add buffers to transactions a second time for jdata Bob Peterson
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The comment regarding journal flush thresholds is wrong. This patch fixes it.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/log.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index a76e55bc28eb..a58333e3980d 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1092,7 +1092,7 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
  * or the total number of used blocks (pinned blocks plus AIL blocks)
  * is greater than thresh2.
  *
- * At mount time thresh1 is 1/3rd of journal size, thresh2 is 2/3rd of
+ * At mount time thresh1 is 2/5ths of journal size, thresh2 is 4/5ths of
  * journal size.
  *
  * Returns: errno
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 04/11] gfs2: don't try to add buffers to transactions a second time for jdata
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (2 preceding siblings ...)
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 03/11] gfs2: Fix inaccurate comment Bob Peterson
@ 2020-07-24 18:32 ` Bob Peterson
  2020-08-03 17:52   ` Andreas Gruenbacher
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 05/11] gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe Bob Peterson
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

When jdata buffers are written by user-space, the iomap code adds the
buffers to the appropriate transaction's databufs queue. Then they get
pinned, and unpinned and make their way to the ail1 list. But then
gfs2_ail1_start gets called, resulting in this calling sequence:

gfs2_ail1_start(sdp);
   gfs2_ail1_flush(sdp, &wbc); wbc has for_sync=1
      list_for_each_entry_reverse(tr, head, tr_list) {
         ret = gfs2_ail1_start_one(sdp, wbc, tr);
            ret = generic_writepages(mapping, wbc);
               write_cache_pages
                  __writepage
                     gfs2_jdata_writepage
                        __gfs2_jdata_writepage

At this point, __gfs2_jdata_writepage is trying to write the in-place
jdata buffers, but ends up calling gfs2_page_add_databufs which calls
gfs2_trans_add_data which, of course, requires a transaction. But the
gfs2_ail1_flush calling sequence is for flushing transactions, so it
does not have a transaction. In fact, it should just be writing the
jdata buffers in-place and shouldn't need a transaction.

This got us into strange situations where it thought it needed a
transaction when it didn't, and infinite loops because it thought
the buffer was PageChecked (i.e. re-dirtied by a competing write op)
so it just kept waiting for it to not be PageChecked each time.

To fix this, this patch introduces a new buffer_head flag "needs_trans"
to indicate when a buffer has been dirtied by a real write as opposed
to just an in-place write for the ail list.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/aops.c   | 37 +++++++++++++++++++++++--------------
 fs/gfs2/bmap.c   |  1 +
 fs/gfs2/incore.h |  3 +++
 fs/gfs2/lops.c   |  1 +
 fs/gfs2/trans.c  |  2 ++
 5 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index ff2ec416f106..a9af28abe468 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -158,14 +158,29 @@ static int __gfs2_jdata_writepage(struct page *page, struct writeback_control *w
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 
-	if (PageChecked(page)) {
-		ClearPageChecked(page);
-		if (!page_has_buffers(page)) {
-			create_empty_buffers(page, inode->i_sb->s_blocksize,
-					     BIT(BH_Dirty)|BIT(BH_Uptodate));
+	if (!PageChecked(page))
+		goto write;
+
+	ClearPageChecked(page);
+	if (page_has_buffers(page)) {
+		struct buffer_head *bh, *head;
+
+		bh = head = page_buffers(page);
+		do {
+			if (buffer_needstrans(bh))
+				break;
+			bh = bh->b_this_page;
+		} while (bh != head);
+		if (buffer_needstrans(bh)) {
+			BUG_ON(!current->journal_info);
+			gfs2_page_add_databufs(ip, page, 0,
+					       sdp->sd_vfs->s_blocksize);
 		}
-		gfs2_page_add_databufs(ip, page, 0, sdp->sd_vfs->s_blocksize);
+	} else {
+		create_empty_buffers(page, inode->i_sb->s_blocksize,
+				     BIT(BH_Dirty)|BIT(BH_Uptodate));
 	}
+write:
 	return gfs2_write_full_page(page, gfs2_get_block_noalloc, wbc);
 }
 
@@ -184,15 +199,9 @@ static int gfs2_jdata_writepage(struct page *page, struct writeback_control *wbc
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 
-	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
-		goto out;
-	if (PageChecked(page) || current->journal_info)
-		goto out_ignore;
-	return __gfs2_jdata_writepage(page, wbc);
+	if (!gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
+		return __gfs2_jdata_writepage(page, wbc);
 
-out_ignore:
-	redirty_page_for_writepage(wbc, page);
-out:
 	unlock_page(page);
 	return 0;
 }
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 6306eaae378b..85cf903011d7 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -94,6 +94,7 @@ static int gfs2_unstuffer_page(struct gfs2_inode *ip, struct buffer_head *dibh,
 		gfs2_trans_add_data(ip->i_gl, bh);
 	else {
 		mark_buffer_dirty(bh);
+		set_buffer_needstrans(bh);
 		gfs2_ordered_add_inode(ip);
 	}
 
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index ca2ec02436ec..f510126efc1c 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -149,12 +149,15 @@ static inline bool gfs2_rbm_eq(const struct gfs2_rbm *rbm1,
 enum gfs2_state_bits {
 	BH_Pinned = BH_PrivateStart,
 	BH_Escaped = BH_PrivateStart + 1,
+	BH_Needstrans = BH_PrivateStart + 2,
 };
 
 BUFFER_FNS(Pinned, pinned)
 TAS_BUFFER_FNS(Pinned, pinned)
 BUFFER_FNS(Escaped, escaped)
 TAS_BUFFER_FNS(Escaped, escaped)
+BUFFER_FNS(Needstrans, needstrans)
+TAS_BUFFER_FNS(Needstrans, needstrans)
 
 struct gfs2_bufdata {
 	struct buffer_head *bd_bh;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index cb2a11b458c6..6f74fc43e9c2 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -104,6 +104,7 @@ static void gfs2_unpin(struct gfs2_sbd *sdp, struct buffer_head *bh,
 	BUG_ON(!buffer_pinned(bh));
 
 	lock_buffer(bh);
+	clear_buffer_needstrans(bh);
 	mark_buffer_dirty(bh);
 	clear_buffer_pinned(bh);
 
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index a3dfa3aa87ad..897d08e2a6b0 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -130,6 +130,7 @@ static struct gfs2_bufdata *gfs2_alloc_bufdata(struct gfs2_glock *gl,
 	bd->bd_bh = bh;
 	bd->bd_gl = gl;
 	INIT_LIST_HEAD(&bd->bd_list);
+	INIT_LIST_HEAD(&bd->bd_ail_st_list);
 	bh->b_private = bd;
 	return bd;
 }
@@ -155,6 +156,7 @@ void gfs2_trans_add_data(struct gfs2_glock *gl, struct buffer_head *bh)
 	struct gfs2_bufdata *bd;
 
 	lock_buffer(bh);
+	clear_buffer_needstrans(bh);
 	if (buffer_pinned(bh)) {
 		set_bit(TR_TOUCHED, &tr->tr_flags);
 		goto out;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 05/11] gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (3 preceding siblings ...)
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 04/11] gfs2: don't try to add buffers to transactions a second time for jdata Bob Peterson
@ 2020-07-24 18:32 ` Bob Peterson
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 06/11] gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove parm Bob Peterson
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Before this patch, when blocks were freed, it called gfs2_meta_wipe to
take the metadata out of the pending journal blocks. It did this mostly
by calling another function called gfs2_remove_from_journal. This is
shortsighted because it does not do anything with jdata blocks which
may also be in the journal.

This patch expands the function so that it wipes out jdata blocks from
the journal as well, and it wipes it from the ail1 list if it hasn't
yet been written back yet. Since it now processes jdata blocks as well,
the function has been renamed from gfs2_meta_wipe to gfs2_journal_wipe.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/aops.c    |  2 +-
 fs/gfs2/meta_io.c | 82 +++++++++++++++++++++++++++++++++++++++++++----
 fs/gfs2/meta_io.h |  5 +--
 fs/gfs2/rgrp.c    |  6 ++--
 4 files changed, 83 insertions(+), 12 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index a9af28abe468..aedc9fbf6ac7 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -663,7 +663,7 @@ static void gfs2_discard(struct gfs2_sbd *sdp, struct buffer_head *bh)
 		if (!list_empty(&bd->bd_list) && !buffer_pinned(bh))
 			list_del_init(&bd->bd_list);
 		else
-			gfs2_remove_from_journal(bh, REMOVE_JDATA);
+			gfs2_remove_from_journal(bh, REMOVE_JDATA, 1);
 	}
 	bh->b_bdev = NULL;
 	clear_buffer_mapped(bh);
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 9856cc2e0795..ac0a7537b8fb 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -327,7 +327,7 @@ int gfs2_meta_wait(struct gfs2_sbd *sdp, struct buffer_head *bh)
 	return 0;
 }
 
-void gfs2_remove_from_journal(struct buffer_head *bh, int meta)
+void gfs2_remove_from_journal(struct buffer_head *bh, int meta, int lock_ail)
 {
 	struct address_space *mapping = bh->b_page->mapping;
 	struct gfs2_sbd *sdp = gfs2_mapping2sbd(mapping);
@@ -348,38 +348,108 @@ void gfs2_remove_from_journal(struct buffer_head *bh, int meta)
 		brelse(bh);
 	}
 	if (bd) {
-		spin_lock(&sdp->sd_ail_lock);
+		if (lock_ail)
+			spin_lock(&sdp->sd_ail_lock);
 		if (bd->bd_tr) {
 			gfs2_trans_add_revoke(sdp, bd);
 		} else if (was_pinned) {
 			bh->b_private = NULL;
 			kmem_cache_free(gfs2_bufdata_cachep, bd);
 		}
-		spin_unlock(&sdp->sd_ail_lock);
+		if (lock_ail)
+			spin_unlock(&sdp->sd_ail_lock);
 	}
 	clear_buffer_dirty(bh);
 	clear_buffer_uptodate(bh);
 }
 
 /**
- * gfs2_meta_wipe - make inode's buffers so they aren't dirty/pinned anymore
+ * gfs2_ail1_wipe - remove deleted/freed buffers from the ail1 list
+ * @sdp: superblock
+ * @bstart: starting block address of buffers to remove
+ * @blen: length of buffers to be removed
+ *
+ * This function is called from gfs2_journal wipe, whose job is to remove
+ * buffers, corresponding to deleted blocks, from the journal. If we find any
+ * bufdata elements on the system ail1 list, they haven't been written to
+ * the journal yet. So we remove them.
+ */
+static void gfs2_ail1_wipe(struct gfs2_sbd *sdp, u64 bstart, u32 blen)
+{
+	struct gfs2_trans *tr, *s;
+	struct gfs2_bufdata *bd, *bs;
+	struct buffer_head *bh;
+	u64 end = bstart + blen;
+
+	gfs2_log_lock(sdp);
+	spin_lock(&sdp->sd_ail_lock);
+	list_for_each_entry_safe(tr, s, &sdp->sd_ail1_list, tr_list) {
+		list_for_each_entry_safe(bd, bs, &tr->tr_ail1_list,
+					 bd_ail_st_list) {
+			bh = bd->bd_bh;
+			if (bh->b_blocknr < bstart || bh->b_blocknr >= end)
+				continue;
+
+			gfs2_remove_from_journal(bh, REMOVE_JDATA, 0);
+		}
+	}
+	spin_unlock(&sdp->sd_ail_lock);
+	gfs2_log_unlock(sdp);
+}
+
+struct buffer_head *gfs2_getjdatabuf(struct gfs2_inode *ip, u64 blkno)
+{
+	struct address_space *mapping = ip->i_inode.i_mapping;
+	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
+	struct page *page;
+	struct buffer_head *bh;
+	unsigned int shift = PAGE_SHIFT - sdp->sd_sb.sb_bsize_shift;
+	unsigned long index = blkno >> shift; /* convert block to page */
+	unsigned int bufnum = blkno - (index << shift);
+
+	page = find_get_page_flags(mapping, index, FGP_LOCK|FGP_ACCESSED);
+	if (!page)
+		return NULL;
+	if (!page_has_buffers(page)) {
+		unlock_page(page);
+		put_page(page);
+		return NULL;
+	}
+	/* Locate header for our buffer within our page */
+	for (bh = page_buffers(page); bufnum--; bh = bh->b_this_page)
+		/* Do nothing */;
+	get_bh(bh);
+	unlock_page(page);
+	put_page(page);
+	return bh;
+}
+
+/**
+ * gfs2_journal_wipe - make inode's buffers so they aren't dirty/pinned anymore
  * @ip: the inode who owns the buffers
  * @bstart: the first buffer in the run
  * @blen: the number of buffers in the run
  *
  */
 
-void gfs2_meta_wipe(struct gfs2_inode *ip, u64 bstart, u32 blen)
+void gfs2_journal_wipe(struct gfs2_inode *ip, u64 bstart, u32 blen)
 {
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct buffer_head *bh;
+	int ty;
 
+	gfs2_ail1_wipe(sdp, bstart, blen);
 	while (blen) {
+		ty = REMOVE_META;
 		bh = gfs2_getbuf(ip->i_gl, bstart, NO_CREATE);
+		if (!bh && gfs2_is_jdata(ip)) {
+			bh = gfs2_getjdatabuf(ip, bstart);
+			ty = REMOVE_JDATA;
+		}
 		if (bh) {
 			lock_buffer(bh);
 			gfs2_log_lock(sdp);
-			gfs2_remove_from_journal(bh, REMOVE_META);
+			gfs2_remove_from_journal(bh, ty, 1);
 			gfs2_log_unlock(sdp);
 			unlock_buffer(bh);
 			brelse(bh);
diff --git a/fs/gfs2/meta_io.h b/fs/gfs2/meta_io.h
index eafb74e861c6..4d0577c27be1 100644
--- a/fs/gfs2/meta_io.h
+++ b/fs/gfs2/meta_io.h
@@ -59,8 +59,9 @@ enum {
 	REMOVE_META = 1,
 };
 
-extern void gfs2_remove_from_journal(struct buffer_head *bh, int meta);
-extern void gfs2_meta_wipe(struct gfs2_inode *ip, u64 bstart, u32 blen);
+extern void gfs2_remove_from_journal(struct buffer_head *bh, int meta,
+				     int lock_ail);
+extern void gfs2_journal_wipe(struct gfs2_inode *ip, u64 bstart, u32 blen);
 extern int gfs2_meta_indirect_buffer(struct gfs2_inode *ip, int height, u64 num,
 				     struct buffer_head **bhp);
 
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 074f228ea839..bade48f2522f 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -2445,8 +2445,8 @@ void __gfs2_free_blocks(struct gfs2_inode *ip, struct gfs2_rgrpd *rgd,
 	gfs2_rgrp_out(rgd, rgd->rd_bits[0].bi_bh->b_data);
 
 	/* Directories keep their data in the metadata address space */
-	if (meta || ip->i_depth)
-		gfs2_meta_wipe(ip, bstart, blen);
+	if (meta || ip->i_depth || gfs2_is_jdata(ip))
+		gfs2_journal_wipe(ip, bstart, blen);
 }
 
 /**
@@ -2502,7 +2502,7 @@ void gfs2_free_di(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip)
 	gfs2_statfs_change(sdp, 0, +1, -1);
 	trace_gfs2_block_alloc(ip, rgd, ip->i_no_addr, 1, GFS2_BLKST_FREE);
 	gfs2_quota_change(ip, -1, ip->i_inode.i_uid, ip->i_inode.i_gid);
-	gfs2_meta_wipe(ip, ip->i_no_addr, 1);
+	gfs2_journal_wipe(ip, ip->i_no_addr, 1);
 }
 
 /**
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 06/11] gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove parm
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (4 preceding siblings ...)
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 05/11] gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe Bob Peterson
@ 2020-07-24 18:32 ` Bob Peterson
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 07/11] gfs2: Add a new jdata-specific version of gfs2_get_block_noalloc Bob Peterson
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Since the function is only used for writing jdata pages, this patch
simply renames function gfs2_write_full_page to a more appropriate
name: gfs2_write_jdata_page. This makes the code easier to understand.

The function was only called in one place, which passed in a pointer to
function gfs2_get_block_noalloc. The function doesn't need to be
passed in. Therefore, this also eliminates the unnecessary parameter
to increase efficiency.

I also took the liberty of cleaning up the function comments.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/aops.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index aedc9fbf6ac7..e548b54d4139 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -115,11 +115,16 @@ static int gfs2_writepage(struct page *page, struct writeback_control *wbc)
 	return 0;
 }
 
-/* This is the same as calling block_write_full_page, but it also
+/**
+ * gfs2_write_jdata_page - gfs2 jdata-specific version of block_write_full_page
+ * @page: The page to write
+ * @wbc: The writeback control
+ *
+ * This is the same as calling block_write_full_page, but it also
  * writes pages outside of i_size
  */
-static int gfs2_write_full_page(struct page *page, get_block_t *get_block,
-				struct writeback_control *wbc)
+static int gfs2_write_jdata_page(struct page *page,
+				 struct writeback_control *wbc)
 {
 	struct inode * const inode = page->mapping->host;
 	loff_t i_size = i_size_read(inode);
@@ -137,7 +142,7 @@ static int gfs2_write_full_page(struct page *page, get_block_t *get_block,
 	if (page->index == end_index && offset)
 		zero_user_segment(page, offset, PAGE_SIZE);
 
-	return __block_write_full_page(inode, page, get_block, wbc,
+	return __block_write_full_page(inode, page, gfs2_get_block_noalloc, wbc,
 				       end_buffer_async_write);
 }
 
@@ -181,7 +186,7 @@ static int __gfs2_jdata_writepage(struct page *page, struct writeback_control *w
 				     BIT(BH_Dirty)|BIT(BH_Uptodate));
 	}
 write:
-	return gfs2_write_full_page(page, gfs2_get_block_noalloc, wbc);
+	return gfs2_write_jdata_page(page, wbc);
 }
 
 /**
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 07/11] gfs2: Add a new jdata-specific version of gfs2_get_block_noalloc
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (5 preceding siblings ...)
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 06/11] gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove parm Bob Peterson
@ 2020-07-24 18:33 ` Bob Peterson
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 08/11] gfs2: Add caller info to log_blocks trace point Bob Peterson
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch creates a new function, gfs2_get_jblk_noalloc, similar to
gfs2_get_block_noalloc, which doesn't complain about unmapped blocks.

This is due to situations in which jdata pages are queued to the ail1
list but haven't been written back, but afterward, they're deleted
by a punch_hole or truncate operation. This resulted in a calling
sequence that looks like this:

gfs2_ail1_start
   gfs2_ail1_flush - for every tr on the sd_ail1_list:
      gfs2_ail1_start_one - for every bd on the tr's tr_ail1_list:
         generic_writepages
            write_cache_pages passing __writepage()
               __writepage() calls:
                  mapping->a_ops->writepage AKA gfs2_jdata_writepage:
                     __gfs2_jdata_writepage calls gfs2_write_jdata_page
                        gfs2_get_jblk_noalloc (was gfs2_get_block_noalloc)
                           gfs2_block_map - finds unmapped jdata page

If we return -EIO like gfs2_get_block_noalloc, we error out and withdraw
from gfs2_ail1_start_one. With this jdata-specific version, we simply
assume the error is due to the page being truncated and ignore the error,
which allows the ail1 write operation to finish successfully, writing out
the jdata pages it can.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/aops.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index e548b54d4139..c94521de838a 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -115,6 +115,27 @@ static int gfs2_writepage(struct page *page, struct writeback_control *wbc)
 	return 0;
 }
 
+/**
+ * gfs2_get_jblk_noalloc - Fills in a buffer head with details about a block
+ * @inode: The inode
+ * @lblock: The block number to look up
+ * @bh_result: The buffer head to return the result in
+ * @create: Non-zero if we may add block to the file
+ *
+ * This is a jdata-specific version of gfs2_get_block_noalloc. The difference
+ * is: This is called when the ail1 list is being written for jdata pages,
+ * which means we don't want to throw a -EIO error when they are not mapped,
+ * which can happen when jdata pages are deleted by punch_hold operations.
+ *
+ * Returns: errno
+ */
+
+static int gfs2_get_jblk_noalloc(struct inode *inode, sector_t lblock,
+				 struct buffer_head *bh_result, int create)
+{
+	return gfs2_block_map(inode, lblock, bh_result, 0);
+}
+
 /**
  * gfs2_write_jdata_page - gfs2 jdata-specific version of block_write_full_page
  * @page: The page to write
@@ -142,7 +163,7 @@ static int gfs2_write_jdata_page(struct page *page,
 	if (page->index == end_index && offset)
 		zero_user_segment(page, offset, PAGE_SIZE);
 
-	return __block_write_full_page(inode, page, gfs2_get_block_noalloc, wbc,
+	return __block_write_full_page(inode, page, gfs2_get_jblk_noalloc, wbc,
 				       end_buffer_async_write);
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 08/11] gfs2: Add caller info to log_blocks trace point
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (6 preceding siblings ...)
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 07/11] gfs2: Add a new jdata-specific version of gfs2_get_block_noalloc Bob Peterson
@ 2020-07-24 18:33 ` Bob Peterson
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 09/11] gfs2: enhance log_blocks trace point to show log blocks free Bob Peterson
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The log_blocks kernel trace point becomes much more useful if we know
from whence the call was made. This patch adds caller info to the trace
point.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/log.c        | 12 ++++++------
 fs/gfs2/ops_fstype.c |  2 +-
 fs/gfs2/trace_gfs2.h | 20 ++++++++++++++++----
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index a58333e3980d..665e287bf4f1 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -396,7 +396,7 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 {
 
 	atomic_add(blks, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, blks);
+	trace_gfs2_log_blocks(sdp, blks, 0);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 				  sdp->sd_jdesc->jd_blocks);
 	up_read(&sdp->sd_log_flush_lock);
@@ -457,7 +457,7 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 		goto retry;
 	}
 	atomic_sub(blks, &sdp->sd_log_blks_needed);
-	trace_gfs2_log_blocks(sdp, -blks);
+	trace_gfs2_log_blocks(sdp, -blks, 1);
 
 	/*
 	 * If we waited, then so might others, wake them up _after_ we get
@@ -576,7 +576,7 @@ static void log_pull_tail(struct gfs2_sbd *sdp, unsigned int new_tail)
 	ail2_empty(sdp, new_tail);
 
 	atomic_add(dist, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, dist);
+	trace_gfs2_log_blocks(sdp, dist, 2);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 			     sdp->sd_jdesc->jd_blocks);
 
@@ -966,7 +966,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 		log_write_header(sdp, flags);
 	} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle){
 		atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
-		trace_gfs2_log_blocks(sdp, -1);
+		trace_gfs2_log_blocks(sdp, -1, 3);
 		log_write_header(sdp, flags);
 	}
 	if (gfs2_withdrawn(sdp))
@@ -992,7 +992,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 			if (gfs2_withdrawn(sdp))
 				goto out;
 			atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
-			trace_gfs2_log_blocks(sdp, -1);
+			trace_gfs2_log_blocks(sdp, -1, 4);
 			log_write_header(sdp, flags);
 			sdp->sd_log_head = sdp->sd_log_flush_head;
 		}
@@ -1075,7 +1075,7 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	gfs2_assert_withdraw(sdp, maxres >= reserved);
 	unused = maxres - reserved;
 	atomic_add(unused, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, unused);
+	trace_gfs2_log_blocks(sdp, unused, 5);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 			     sdp->sd_jdesc->jd_blocks);
 	sdp->sd_log_blks_reserved = reserved;
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 6d18d2c91add..b2538eea3e5e 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -688,7 +688,7 @@ static int init_journal(struct gfs2_sbd *sdp, int undo)
 		/* Map the extents for this journal's blocks */
 		gfs2_map_journal_extents(sdp, sdp->sd_jdesc);
 	}
-	trace_gfs2_log_blocks(sdp, atomic_read(&sdp->sd_log_blks_free));
+	trace_gfs2_log_blocks(sdp, atomic_read(&sdp->sd_log_blks_free), 6);
 
 	if (sdp->sd_lockstruct.ls_first) {
 		unsigned int x;
diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index e0025258107a..1c4aa666e7e0 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -45,6 +45,15 @@
 					 { 2, "ins " },	\
 					 { 3, "clm " })
 
+#define lbcaller(x) __print_symbolic(x, \
+	{ 0, "gfs2_log_release" }, \
+	{ 1, "gfs2_log_reserve" }, \
+	{ 2, "log_pull_tail" },	   \
+	{ 3, "gfs2_log_flush" }, \
+	{ 4, "gfs2_log_flush2" }, \
+	{ 5, "log_refund" }, \
+	{ 6, "init_journal" })
+
 #define show_glock_flags(flags) __print_flags(flags, "",	\
 	{(1UL << GLF_LOCK),			"l" },		\
 	{(1UL << GLF_DEMOTE),			"D" },		\
@@ -381,22 +390,25 @@ TRACE_EVENT(gfs2_log_flush,
 /* Reserving/releasing blocks in the log */
 TRACE_EVENT(gfs2_log_blocks,
 
-	TP_PROTO(const struct gfs2_sbd *sdp, int blocks),
+	TP_PROTO(const struct gfs2_sbd *sdp, int blocks, int caller),
 
-	TP_ARGS(sdp, blocks),
+	TP_ARGS(sdp, blocks, caller),
 
 	TP_STRUCT__entry(
 		__field(        dev_t,  dev                     )
 		__field(	int,	blocks			)
+		__field(	int,	caller			)
 	),
 
 	TP_fast_assign(
 		__entry->dev		= sdp->sd_vfs->s_dev;
 		__entry->blocks		= blocks;
+		__entry->caller		= caller;
 	),
 
-	TP_printk("%u,%u log reserve %d", MAJOR(__entry->dev),
-		  MINOR(__entry->dev), __entry->blocks)
+	TP_printk("%u,%u log reserve %d %s", MAJOR(__entry->dev),
+		  MINOR(__entry->dev), __entry->blocks,
+		  lbcaller(__entry->caller))
 );
 
 /* Writing back the AIL */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 09/11] gfs2: enhance log_blocks trace point to show log blocks free
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (7 preceding siblings ...)
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 08/11] gfs2: Add caller info to log_blocks trace point Bob Peterson
@ 2020-07-24 18:33 ` Bob Peterson
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 10/11] gfs2: print details on transactions that aren't properly ended Bob Peterson
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 11/11] gfs2: Never call gfs2_block_zero_range with an open transaction Bob Peterson
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch adds some code to enhance the log_blocks trace point. It
reports the number of free log blocks, and also adds new trace points
for revoke writes that were previously missing.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/log.c        | 12 ++++++++++--
 fs/gfs2/trace_gfs2.h |  9 ++++++---
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 665e287bf4f1..489b6f5c605d 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -716,16 +716,24 @@ void gfs2_write_revokes(struct gfs2_sbd *sdp)
 		atomic_dec(&sdp->sd_log_blks_free);
 		/* If no blocks have been reserved, we need to also
 		 * reserve a block for the header */
-		if (!sdp->sd_log_blks_reserved)
+		if (!sdp->sd_log_blks_reserved) {
 			atomic_dec(&sdp->sd_log_blks_free);
+			trace_gfs2_log_blocks(sdp, -2, 7);
+		} else {
+			trace_gfs2_log_blocks(sdp, -1, 7);
+		}
 	}
 	gfs2_ail1_empty(sdp, max_revokes);
 	gfs2_log_unlock(sdp);
 
 	if (!sdp->sd_log_num_revoke) {
 		atomic_inc(&sdp->sd_log_blks_free);
-		if (!sdp->sd_log_blks_reserved)
+		if (!sdp->sd_log_blks_reserved) {
 			atomic_inc(&sdp->sd_log_blks_free);
+			trace_gfs2_log_blocks(sdp, 2, 7);
+		} else {
+			trace_gfs2_log_blocks(sdp, 1, 7);
+		}
 	}
 }
 
diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index 1c4aa666e7e0..9d28be2a8b69 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -52,7 +52,8 @@
 	{ 3, "gfs2_log_flush" }, \
 	{ 4, "gfs2_log_flush2" }, \
 	{ 5, "log_refund" }, \
-	{ 6, "init_journal" })
+	{ 6, "init_journal" }, \
+				     { 7, "gfs2_write_revokes" })
 
 #define show_glock_flags(flags) __print_flags(flags, "",	\
 	{(1UL << GLF_LOCK),			"l" },		\
@@ -398,17 +399,19 @@ TRACE_EVENT(gfs2_log_blocks,
 		__field(        dev_t,  dev                     )
 		__field(	int,	blocks			)
 		__field(	int,	caller			)
+		__field(	int,	blks_free		)
 	),
 
 	TP_fast_assign(
 		__entry->dev		= sdp->sd_vfs->s_dev;
 		__entry->blocks		= blocks;
 		__entry->caller		= caller;
+		__entry->blks_free	= atomic_read(&sdp->sd_log_blks_free);
 	),
 
-	TP_printk("%u,%u log reserve %d %s", MAJOR(__entry->dev),
+	TP_printk("%u,%u log reserve %d %s %d", MAJOR(__entry->dev),
 		  MINOR(__entry->dev), __entry->blocks,
-		  lbcaller(__entry->caller))
+		  lbcaller(__entry->caller), __entry->blks_free)
 );
 
 /* Writing back the AIL */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 10/11] gfs2: print details on transactions that aren't properly ended
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (8 preceding siblings ...)
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 09/11] gfs2: enhance log_blocks trace point to show log blocks free Bob Peterson
@ 2020-07-24 18:33 ` Bob Peterson
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 11/11] gfs2: Never call gfs2_block_zero_range with an open transaction Bob Peterson
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

If function gfs2_trans_begin is called with another transaction active
it BUGs out, but it doesn't give any details about the duplicate.
This patch moves function gfs2_print_trans and calls it when this
situation arises for better debugging.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/trans.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 897d08e2a6b0..3728872976da 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -25,13 +25,28 @@
 #include "util.h"
 #include "trace_gfs2.h"
 
+static void gfs2_print_trans(struct gfs2_sbd *sdp, const struct gfs2_trans *tr)
+{
+	fs_warn(sdp, "Transaction created at: %pSR\n", (void *)tr->tr_ip);
+	fs_warn(sdp, "blocks=%u revokes=%u reserved=%u touched=%u\n",
+		tr->tr_blocks, tr->tr_revokes, tr->tr_reserved,
+		test_bit(TR_TOUCHED, &tr->tr_flags));
+	fs_warn(sdp, "Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
+		tr->tr_num_buf_new, tr->tr_num_buf_rm,
+		tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
+		tr->tr_num_revoke, tr->tr_num_revoke_rm);
+}
+
 int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 		     unsigned int revokes)
 {
 	struct gfs2_trans *tr;
 	int error;
 
-	BUG_ON(current->journal_info);
+	if (current->journal_info) {
+		gfs2_print_trans(sdp, current->journal_info);
+		BUG();
+	}
 	BUG_ON(blocks == 0 && revokes == 0);
 
 	if (!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
@@ -72,18 +87,6 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	return error;
 }
 
-static void gfs2_print_trans(struct gfs2_sbd *sdp, const struct gfs2_trans *tr)
-{
-	fs_warn(sdp, "Transaction created at: %pSR\n", (void *)tr->tr_ip);
-	fs_warn(sdp, "blocks=%u revokes=%u reserved=%u touched=%u\n",
-		tr->tr_blocks, tr->tr_revokes, tr->tr_reserved,
-		test_bit(TR_TOUCHED, &tr->tr_flags));
-	fs_warn(sdp, "Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
-		tr->tr_num_buf_new, tr->tr_num_buf_rm,
-		tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
-		tr->tr_num_revoke, tr->tr_num_revoke_rm);
-}
-
 void gfs2_trans_end(struct gfs2_sbd *sdp)
 {
 	struct gfs2_trans *tr = current->journal_info;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 11/11] gfs2: Never call gfs2_block_zero_range with an open transaction
  2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
                   ` (9 preceding siblings ...)
  2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 10/11] gfs2: print details on transactions that aren't properly ended Bob Peterson
@ 2020-07-24 18:33 ` Bob Peterson
  10 siblings, 0 replies; 14+ messages in thread
From: Bob Peterson @ 2020-07-24 18:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Before this patch, some functions started transactions then they called
gfs2_block_zero_range. However, gfs2_block_zero_range, like writes, can
start transactions, which results in a recursive transaction error.
For example:

do_shrink
   trunc_start
      gfs2_trans_begin <------------------------------------------------
         gfs2_block_zero_range
            iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops);
               iomap_apply ... iomap_zero_range_actor
                  iomap_begin
                     gfs2_iomap_begin
                        gfs2_iomap_begin_write
                  actor (iomap_zero_range_actor)
		     iomap_zero
			iomap_write_begin
			   gfs2_iomap_page_prepare
			      gfs2_trans_begin <------------------------

This patch reorders the callers of gfs2_block_zero_range so that they
only start their transactions after the call. It also adds a BUG_ON to
ensure this doesn't happen again.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/bmap.c | 69 ++++++++++++++++++++++++++++----------------------
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 85cf903011d7..251aa3bcf4e3 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1352,9 +1352,15 @@ int gfs2_extent_map(struct inode *inode, u64 lblock, int *new, u64 *dblock, unsi
 	return ret;
 }
 
+/*
+ * NOTE: Never call gfs2_block_zero_range with an open transaction because it
+ * uses iomap write to perform its actions, which begin their own transactions
+ * (iomap_begin, page_prepare, etc.)
+ */
 static int gfs2_block_zero_range(struct inode *inode, loff_t from,
 				 unsigned int length)
 {
+	BUG_ON(current->journal_info);
 	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops);
 }
 
@@ -1415,6 +1421,16 @@ static int trunc_start(struct inode *inode, u64 newsize)
 	u64 oldsize = inode->i_size;
 	int error;
 
+	if (!gfs2_is_stuffed(ip)) {
+		unsigned int blocksize = i_blocksize(inode);
+		unsigned int offs = newsize & (blocksize - 1);
+		if (offs) {
+			error = gfs2_block_zero_range(inode, newsize,
+						      blocksize - offs);
+			if (error)
+				return error;
+		}
+	}
 	if (journaled)
 		error = gfs2_trans_begin(sdp, RES_DINODE + RES_JDATA, GFS2_JTRUNC_REVOKES);
 	else
@@ -1428,19 +1444,10 @@ static int trunc_start(struct inode *inode, u64 newsize)
 
 	gfs2_trans_add_meta(ip->i_gl, dibh);
 
-	if (gfs2_is_stuffed(ip)) {
+	if (gfs2_is_stuffed(ip))
 		gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode) + newsize);
-	} else {
-		unsigned int blocksize = i_blocksize(inode);
-		unsigned int offs = newsize & (blocksize - 1);
-		if (offs) {
-			error = gfs2_block_zero_range(inode, newsize,
-						      blocksize - offs);
-			if (error)
-				goto out;
-		}
+	else
 		ip->i_diskflags |= GFS2_DIF_TRUNC_IN_PROG;
-	}
 
 	i_size_write(inode, newsize);
 	ip->i_inode.i_mtime = ip->i_inode.i_ctime = current_time(&ip->i_inode);
@@ -2449,25 +2456,7 @@ int __gfs2_punch_hole(struct file *file, loff_t offset, loff_t length)
 	loff_t start, end;
 	int error;
 
-	start = round_down(offset, blocksize);
-	end = round_up(offset + length, blocksize) - 1;
-	error = filemap_write_and_wait_range(inode->i_mapping, start, end);
-	if (error)
-		return error;
-
-	if (gfs2_is_jdata(ip))
-		error = gfs2_trans_begin(sdp, RES_DINODE + 2 * RES_JDATA,
-					 GFS2_JTRUNC_REVOKES);
-	else
-		error = gfs2_trans_begin(sdp, RES_DINODE, 0);
-	if (error)
-		return error;
-
-	if (gfs2_is_stuffed(ip)) {
-		error = stuffed_zero_range(inode, offset, length);
-		if (error)
-			goto out;
-	} else {
+	if (!gfs2_is_stuffed(ip)) {
 		unsigned int start_off, end_len;
 
 		start_off = offset & (blocksize - 1);
@@ -2490,6 +2479,26 @@ int __gfs2_punch_hole(struct file *file, loff_t offset, loff_t length)
 		}
 	}
 
+	start = round_down(offset, blocksize);
+	end = round_up(offset + length, blocksize) - 1;
+	error = filemap_write_and_wait_range(inode->i_mapping, start, end);
+	if (error)
+		return error;
+
+	if (gfs2_is_jdata(ip))
+		error = gfs2_trans_begin(sdp, RES_DINODE + 2 * RES_JDATA,
+					 GFS2_JTRUNC_REVOKES);
+	else
+		error = gfs2_trans_begin(sdp, RES_DINODE, 0);
+	if (error)
+		return error;
+
+	if (gfs2_is_stuffed(ip)) {
+		error = stuffed_zero_range(inode, offset, length);
+		if (error)
+			goto out;
+	}
+
 	if (gfs2_is_jdata(ip)) {
 		BUG_ON(!current->journal_info);
 		gfs2_journaled_truncate_range(inode, offset, length);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 04/11] gfs2: don't try to add buffers to transactions a second time for jdata
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 04/11] gfs2: don't try to add buffers to transactions a second time for jdata Bob Peterson
@ 2020-08-03 17:52   ` Andreas Gruenbacher
  0 siblings, 0 replies; 14+ messages in thread
From: Andreas Gruenbacher @ 2020-08-03 17:52 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Bob,

On Fri, Jul 24, 2020 at 8:36 PM Bob Peterson <rpeterso@redhat.com> wrote:
> When jdata buffers are written by user-space, the iomap code adds the
> buffers to the appropriate transaction's databufs queue.

what does this have to do with user-space?

It's gfs2_iomap_page_done and __gfs2_jdata_writepage that call
gfs2_page_add_databufs, or gfs2 of you will, not "the iomap code".

> Then they get
> pinned, and unpinned and make their way to the ail1 list. But then
> gfs2_ail1_start gets called, resulting in this calling sequence:
>
> gfs2_ail1_start(sdp);
>    gfs2_ail1_flush(sdp, &wbc); wbc has for_sync=1
>       list_for_each_entry_reverse(tr, head, tr_list) {
>          ret = gfs2_ail1_start_one(sdp, wbc, tr);
>             ret = generic_writepages(mapping, wbc);
>                write_cache_pages
>                   __writepage
>                      gfs2_jdata_writepage
>                         __gfs2_jdata_writepage
>
> At this point, __gfs2_jdata_writepage is trying to write the in-place

It seems that you mean the in-journal buffers here and throughout the
rest of this patch description, not the in-place buffers.

> jdata buffers, but ends up calling gfs2_page_add_databufs which calls
> gfs2_trans_add_data which, of course, requires a transaction. But the
> gfs2_ail1_flush calling sequence is for flushing transactions, so it
> does not have a transaction. In fact, it should just be writing the
> jdata buffers in-place and shouldn't need a transaction.
>
> This got us into strange situations where it thought it needed a
> transaction when it didn't, and infinite loops because it thought
> the buffer was PageChecked (i.e. re-dirtied by a competing write op)
> so it just kept waiting for it to not be PageChecked each time.
>
> To fix this, this patch introduces a new buffer_head flag "needs_trans"
> to indicate when a buffer has been dirtied by a real write as opposed
> to just an in-place write for the ail list.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>  fs/gfs2/aops.c   | 37 +++++++++++++++++++++++--------------
>  fs/gfs2/bmap.c   |  1 +
>  fs/gfs2/incore.h |  3 +++
>  fs/gfs2/lops.c   |  1 +
>  fs/gfs2/trans.c  |  2 ++
>  5 files changed, 30 insertions(+), 14 deletions(-)
>
> diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
> index ff2ec416f106..a9af28abe468 100644
> --- a/fs/gfs2/aops.c
> +++ b/fs/gfs2/aops.c
> @@ -158,14 +158,29 @@ static int __gfs2_jdata_writepage(struct page *page, struct writeback_control *w
>         struct gfs2_inode *ip = GFS2_I(inode);
>         struct gfs2_sbd *sdp = GFS2_SB(inode);
>
> -       if (PageChecked(page)) {
> -               ClearPageChecked(page);
> -               if (!page_has_buffers(page)) {
> -                       create_empty_buffers(page, inode->i_sb->s_blocksize,
> -                                            BIT(BH_Dirty)|BIT(BH_Uptodate));
> +       if (!PageChecked(page))
> +               goto write;
> +
> +       ClearPageChecked(page);
> +       if (page_has_buffers(page)) {
> +               struct buffer_head *bh, *head;
> +
> +               bh = head = page_buffers(page);
> +               do {
> +                       if (buffer_needstrans(bh))
> +                               break;
> +                       bh = bh->b_this_page;
> +               } while (bh != head);
> +               if (buffer_needstrans(bh)) {
> +                       BUG_ON(!current->journal_info);
> +                       gfs2_page_add_databufs(ip, page, 0,
> +                                              sdp->sd_vfs->s_blocksize);
>                 }
> -               gfs2_page_add_databufs(ip, page, 0, sdp->sd_vfs->s_blocksize);
> +       } else {
> +               create_empty_buffers(page, inode->i_sb->s_blocksize,
> +                                    BIT(BH_Dirty)|BIT(BH_Uptodate));

Why is create_empty_buffers only called in the else clause now;
shouldn't it always be called as before this patch? (Would it perhaps
make sense to move that code to jdata_set_page_dirty?)

From the patch description, I cannot convince myself that the new
needstrans flag is needed. When __gfs2_jdata_writepage is called the
second time, PageChecked should already be cleared, which means that
gfs2_page_add_databufs won't be called. When exactly is that not the
case?

There's a comment above function __gfs2_jdata_writepage that describes
how PageChecked is supposed to work. That also needs updating, please.

>         }
> +write:
>         return gfs2_write_full_page(page, gfs2_get_block_noalloc, wbc);
>  }
>
> @@ -184,15 +199,9 @@ static int gfs2_jdata_writepage(struct page *page, struct writeback_control *wbc
>         struct gfs2_inode *ip = GFS2_I(inode);
>         struct gfs2_sbd *sdp = GFS2_SB(inode);
>
> -       if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
> -               goto out;
> -       if (PageChecked(page) || current->journal_info)
> -               goto out_ignore;
> -       return __gfs2_jdata_writepage(page, wbc);
> +       if (!gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
> +               return __gfs2_jdata_writepage(page, wbc);
>
> -out_ignore:
> -       redirty_page_for_writepage(wbc, page);
> -out:
>         unlock_page(page);
>         return 0;
>  }
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 6306eaae378b..85cf903011d7 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -94,6 +94,7 @@ static int gfs2_unstuffer_page(struct gfs2_inode *ip, struct buffer_head *dibh,
>                 gfs2_trans_add_data(ip->i_gl, bh);
>         else {
>                 mark_buffer_dirty(bh);
> +               set_buffer_needstrans(bh);
>                 gfs2_ordered_add_inode(ip);
>         }
>
> diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
> index ca2ec02436ec..f510126efc1c 100644
> --- a/fs/gfs2/incore.h
> +++ b/fs/gfs2/incore.h
> @@ -149,12 +149,15 @@ static inline bool gfs2_rbm_eq(const struct gfs2_rbm *rbm1,
>  enum gfs2_state_bits {
>         BH_Pinned = BH_PrivateStart,
>         BH_Escaped = BH_PrivateStart + 1,
> +       BH_Needstrans = BH_PrivateStart + 2,
>  };
>
>  BUFFER_FNS(Pinned, pinned)
>  TAS_BUFFER_FNS(Pinned, pinned)
>  BUFFER_FNS(Escaped, escaped)
>  TAS_BUFFER_FNS(Escaped, escaped)
> +BUFFER_FNS(Needstrans, needstrans)
> +TAS_BUFFER_FNS(Needstrans, needstrans)
>
>  struct gfs2_bufdata {
>         struct buffer_head *bd_bh;
> diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
> index cb2a11b458c6..6f74fc43e9c2 100644
> --- a/fs/gfs2/lops.c
> +++ b/fs/gfs2/lops.c
> @@ -104,6 +104,7 @@ static void gfs2_unpin(struct gfs2_sbd *sdp, struct buffer_head *bh,
>         BUG_ON(!buffer_pinned(bh));
>
>         lock_buffer(bh);
> +       clear_buffer_needstrans(bh);
>         mark_buffer_dirty(bh);
>         clear_buffer_pinned(bh);
>
> diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
> index a3dfa3aa87ad..897d08e2a6b0 100644
> --- a/fs/gfs2/trans.c
> +++ b/fs/gfs2/trans.c
> @@ -130,6 +130,7 @@ static struct gfs2_bufdata *gfs2_alloc_bufdata(struct gfs2_glock *gl,
>         bd->bd_bh = bh;
>         bd->bd_gl = gl;
>         INIT_LIST_HEAD(&bd->bd_list);
> +       INIT_LIST_HEAD(&bd->bd_ail_st_list);
>         bh->b_private = bd;
>         return bd;
>  }
> @@ -155,6 +156,7 @@ void gfs2_trans_add_data(struct gfs2_glock *gl, struct buffer_head *bh)
>         struct gfs2_bufdata *bd;
>
>         lock_buffer(bh);
> +       clear_buffer_needstrans(bh);
>         if (buffer_pinned(bh)) {
>                 set_bit(TR_TOUCHED, &tr->tr_flags);
>                 goto out;
> --
> 2.26.2
>

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Cluster-devel] [GFS2 PATCH 02/11] gfs2: don't break integrity writeback on __gfs2_jdata_writepage error
  2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 02/11] gfs2: don't break integrity writeback on __gfs2_jdata_writepage error Bob Peterson
@ 2020-08-03 17:54   ` Andreas Gruenbacher
  0 siblings, 0 replies; 14+ messages in thread
From: Andreas Gruenbacher @ 2020-08-03 17:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Bob,

On Fri, Jul 24, 2020 at 8:33 PM Bob Peterson <rpeterso@redhat.com> wrote:
> Port Brian Foster's patch 3fa750dcf29e8606e3969d13d8e188cc1c0f511d to
> gfs2's sister function gfs2_write_cache_jdata. Brian's original description: [...]

I don't think there's a need to repeat Brian's description; it's easy
enough to look up
from the sha1. I'd be more interested in learning what this change means in the
context of gfs2 (that is, what does gfs2 need this change for).

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-08-03 17:54 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-24 18:32 [Cluster-devel] [GFS2 PATCH 00/11] gfs2: jdata patch collection Bob Peterson
2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 01/11] gfs2: inline gfs2_write_jdata_pagevec into gfs2_write_cache_jdata Bob Peterson
2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 02/11] gfs2: don't break integrity writeback on __gfs2_jdata_writepage error Bob Peterson
2020-08-03 17:54   ` Andreas Gruenbacher
2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 03/11] gfs2: Fix inaccurate comment Bob Peterson
2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 04/11] gfs2: don't try to add buffers to transactions a second time for jdata Bob Peterson
2020-08-03 17:52   ` Andreas Gruenbacher
2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 05/11] gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe Bob Peterson
2020-07-24 18:32 ` [Cluster-devel] [GFS2 PATCH 06/11] gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove parm Bob Peterson
2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 07/11] gfs2: Add a new jdata-specific version of gfs2_get_block_noalloc Bob Peterson
2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 08/11] gfs2: Add caller info to log_blocks trace point Bob Peterson
2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 09/11] gfs2: enhance log_blocks trace point to show log blocks free Bob Peterson
2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 10/11] gfs2: print details on transactions that aren't properly ended Bob Peterson
2020-07-24 18:33 ` [Cluster-devel] [GFS2 PATCH 11/11] gfs2: Never call gfs2_block_zero_range with an open transaction Bob Peterson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.