All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [bug report] GFS2: Fix bug-trap in ail flush code
@ 2021-07-28  8:31 Dan Carpenter
  2021-07-30 18:32 ` Bob Peterson
  0 siblings, 1 reply; 2+ messages in thread
From: Dan Carpenter @ 2021-07-28  8:31 UTC (permalink / raw)
  To: cluster-devel.redhat.com


Hi GFS2 devs,

This is 10 year old code, but it looks suspicious and hopefully the
recovery code doesn't get testing very often in runtime.

The patch 75549186edf1: "GFS2: Fix bug-trap in ail flush code" from
Aug 2, 2011, leads to the following static checker warning:

	fs/gfs2/glock.c:1487 gfs2_glock_dq()
	warn: sleeping in atomic context

fs/gfs2/glops.c
    57  static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
    58                               unsigned int nr_revokes)
    59  {
    60          struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
    61          struct list_head *head = &gl->gl_ail_list;
    62          struct gfs2_bufdata *bd, *tmp;
    63          struct buffer_head *bh;
    64          const unsigned long b_state = (1UL << BH_Dirty)|(1UL << BH_Pinned)|(1UL << BH_Lock);
    65  
    66          gfs2_log_lock(sdp);
                ^^^^^^^^^^^^^^^^^^
    67          spin_lock(&sdp->sd_ail_lock);
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We're holding a spinlock here

    68          list_for_each_entry_safe_reverse(bd, tmp, head, bd_ail_gl_list) {
    69                  if (nr_revokes == 0)
    70                          break;
    71                  bh = bd->bd_bh;
    72                  if (bh->b_state & b_state) {
    73                          if (fsync)
    74                                  continue;
    75                          gfs2_ail_error(gl, bh);
                                ^^^^^^^^^^^^^^^^^^^^^^
The gfs2_ail_error() function calls gfs2_withdraw() which can sleep or
the call tree that this is complains about is:

--> gfs2_ail_error()
   --> gfs2_withdraw()
    --> signal_our_withdraw()
        -->gfs2_glock_dq()

It's also very possible that this is a false positive...  Smatch doesn't
understand bit tests very well and especially across function
boundaries.

    76                  }
    77                  gfs2_trans_add_revoke(sdp, bd);
    78                  nr_revokes--;
    79          }
    80          GLOCK_BUG_ON(gl, !fsync && atomic_read(&gl->gl_ail_count));
    81          spin_unlock(&sdp->sd_ail_lock);
    82          gfs2_log_unlock(sdp);
    83  }

regards,
dan carpenter



^ permalink raw reply	[flat|nested] 2+ messages in thread

* [Cluster-devel] [bug report] GFS2: Fix bug-trap in ail flush code
  2021-07-28  8:31 [Cluster-devel] [bug report] GFS2: Fix bug-trap in ail flush code Dan Carpenter
@ 2021-07-30 18:32 ` Bob Peterson
  0 siblings, 0 replies; 2+ messages in thread
From: Bob Peterson @ 2021-07-30 18:32 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 7/28/21 3:31 AM, Dan Carpenter wrote:
> 
> Hi GFS2 devs,
> 
> This is 10 year old code, but it looks suspicious and hopefully the
> recovery code doesn't get testing very often in runtime.
> 
> The patch 75549186edf1: "GFS2: Fix bug-trap in ail flush code" from
> Aug 2, 2011, leads to the following static checker warning:
> 
> 	fs/gfs2/glock.c:1487 gfs2_glock_dq()
> 	warn: sleeping in atomic context
> 
> fs/gfs2/glops.c
>      57  static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
>      58                               unsigned int nr_revokes)
>      59  {
>      60          struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
>      61          struct list_head *head = &gl->gl_ail_list;
>      62          struct gfs2_bufdata *bd, *tmp;
>      63          struct buffer_head *bh;
>      64          const unsigned long b_state = (1UL << BH_Dirty)|(1UL << BH_Pinned)|(1UL << BH_Lock);
>      65
>      66          gfs2_log_lock(sdp);
>                  ^^^^^^^^^^^^^^^^^^
>      67          spin_lock(&sdp->sd_ail_lock);
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> We're holding a spinlock here
> 
>      68          list_for_each_entry_safe_reverse(bd, tmp, head, bd_ail_gl_list) {
>      69                  if (nr_revokes == 0)
>      70                          break;
>      71                  bh = bd->bd_bh;
>      72                  if (bh->b_state & b_state) {
>      73                          if (fsync)
>      74                                  continue;
>      75                          gfs2_ail_error(gl, bh);
>                                  ^^^^^^^^^^^^^^^^^^^^^^
> The gfs2_ail_error() function calls gfs2_withdraw() which can sleep or
> the call tree that this is complains about is:
> 
> --> gfs2_ail_error()
>     --> gfs2_withdraw()
>      --> signal_our_withdraw()
>          -->gfs2_glock_dq()
> 
> It's also very possible that this is a false positive...  Smatch doesn't
> understand bit tests very well and especially across function
> boundaries.
> 
>      76                  }
>      77                  gfs2_trans_add_revoke(sdp, bd);
>      78                  nr_revokes--;
>      79          }
>      80          GLOCK_BUG_ON(gl, !fsync && atomic_read(&gl->gl_ail_count));
>      81          spin_unlock(&sdp->sd_ail_lock);
>      82          gfs2_log_unlock(sdp);
>      83  }
> 
> regards,
> dan carpenter
> 
Hi Dan,

Thanks. I don't think it's a false positive. I think we should be using 
a delayed withdraw rather than an actual withdraw. I'll send a patch out 
to fix it shortly.

Regards,

Bob Peterson



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-07-30 18:32 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-28  8:31 [Cluster-devel] [bug report] GFS2: Fix bug-trap in ail flush code Dan Carpenter
2021-07-30 18:32 ` Bob Peterson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.