linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RAID/block regression starting from 2.6.32, bisected
       [not found]           ` <20100727220110.GF31152@atlantis.cc.ndsu.nodak.edu>
@ 2010-07-28 18:16             ` Vladislav Bolkhovitin
  2010-07-30 10:29               ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Bolkhovitin @ 2010-07-28 18:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Bryan Mesich, scst-devel, Jens Axboe, linux-kernel, linux-raid, dm-devel

Hello,

In recent kernels we are experiencing a problem that in our setup using SCST BLOCKIO backend some BIOs are finished, i.e. the finish callback called for them, with error -EIO. It happens quite often, much more often than one would expect to have an actual IO error. (BLOCKIO backend just converts all incoming SCSI commands to the corresponding block requests.)

After some investigation, we figured out, that, most likely, raid5.c::make_request() for some reason sometimes calls bio_endio() with not BIO_UPTODATE bios.

We bisected it to commit: 

commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
Author: Tejun Heo <tj@kernel.org>
Date:   Fri Jul 3 17:48:16 2009 +0900

    block: use the same failfast bits for bio and request
    
    bio and request use the same set of failfast bits.  This patch makes
    the following changes to simplify things.
    
    * enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_*
      bits coincide with __REQ_FAILFAST_* bits.
    
    * The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
      but the matching is useless anyway.  init_request_from_bio() is
      responsible for setting FAILFAST bits on FS requests and non-FS
      requests never use BIO_RW_AHEAD.  Drop the code and comment from
      blk_rq_bio_prep().
    
    * Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
      simplify FAILFAST flags handling in init_request_from_bio().
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

After looking at it I can't see how it can lead to the effect we are experiencing. Could anybody comment on this, please? Is it a known problem?

The error can be only reproduced when running RAID 5. The general layout is:

Disks --> RAID5 --> LVM --> BLOCKIO VDISK

The problem is easy to reproduce by forcing the RAID 5 array to re-sync its members, eg just fail out one member and add it back into the array and then generate some IO using dd. In fact, just writing out to the partition table on the exported block device is usually enough to provoke the error.

The complete thread about the topic you can find in http://sourceforge.net/mailarchive/forum.php?thread_name=20100727220110.GF31152%40atlantis.cc.ndsu.nodak.edu&forum_name=scst-devel

If any additional information is needed we would be glad to provide it.

Thanks,
Vlad



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: RAID/block regression starting from 2.6.32, bisected
  2010-07-28 18:16             ` RAID/block regression starting from 2.6.32, bisected Vladislav Bolkhovitin
@ 2010-07-30 10:29               ` Tejun Heo
  2010-08-02  0:42                 ` Neil Brown
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2010-07-30 10:29 UTC (permalink / raw)
  To: Vladislav Bolkhovitin
  Cc: Bryan Mesich, scst-devel, Jens Axboe, linux-kernel, linux-raid, dm-devel

Hello,

On 07/28/2010 08:16 PM, Vladislav Bolkhovitin wrote:
> In recent kernels we are experiencing a problem that in our setup
> using SCST BLOCKIO backend some BIOs are finished, i.e. the finish
> callback called for them, with error -EIO. It happens quite often,
> much more often than one would expect to have an actual IO
> error. (BLOCKIO backend just converts all incoming SCSI commands to
> the corresponding block requests.)
> 
> After some investigation, we figured out, that, most likely,
> raid5.c::make_request() for some reason sometimes calls bio_endio()
> with not BIO_UPTODATE bios.
> 
> We bisected it to commit: 
> 
> commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
> Author: Tejun Heo <tj@kernel.org>
> Date:   Fri Jul 3 17:48:16 2009 +0900
> 
>     block: use the same failfast bits for bio and request

That commit doesn't (or at least isn't supposed to) make any behavior
difference.  It's just repositioning flag bits.  If the commit is
actually causing the problem, I think one possibility is that whatever
code could be using hard coded constants which now are mapped to
different flags.  The mixed merge changes have been in mainline for
quite some time and shipping in all major distros too and this is the
first time this is reported, so I don't think it could be a widespread
problem.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: RAID/block regression starting from 2.6.32, bisected
  2010-07-30 10:29               ` Tejun Heo
@ 2010-08-02  0:42                 ` Neil Brown
  2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
                                     ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Neil Brown @ 2010-08-02  0:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, Jens Axboe,
	linux-kernel, linux-raid, dm-devel

On Fri, 30 Jul 2010 12:29:30 +0200
Tejun Heo <tj@kernel.org> wrote:

> Hello,
> 
> On 07/28/2010 08:16 PM, Vladislav Bolkhovitin wrote:
> > In recent kernels we are experiencing a problem that in our setup
> > using SCST BLOCKIO backend some BIOs are finished, i.e. the finish
> > callback called for them, with error -EIO. It happens quite often,
> > much more often than one would expect to have an actual IO
> > error. (BLOCKIO backend just converts all incoming SCSI commands to
> > the corresponding block requests.)
> > 
> > After some investigation, we figured out, that, most likely,
> > raid5.c::make_request() for some reason sometimes calls bio_endio()
> > with not BIO_UPTODATE bios.
> > 
> > We bisected it to commit: 
> > 
> > commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
> > Author: Tejun Heo <tj@kernel.org>
> > Date:   Fri Jul 3 17:48:16 2009 +0900
> > 
> >     block: use the same failfast bits for bio and request
> 
> That commit doesn't (or at least isn't supposed to) make any behavior
> difference.  It's just repositioning flag bits.  If the commit is
> actually causing the problem, I think one possibility is that whatever
> code could be using hard coded constants which now are mapped to
> different flags.  The mixed merge changes have been in mainline for
> quite some time and shipping in all major distros too and this is the
> first time this is reported, so I don't think it could be a widespread
> problem.
> 
> Thanks.
> 

The problem is that md/raid5 tests bio->bi_rw against RWA_MASK, which used to
align with BIO_RW_AHEAD, and now doesn't.
However the definition of bio_rw() in fs.h seems to justify that RWA_MASK
should align with BIO_RW_AHEAD, as does the definition of READA.

Given the current definitions, any WRITE request with BIO_RW_FAILFAST_DEV
set is going to confused a number of drives which test
     bio_rw(bio) == WRITE

I guess RWA_MASK needs to be changed to (1<<BIO_RW_AHEAD), and READA need to
be change to that value too.

Can I leave that to you Tejun?

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02  0:42                 ` Neil Brown
@ 2010-08-02 14:12                   ` Tejun Heo
  2010-08-02 14:13                     ` [PATCH 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo
                                       ` (2 more replies)
  2010-08-02 14:15                   ` [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
  2010-08-02 14:17                   ` Tejun Heo
  2 siblings, 3 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-02 14:12 UTC (permalink / raw)
  To: Jens Axboe, stable
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, linux-kernel,
	linux-raid, dm-devel, Neil Brown

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again.  A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
Root-caused-by: Neil Brown <neilb@suse.de>
Cc: Jens Axobe <axboe@kernel.dk>
Cc: stable@kernel.org
---
Aieee... thanks for root causing it Neil.  That was a stupid bug.  I
knew that READ/WRITE were hardcoded but forgot about READA.  :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusions downstream.  This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases.  The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.

 include/linux/fs.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -148,8 +148,8 @@ struct inodes_stat_t {
 #define RWA_MASK	2
 #define READ 0
 #define WRITE 1
-#define READA 2		/* read-ahead  - don't block if no resources */
-#define SWRITE 3	/* for ll_rw_block() - wait for buffer lock */
+#define READA 16	/* read-ahead  - don't block if no resources */
+#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
 #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
 #define READ_META	(READ | (1 << BIO_RW_META))
 #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags
  2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
@ 2010-08-02 14:13                     ` Tejun Heo
  2010-08-02 14:15                     ` [PATCH RESEND " Tejun Heo
  2010-08-02 14:18                     ` Tejun Heo
  2 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-02 14:13 UTC (permalink / raw)
  To: Jens Axboe, stable
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, linux-kernel,
	linux-raid, dm-devel, Neil Brown

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags.  This is fragile and caused breakage during BIO_RW_* flag
rearrangement.  The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axobe <axboe@kernel.dk>
---
 include/linux/bio.h       |  153 +-----------------------------------------
 include/linux/bio_types.h |  164 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |   17 ++--
 3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
  *
@@ -28,6 +28,9 @@

 #include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in bio_types.h */
+#include <linux/bio_types.h>
+
 #define BIO_DEBUG

 #ifdef BIO_DEBUG
@@ -41,154 +44,6 @@
 #define BIO_MAX_SECTORS		(BIO_MAX_SIZE >> 9)

 /*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
-	struct page	*bv_page;
-	unsigned int	bv_len;
-	unsigned int	bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
-	sector_t		bi_sector;	/* device address in 512 byte
-						   sectors */
-	struct bio		*bi_next;	/* request queue link */
-	struct block_device	*bi_bdev;
-	unsigned long		bi_flags;	/* status, command, etc */
-	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
-						 * top bits priority
-						 */
-
-	unsigned short		bi_vcnt;	/* how many bio_vec's */
-	unsigned short		bi_idx;		/* current index into bvl_vec */
-
-	/* Number of segments in this BIO after
-	 * physical address coalescing is performed.
-	 */
-	unsigned int		bi_phys_segments;
-
-	unsigned int		bi_size;	/* residual I/O count */
-
-	/*
-	 * To keep track of the max segment size, we account for the
-	 * sizes of the first and last mergeable segments in this bio.
-	 */
-	unsigned int		bi_seg_front_size;
-	unsigned int		bi_seg_back_size;
-
-	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
-
-	unsigned int		bi_comp_cpu;	/* completion CPU */
-
-	atomic_t		bi_cnt;		/* pin count */
-
-	struct bio_vec		*bi_io_vec;	/* the actual vec list */
-
-	bio_end_io_t		*bi_end_io;
-
-	void			*bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
-	struct bio_integrity_payload *bi_integrity;  /* data integrity */
-#endif
-
-	bio_destructor_t	*bi_destructor;	/* destructor */
-
-	/*
-	 * We can inline a number of vecs at the end of the bio, to avoid
-	 * double allocations for a small number of bio_vecs. This member
-	 * MUST obviously be kept at the very end of the bio.
-	 */
-	struct bio_vec		bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE	0	/* ok after I/O completion */
-#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
-#define BIO_EOF		2	/* out-out-bounds error */
-#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
-#define BIO_CLONED	4	/* doesn't own data */
-#define BIO_BOUNCED	5	/* bio is a bounce bio */
-#define BIO_USER_MAPPED 6	/* contains user pages */
-#define BIO_EOPNOTSUPP	7	/* not supported */
-#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
-#define BIO_QUIET	11	/* Make BIO Quiet */
-#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS		(4)
-#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)	
-
-/*
- * bio bi_rw flags
- *
- * bit 0 -- data direction
- *	If not set, bio is a read from device. If set, it's a write to device.
- * bit 1 -- fail fast device errors
- * bit 2 -- fail fast transport errors
- * bit 3 -- fail fast driver errors
- * bit 4 -- rw-ahead when set
- * bit 5 -- barrier
- *	Insert a serialization point in the IO queue, forcing previously
- *	submitted IO to be completed before this one is issued.
- * bit 6 -- synchronous I/O hint.
- * bit 7 -- Unplug the device immediately after submitting this bio.
- * bit 8 -- metadata request
- *	Used for tracing to differentiate metadata and data IO. May also
- *	get some preferential treatment in the IO scheduler
- * bit 9 -- discard sectors
- *	Informs the lower level device that this range of sectors is no longer
- *	used by the file system and may thus be freed by the device. Used
- *	for flash based storage.
- *	Don't want driver retries for any fast fail whatever the reason.
- * bit 10 -- Tell the IO scheduler not to wait for more requests after this
-	one has been submitted, even if it is a SYNC request.
- */
-enum bio_rw_flags {
-	BIO_RW,
-	BIO_RW_FAILFAST_DEV,
-	BIO_RW_FAILFAST_TRANSPORT,
-	BIO_RW_FAILFAST_DRIVER,
-	/* above flags must match REQ_* */
-	BIO_RW_AHEAD,
-	BIO_RW_BARRIER,
-	BIO_RW_SYNCIO,
-	BIO_RW_UNPLUG,
-	BIO_RW_META,
-	BIO_RW_DISCARD,
-	BIO_RW_NOIDLE,
-};
-
-/*
- * First four bits must match between bio->bi_rw and rq->cmd_flags, make
- * that explicit here.
- */
-#define BIO_RW_RQ_MASK		0xf
-
-static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
-{
-	return (bio->bi_rw & (1 << flag)) != 0;
-}
-
-/*
  * upper 16 bits of bi_rw define the io priority of this bio
  */
 #define BIO_PRIO_SHIFT	(8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/bio_types.h
===================================================================
--- /dev/null
+++ work/include/linux/bio_types.h
@@ -0,0 +1,164 @@
+/*
+ * BIO data types and constants.  Include linux/bio.h for usual cases.
+ * Directly include this file only to break include dependency loop.
+ */
+#ifndef __LINUX_BIO_TYPES_H
+#define __LINUX_BIO_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+	struct page	*bv_page;
+	unsigned int	bv_len;
+	unsigned int	bv_offset;
+};
+
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+	sector_t		bi_sector;	/* device address in 512 byte
+						   sectors */
+	struct bio		*bi_next;	/* request queue link */
+	struct block_device	*bi_bdev;
+	unsigned long		bi_flags;	/* status, command, etc */
+	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
+						 * top bits priority
+						 */
+
+	unsigned short		bi_vcnt;	/* how many bio_vec's */
+	unsigned short		bi_idx;		/* current index into bvl_vec */
+
+	/* Number of segments in this BIO after
+	 * physical address coalescing is performed.
+	 */
+	unsigned int		bi_phys_segments;
+
+	unsigned int		bi_size;	/* residual I/O count */
+
+	/*
+	 * To keep track of the max segment size, we account for the
+	 * sizes of the first and last mergeable segments in this bio.
+	 */
+	unsigned int		bi_seg_front_size;
+	unsigned int		bi_seg_back_size;
+
+	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
+
+	unsigned int		bi_comp_cpu;	/* completion CPU */
+
+	atomic_t		bi_cnt;		/* pin count */
+
+	struct bio_vec		*bi_io_vec;	/* the actual vec list */
+
+	bio_end_io_t		*bi_end_io;
+
+	void			*bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+	struct bio_integrity_payload *bi_integrity;  /* data integrity */
+#endif
+
+	bio_destructor_t	*bi_destructor;	/* destructor */
+
+	/*
+	 * We can inline a number of vecs at the end of the bio, to avoid
+	 * double allocations for a small number of bio_vecs. This member
+	 * MUST obviously be kept at the very end of the bio.
+	 */
+	struct bio_vec		bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE	0	/* ok after I/O completion */
+#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
+#define BIO_EOF		2	/* out-out-bounds error */
+#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
+#define BIO_CLONED	4	/* doesn't own data */
+#define BIO_BOUNCED	5	/* bio is a bounce bio */
+#define BIO_USER_MAPPED 6	/* contains user pages */
+#define BIO_EOPNOTSUPP	7	/* not supported */
+#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
+#define BIO_QUIET	11	/* Make BIO Quiet */
+#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS		(4)
+#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * bio bi_rw flags
+ *
+ * bit 0 -- data direction
+ *	If not set, bio is a read from device. If set, it's a write to device.
+ * bit 1 -- fail fast device errors
+ * bit 2 -- fail fast transport errors
+ * bit 3 -- fail fast driver errors
+ * bit 4 -- rw-ahead when set
+ * bit 5 -- barrier
+ *	Insert a serialization point in the IO queue, forcing previously
+ *	submitted IO to be completed before this one is issued.
+ * bit 6 -- synchronous I/O hint.
+ * bit 7 -- Unplug the device immediately after submitting this bio.
+ * bit 8 -- metadata request
+ *	Used for tracing to differentiate metadata and data IO. May also
+ *	get some preferential treatment in the IO scheduler
+ * bit 9 -- discard sectors
+ *	Informs the lower level device that this range of sectors is no longer
+ *	used by the file system and may thus be freed by the device. Used
+ *	for flash based storage.
+ *	Don't want driver retries for any fast fail whatever the reason.
+ * bit 10 -- Tell the IO scheduler not to wait for more requests after this
+	one has been submitted, even if it is a SYNC request.
+ */
+enum bio_rw_flags {
+	BIO_RW,
+	BIO_RW_FAILFAST_DEV,
+	BIO_RW_FAILFAST_TRANSPORT,
+	BIO_RW_FAILFAST_DRIVER,
+	/* above flags must match REQ_* */
+	BIO_RW_AHEAD,
+	BIO_RW_BARRIER,
+	BIO_RW_SYNCIO,
+	BIO_RW_UNPLUG,
+	BIO_RW_META,
+	BIO_RW_DISCARD,
+	BIO_RW_NOIDLE,
+};
+
+/*
+ * First four bits must match between bio->bi_rw and rq->cmd_flags, make
+ * that explicit here.
+ */
+#define BIO_RW_RQ_MASK		0xf
+
+static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
+{
+	return (bio->bi_rw & (1 << flag)) != 0;
+}
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BIO_TYPES_H */
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

 #include <linux/limits.h>
 #include <linux/ioctl.h>
+#include <linux/bio_types.h>

 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
  *			immediately wait on this read without caring about
  *			unplugging.
  * READA		Used for read-ahead operations. Lower priority, and the
- *			 block layer could (in theory) choose to ignore this
+ *			block layer could (in theory) choose to ignore this
  *			request if it runs into resource problems.
  * WRITE		A normal async write. Device will be plugged.
  * SWRITE		Like WRITE, but a special case for ll_rw_block() that
@@ -144,12 +145,12 @@ struct inodes_stat_t {
  *			of this IO.
  *
  */
-#define RW_MASK		1
-#define RWA_MASK	2
-#define READ 0
-#define WRITE 1
-#define READA 16	/* read-ahead  - don't block if no resources */
-#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
+#define RW_MASK		(1 << BIO_RW)
+#define RWA_MASK	(1 << BIO_RW_AHEAD)
+#define READ		0
+#define WRITE		RW_MASK
+#define READA		RWA_MASK
+#define SWRITE		(WRITE | READA)
 #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
 #define READ_META	(READ | (1 << BIO_RW_META))
 #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
@@ -2198,7 +2199,6 @@ static inline void insert_inode_hash(str
 extern void file_move(struct file *f, struct list_head *list);
 extern void file_kill(struct file *f);
 #ifdef CONFIG_BLOCK
-struct bio;
 extern void submit_bio(int, struct bio *);
 extern int bdev_read_only(struct block_device *);
 #endif
@@ -2265,7 +2265,6 @@ static inline int xip_truncate_page(stru
 #endif

 #ifdef CONFIG_BLOCK
-struct bio;
 typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
 			    loff_t file_offset);
 void dio_end_io(struct bio *bio, int error);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02  0:42                 ` Neil Brown
  2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
@ 2010-08-02 14:15                   ` Tejun Heo
  2010-08-02 21:52                     ` Neil Brown
  2010-08-05 18:45                     ` Jeff Moyer
  2010-08-02 14:17                   ` Tejun Heo
  2 siblings, 2 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-02 14:15 UTC (permalink / raw)
  To: Jens Axboe, stable
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, linux-kernel,
	linux-raid, dm-devel, Neil Brown

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again.  A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
Root-caused-by: Neil Brown <neilb@suse.de>
Cc: Jens Axobe <axboe@kernel.dk>
Cc: stable@kernel.org
---
Aieee... thanks for root causing it Neil.  That was a stupid bug.  I
knew that READ/WRITE were hardcoded but forgot about READA.  :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusions downstream.  This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases.  The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.

(resending w/ Jens' new address)

 include/linux/fs.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -148,8 +148,8 @@ struct inodes_stat_t {
 #define RWA_MASK	2
 #define READ 0
 #define WRITE 1
-#define READA 2		/* read-ahead  - don't block if no resources */
-#define SWRITE 3	/* for ll_rw_block() - wait for buffer lock */
+#define READA 16	/* read-ahead  - don't block if no resources */
+#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
 #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
 #define READ_META	(READ | (1 << BIO_RW_META))
 #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH RESEND 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags
  2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
  2010-08-02 14:13                     ` [PATCH 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo
@ 2010-08-02 14:15                     ` Tejun Heo
  2010-08-02 14:18                     ` Tejun Heo
  2 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-02 14:15 UTC (permalink / raw)
  To: Jens Axboe, stable
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, linux-kernel,
	linux-raid, dm-devel, Neil Brown

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags.  This is fragile and caused breakage during BIO_RW_* flag
rearrangement.  The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axobe <axboe@kernel.dk>
---
 include/linux/bio.h       |  153 +-----------------------------------------
 include/linux/bio_types.h |  164 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |   17 ++--
 3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
  *
@@ -28,6 +28,9 @@

 #include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in bio_types.h */
+#include <linux/bio_types.h>
+
 #define BIO_DEBUG

 #ifdef BIO_DEBUG
@@ -41,154 +44,6 @@
 #define BIO_MAX_SECTORS		(BIO_MAX_SIZE >> 9)

 /*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
-	struct page	*bv_page;
-	unsigned int	bv_len;
-	unsigned int	bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
-	sector_t		bi_sector;	/* device address in 512 byte
-						   sectors */
-	struct bio		*bi_next;	/* request queue link */
-	struct block_device	*bi_bdev;
-	unsigned long		bi_flags;	/* status, command, etc */
-	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
-						 * top bits priority
-						 */
-
-	unsigned short		bi_vcnt;	/* how many bio_vec's */
-	unsigned short		bi_idx;		/* current index into bvl_vec */
-
-	/* Number of segments in this BIO after
-	 * physical address coalescing is performed.
-	 */
-	unsigned int		bi_phys_segments;
-
-	unsigned int		bi_size;	/* residual I/O count */
-
-	/*
-	 * To keep track of the max segment size, we account for the
-	 * sizes of the first and last mergeable segments in this bio.
-	 */
-	unsigned int		bi_seg_front_size;
-	unsigned int		bi_seg_back_size;
-
-	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
-
-	unsigned int		bi_comp_cpu;	/* completion CPU */
-
-	atomic_t		bi_cnt;		/* pin count */
-
-	struct bio_vec		*bi_io_vec;	/* the actual vec list */
-
-	bio_end_io_t		*bi_end_io;
-
-	void			*bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
-	struct bio_integrity_payload *bi_integrity;  /* data integrity */
-#endif
-
-	bio_destructor_t	*bi_destructor;	/* destructor */
-
-	/*
-	 * We can inline a number of vecs at the end of the bio, to avoid
-	 * double allocations for a small number of bio_vecs. This member
-	 * MUST obviously be kept at the very end of the bio.
-	 */
-	struct bio_vec		bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE	0	/* ok after I/O completion */
-#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
-#define BIO_EOF		2	/* out-out-bounds error */
-#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
-#define BIO_CLONED	4	/* doesn't own data */
-#define BIO_BOUNCED	5	/* bio is a bounce bio */
-#define BIO_USER_MAPPED 6	/* contains user pages */
-#define BIO_EOPNOTSUPP	7	/* not supported */
-#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
-#define BIO_QUIET	11	/* Make BIO Quiet */
-#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS		(4)
-#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)	
-
-/*
- * bio bi_rw flags
- *
- * bit 0 -- data direction
- *	If not set, bio is a read from device. If set, it's a write to device.
- * bit 1 -- fail fast device errors
- * bit 2 -- fail fast transport errors
- * bit 3 -- fail fast driver errors
- * bit 4 -- rw-ahead when set
- * bit 5 -- barrier
- *	Insert a serialization point in the IO queue, forcing previously
- *	submitted IO to be completed before this one is issued.
- * bit 6 -- synchronous I/O hint.
- * bit 7 -- Unplug the device immediately after submitting this bio.
- * bit 8 -- metadata request
- *	Used for tracing to differentiate metadata and data IO. May also
- *	get some preferential treatment in the IO scheduler
- * bit 9 -- discard sectors
- *	Informs the lower level device that this range of sectors is no longer
- *	used by the file system and may thus be freed by the device. Used
- *	for flash based storage.
- *	Don't want driver retries for any fast fail whatever the reason.
- * bit 10 -- Tell the IO scheduler not to wait for more requests after this
-	one has been submitted, even if it is a SYNC request.
- */
-enum bio_rw_flags {
-	BIO_RW,
-	BIO_RW_FAILFAST_DEV,
-	BIO_RW_FAILFAST_TRANSPORT,
-	BIO_RW_FAILFAST_DRIVER,
-	/* above flags must match REQ_* */
-	BIO_RW_AHEAD,
-	BIO_RW_BARRIER,
-	BIO_RW_SYNCIO,
-	BIO_RW_UNPLUG,
-	BIO_RW_META,
-	BIO_RW_DISCARD,
-	BIO_RW_NOIDLE,
-};
-
-/*
- * First four bits must match between bio->bi_rw and rq->cmd_flags, make
- * that explicit here.
- */
-#define BIO_RW_RQ_MASK		0xf
-
-static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
-{
-	return (bio->bi_rw & (1 << flag)) != 0;
-}
-
-/*
  * upper 16 bits of bi_rw define the io priority of this bio
  */
 #define BIO_PRIO_SHIFT	(8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/bio_types.h
===================================================================
--- /dev/null
+++ work/include/linux/bio_types.h
@@ -0,0 +1,164 @@
+/*
+ * BIO data types and constants.  Include linux/bio.h for usual cases.
+ * Directly include this file only to break include dependency loop.
+ */
+#ifndef __LINUX_BIO_TYPES_H
+#define __LINUX_BIO_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+	struct page	*bv_page;
+	unsigned int	bv_len;
+	unsigned int	bv_offset;
+};
+
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+	sector_t		bi_sector;	/* device address in 512 byte
+						   sectors */
+	struct bio		*bi_next;	/* request queue link */
+	struct block_device	*bi_bdev;
+	unsigned long		bi_flags;	/* status, command, etc */
+	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
+						 * top bits priority
+						 */
+
+	unsigned short		bi_vcnt;	/* how many bio_vec's */
+	unsigned short		bi_idx;		/* current index into bvl_vec */
+
+	/* Number of segments in this BIO after
+	 * physical address coalescing is performed.
+	 */
+	unsigned int		bi_phys_segments;
+
+	unsigned int		bi_size;	/* residual I/O count */
+
+	/*
+	 * To keep track of the max segment size, we account for the
+	 * sizes of the first and last mergeable segments in this bio.
+	 */
+	unsigned int		bi_seg_front_size;
+	unsigned int		bi_seg_back_size;
+
+	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
+
+	unsigned int		bi_comp_cpu;	/* completion CPU */
+
+	atomic_t		bi_cnt;		/* pin count */
+
+	struct bio_vec		*bi_io_vec;	/* the actual vec list */
+
+	bio_end_io_t		*bi_end_io;
+
+	void			*bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+	struct bio_integrity_payload *bi_integrity;  /* data integrity */
+#endif
+
+	bio_destructor_t	*bi_destructor;	/* destructor */
+
+	/*
+	 * We can inline a number of vecs at the end of the bio, to avoid
+	 * double allocations for a small number of bio_vecs. This member
+	 * MUST obviously be kept at the very end of the bio.
+	 */
+	struct bio_vec		bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE	0	/* ok after I/O completion */
+#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
+#define BIO_EOF		2	/* out-out-bounds error */
+#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
+#define BIO_CLONED	4	/* doesn't own data */
+#define BIO_BOUNCED	5	/* bio is a bounce bio */
+#define BIO_USER_MAPPED 6	/* contains user pages */
+#define BIO_EOPNOTSUPP	7	/* not supported */
+#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
+#define BIO_QUIET	11	/* Make BIO Quiet */
+#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS		(4)
+#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * bio bi_rw flags
+ *
+ * bit 0 -- data direction
+ *	If not set, bio is a read from device. If set, it's a write to device.
+ * bit 1 -- fail fast device errors
+ * bit 2 -- fail fast transport errors
+ * bit 3 -- fail fast driver errors
+ * bit 4 -- rw-ahead when set
+ * bit 5 -- barrier
+ *	Insert a serialization point in the IO queue, forcing previously
+ *	submitted IO to be completed before this one is issued.
+ * bit 6 -- synchronous I/O hint.
+ * bit 7 -- Unplug the device immediately after submitting this bio.
+ * bit 8 -- metadata request
+ *	Used for tracing to differentiate metadata and data IO. May also
+ *	get some preferential treatment in the IO scheduler
+ * bit 9 -- discard sectors
+ *	Informs the lower level device that this range of sectors is no longer
+ *	used by the file system and may thus be freed by the device. Used
+ *	for flash based storage.
+ *	Don't want driver retries for any fast fail whatever the reason.
+ * bit 10 -- Tell the IO scheduler not to wait for more requests after this
+	one has been submitted, even if it is a SYNC request.
+ */
+enum bio_rw_flags {
+	BIO_RW,
+	BIO_RW_FAILFAST_DEV,
+	BIO_RW_FAILFAST_TRANSPORT,
+	BIO_RW_FAILFAST_DRIVER,
+	/* above flags must match REQ_* */
+	BIO_RW_AHEAD,
+	BIO_RW_BARRIER,
+	BIO_RW_SYNCIO,
+	BIO_RW_UNPLUG,
+	BIO_RW_META,
+	BIO_RW_DISCARD,
+	BIO_RW_NOIDLE,
+};
+
+/*
+ * First four bits must match between bio->bi_rw and rq->cmd_flags, make
+ * that explicit here.
+ */
+#define BIO_RW_RQ_MASK		0xf
+
+static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
+{
+	return (bio->bi_rw & (1 << flag)) != 0;
+}
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BIO_TYPES_H */
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

 #include <linux/limits.h>
 #include <linux/ioctl.h>
+#include <linux/bio_types.h>

 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
  *			immediately wait on this read without caring about
  *			unplugging.
  * READA		Used for read-ahead operations. Lower priority, and the
- *			 block layer could (in theory) choose to ignore this
+ *			block layer could (in theory) choose to ignore this
  *			request if it runs into resource problems.
  * WRITE		A normal async write. Device will be plugged.
  * SWRITE		Like WRITE, but a special case for ll_rw_block() that
@@ -144,12 +145,12 @@ struct inodes_stat_t {
  *			of this IO.
  *
  */
-#define RW_MASK		1
-#define RWA_MASK	2
-#define READ 0
-#define WRITE 1
-#define READA 16	/* read-ahead  - don't block if no resources */
-#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
+#define RW_MASK		(1 << BIO_RW)
+#define RWA_MASK	(1 << BIO_RW_AHEAD)
+#define READ		0
+#define WRITE		RW_MASK
+#define READA		RWA_MASK
+#define SWRITE		(WRITE | READA)
 #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
 #define READ_META	(READ | (1 << BIO_RW_META))
 #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
@@ -2198,7 +2199,6 @@ static inline void insert_inode_hash(str
 extern void file_move(struct file *f, struct list_head *list);
 extern void file_kill(struct file *f);
 #ifdef CONFIG_BLOCK
-struct bio;
 extern void submit_bio(int, struct bio *);
 extern int bdev_read_only(struct block_device *);
 #endif
@@ -2265,7 +2265,6 @@ static inline int xip_truncate_page(stru
 #endif

 #ifdef CONFIG_BLOCK
-struct bio;
 typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
 			    loff_t file_offset);
 void dio_end_io(struct bio *bio, int error);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02  0:42                 ` Neil Brown
  2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
  2010-08-02 14:15                   ` [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
@ 2010-08-02 14:17                   ` Tejun Heo
  2010-08-02 20:04                     ` Jens Axboe
  2 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2010-08-02 14:17 UTC (permalink / raw)
  To: Jens Axboe, stable
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, linux-kernel,
	linux-raid, dm-devel, Neil Brown

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again.  A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
Root-caused-by: Neil Brown <neilb@suse.de>
Cc: Jens Axobe <axboe@kernel.dk>
Cc: stable@kernel.org
---
Aieee... thanks for root causing it Neil.  That was a stupid bug.  I
knew that READ/WRITE were hardcoded but forgot about READA.  :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusions downstream.  This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases.  The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.

(resending w/ Jens' new address)

 include/linux/fs.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -148,8 +148,8 @@ struct inodes_stat_t {
 #define RWA_MASK	2
 #define READ 0
 #define WRITE 1
-#define READA 2		/* read-ahead  - don't block if no resources */
-#define SWRITE 3	/* for ll_rw_block() - wait for buffer lock */
+#define READA 16	/* read-ahead  - don't block if no resources */
+#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
 #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
 #define READ_META	(READ | (1 << BIO_RW_META))
 #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH RESEND 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags
  2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
  2010-08-02 14:13                     ` [PATCH 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo
  2010-08-02 14:15                     ` [PATCH RESEND " Tejun Heo
@ 2010-08-02 14:18                     ` Tejun Heo
  2 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-02 14:18 UTC (permalink / raw)
  To: Jens Axboe, stable
  Cc: Vladislav Bolkhovitin, Bryan Mesich, scst-devel, linux-kernel,
	linux-raid, dm-devel, Neil Brown

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags.  This is fragile and caused breakage during BIO_RW_* flag
rearrangement.  The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axobe <axboe@kernel.dk>
---
 include/linux/bio.h       |  153 +-----------------------------------------
 include/linux/bio_types.h |  164 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |   17 ++--
 3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
  *
@@ -28,6 +28,9 @@

 #include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in bio_types.h */
+#include <linux/bio_types.h>
+
 #define BIO_DEBUG

 #ifdef BIO_DEBUG
@@ -41,154 +44,6 @@
 #define BIO_MAX_SECTORS		(BIO_MAX_SIZE >> 9)

 /*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
-	struct page	*bv_page;
-	unsigned int	bv_len;
-	unsigned int	bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
-	sector_t		bi_sector;	/* device address in 512 byte
-						   sectors */
-	struct bio		*bi_next;	/* request queue link */
-	struct block_device	*bi_bdev;
-	unsigned long		bi_flags;	/* status, command, etc */
-	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
-						 * top bits priority
-						 */
-
-	unsigned short		bi_vcnt;	/* how many bio_vec's */
-	unsigned short		bi_idx;		/* current index into bvl_vec */
-
-	/* Number of segments in this BIO after
-	 * physical address coalescing is performed.
-	 */
-	unsigned int		bi_phys_segments;
-
-	unsigned int		bi_size;	/* residual I/O count */
-
-	/*
-	 * To keep track of the max segment size, we account for the
-	 * sizes of the first and last mergeable segments in this bio.
-	 */
-	unsigned int		bi_seg_front_size;
-	unsigned int		bi_seg_back_size;
-
-	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
-
-	unsigned int		bi_comp_cpu;	/* completion CPU */
-
-	atomic_t		bi_cnt;		/* pin count */
-
-	struct bio_vec		*bi_io_vec;	/* the actual vec list */
-
-	bio_end_io_t		*bi_end_io;
-
-	void			*bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
-	struct bio_integrity_payload *bi_integrity;  /* data integrity */
-#endif
-
-	bio_destructor_t	*bi_destructor;	/* destructor */
-
-	/*
-	 * We can inline a number of vecs at the end of the bio, to avoid
-	 * double allocations for a small number of bio_vecs. This member
-	 * MUST obviously be kept at the very end of the bio.
-	 */
-	struct bio_vec		bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE	0	/* ok after I/O completion */
-#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
-#define BIO_EOF		2	/* out-out-bounds error */
-#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
-#define BIO_CLONED	4	/* doesn't own data */
-#define BIO_BOUNCED	5	/* bio is a bounce bio */
-#define BIO_USER_MAPPED 6	/* contains user pages */
-#define BIO_EOPNOTSUPP	7	/* not supported */
-#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
-#define BIO_QUIET	11	/* Make BIO Quiet */
-#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS		(4)
-#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)	
-
-/*
- * bio bi_rw flags
- *
- * bit 0 -- data direction
- *	If not set, bio is a read from device. If set, it's a write to device.
- * bit 1 -- fail fast device errors
- * bit 2 -- fail fast transport errors
- * bit 3 -- fail fast driver errors
- * bit 4 -- rw-ahead when set
- * bit 5 -- barrier
- *	Insert a serialization point in the IO queue, forcing previously
- *	submitted IO to be completed before this one is issued.
- * bit 6 -- synchronous I/O hint.
- * bit 7 -- Unplug the device immediately after submitting this bio.
- * bit 8 -- metadata request
- *	Used for tracing to differentiate metadata and data IO. May also
- *	get some preferential treatment in the IO scheduler
- * bit 9 -- discard sectors
- *	Informs the lower level device that this range of sectors is no longer
- *	used by the file system and may thus be freed by the device. Used
- *	for flash based storage.
- *	Don't want driver retries for any fast fail whatever the reason.
- * bit 10 -- Tell the IO scheduler not to wait for more requests after this
-	one has been submitted, even if it is a SYNC request.
- */
-enum bio_rw_flags {
-	BIO_RW,
-	BIO_RW_FAILFAST_DEV,
-	BIO_RW_FAILFAST_TRANSPORT,
-	BIO_RW_FAILFAST_DRIVER,
-	/* above flags must match REQ_* */
-	BIO_RW_AHEAD,
-	BIO_RW_BARRIER,
-	BIO_RW_SYNCIO,
-	BIO_RW_UNPLUG,
-	BIO_RW_META,
-	BIO_RW_DISCARD,
-	BIO_RW_NOIDLE,
-};
-
-/*
- * First four bits must match between bio->bi_rw and rq->cmd_flags, make
- * that explicit here.
- */
-#define BIO_RW_RQ_MASK		0xf
-
-static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
-{
-	return (bio->bi_rw & (1 << flag)) != 0;
-}
-
-/*
  * upper 16 bits of bi_rw define the io priority of this bio
  */
 #define BIO_PRIO_SHIFT	(8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/bio_types.h
===================================================================
--- /dev/null
+++ work/include/linux/bio_types.h
@@ -0,0 +1,164 @@
+/*
+ * BIO data types and constants.  Include linux/bio.h for usual cases.
+ * Directly include this file only to break include dependency loop.
+ */
+#ifndef __LINUX_BIO_TYPES_H
+#define __LINUX_BIO_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+	struct page	*bv_page;
+	unsigned int	bv_len;
+	unsigned int	bv_offset;
+};
+
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+	sector_t		bi_sector;	/* device address in 512 byte
+						   sectors */
+	struct bio		*bi_next;	/* request queue link */
+	struct block_device	*bi_bdev;
+	unsigned long		bi_flags;	/* status, command, etc */
+	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
+						 * top bits priority
+						 */
+
+	unsigned short		bi_vcnt;	/* how many bio_vec's */
+	unsigned short		bi_idx;		/* current index into bvl_vec */
+
+	/* Number of segments in this BIO after
+	 * physical address coalescing is performed.
+	 */
+	unsigned int		bi_phys_segments;
+
+	unsigned int		bi_size;	/* residual I/O count */
+
+	/*
+	 * To keep track of the max segment size, we account for the
+	 * sizes of the first and last mergeable segments in this bio.
+	 */
+	unsigned int		bi_seg_front_size;
+	unsigned int		bi_seg_back_size;
+
+	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
+
+	unsigned int		bi_comp_cpu;	/* completion CPU */
+
+	atomic_t		bi_cnt;		/* pin count */
+
+	struct bio_vec		*bi_io_vec;	/* the actual vec list */
+
+	bio_end_io_t		*bi_end_io;
+
+	void			*bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+	struct bio_integrity_payload *bi_integrity;  /* data integrity */
+#endif
+
+	bio_destructor_t	*bi_destructor;	/* destructor */
+
+	/*
+	 * We can inline a number of vecs at the end of the bio, to avoid
+	 * double allocations for a small number of bio_vecs. This member
+	 * MUST obviously be kept at the very end of the bio.
+	 */
+	struct bio_vec		bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE	0	/* ok after I/O completion */
+#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
+#define BIO_EOF		2	/* out-out-bounds error */
+#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
+#define BIO_CLONED	4	/* doesn't own data */
+#define BIO_BOUNCED	5	/* bio is a bounce bio */
+#define BIO_USER_MAPPED 6	/* contains user pages */
+#define BIO_EOPNOTSUPP	7	/* not supported */
+#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
+#define BIO_QUIET	11	/* Make BIO Quiet */
+#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS		(4)
+#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * bio bi_rw flags
+ *
+ * bit 0 -- data direction
+ *	If not set, bio is a read from device. If set, it's a write to device.
+ * bit 1 -- fail fast device errors
+ * bit 2 -- fail fast transport errors
+ * bit 3 -- fail fast driver errors
+ * bit 4 -- rw-ahead when set
+ * bit 5 -- barrier
+ *	Insert a serialization point in the IO queue, forcing previously
+ *	submitted IO to be completed before this one is issued.
+ * bit 6 -- synchronous I/O hint.
+ * bit 7 -- Unplug the device immediately after submitting this bio.
+ * bit 8 -- metadata request
+ *	Used for tracing to differentiate metadata and data IO. May also
+ *	get some preferential treatment in the IO scheduler
+ * bit 9 -- discard sectors
+ *	Informs the lower level device that this range of sectors is no longer
+ *	used by the file system and may thus be freed by the device. Used
+ *	for flash based storage.
+ *	Don't want driver retries for any fast fail whatever the reason.
+ * bit 10 -- Tell the IO scheduler not to wait for more requests after this
+	one has been submitted, even if it is a SYNC request.
+ */
+enum bio_rw_flags {
+	BIO_RW,
+	BIO_RW_FAILFAST_DEV,
+	BIO_RW_FAILFAST_TRANSPORT,
+	BIO_RW_FAILFAST_DRIVER,
+	/* above flags must match REQ_* */
+	BIO_RW_AHEAD,
+	BIO_RW_BARRIER,
+	BIO_RW_SYNCIO,
+	BIO_RW_UNPLUG,
+	BIO_RW_META,
+	BIO_RW_DISCARD,
+	BIO_RW_NOIDLE,
+};
+
+/*
+ * First four bits must match between bio->bi_rw and rq->cmd_flags, make
+ * that explicit here.
+ */
+#define BIO_RW_RQ_MASK		0xf
+
+static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
+{
+	return (bio->bi_rw & (1 << flag)) != 0;
+}
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BIO_TYPES_H */
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

 #include <linux/limits.h>
 #include <linux/ioctl.h>
+#include <linux/bio_types.h>

 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
  *			immediately wait on this read without caring about
  *			unplugging.
  * READA		Used for read-ahead operations. Lower priority, and the
- *			 block layer could (in theory) choose to ignore this
+ *			block layer could (in theory) choose to ignore this
  *			request if it runs into resource problems.
  * WRITE		A normal async write. Device will be plugged.
  * SWRITE		Like WRITE, but a special case for ll_rw_block() that
@@ -144,12 +145,12 @@ struct inodes_stat_t {
  *			of this IO.
  *
  */
-#define RW_MASK		1
-#define RWA_MASK	2
-#define READ 0
-#define WRITE 1
-#define READA 16	/* read-ahead  - don't block if no resources */
-#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
+#define RW_MASK		(1 << BIO_RW)
+#define RWA_MASK	(1 << BIO_RW_AHEAD)
+#define READ		0
+#define WRITE		RW_MASK
+#define READA		RWA_MASK
+#define SWRITE		(WRITE | READA)
 #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
 #define READ_META	(READ | (1 << BIO_RW_META))
 #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
@@ -2198,7 +2199,6 @@ static inline void insert_inode_hash(str
 extern void file_move(struct file *f, struct list_head *list);
 extern void file_kill(struct file *f);
 #ifdef CONFIG_BLOCK
-struct bio;
 extern void submit_bio(int, struct bio *);
 extern int bdev_read_only(struct block_device *);
 #endif
@@ -2265,7 +2265,6 @@ static inline int xip_truncate_page(stru
 #endif

 #ifdef CONFIG_BLOCK
-struct bio;
 typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
 			    loff_t file_offset);
 void dio_end_io(struct bio *bio, int error);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE  to match the corresponding BIO_RW_* bits
  2010-08-02 14:17                   ` Tejun Heo
@ 2010-08-02 20:04                     ` Jens Axboe
  2010-08-03  9:53                       ` [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, " Tejun Heo
  2010-08-03  9:54                       ` [PATCH 2/2 block#for-2.6.36] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo
  0 siblings, 2 replies; 26+ messages in thread
From: Jens Axboe @ 2010-08-02 20:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

On 08/02/2010 04:17 PM, Tejun Heo wrote:
> Aieee... thanks for root causing it Neil.  That was a stupid bug.  I
> knew that READ/WRITE were hardcoded but forgot about READA.  :-(
> Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
> afraid that would cause more confusions downstream.  This patch
> updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
> in -stable releases.  The next patch will create bio_types.h and
> define all constants in terms of BIO_RW_*.

Tejun, care to resend these against for-2.6.36? We can reference
these for the stable backport (at least the first one should go in).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02 14:15                   ` [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
@ 2010-08-02 21:52                     ` Neil Brown
  2010-08-03  9:27                       ` Tejun Heo
  2010-08-05 18:45                     ` Jeff Moyer
  1 sibling, 1 reply; 26+ messages in thread
From: Neil Brown @ 2010-08-02 21:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, stable, Vladislav Bolkhovitin, Bryan Mesich,
	scst-devel, linux-kernel, linux-raid, dm-devel

On Mon, 02 Aug 2010 16:15:38 +0200
Tejun Heo <tj@kernel.org> wrote:

> Commit a82afdf (block: use the same failfast bits for bio and request)
> moved BIO_RW_* bits around such that they match up with REQ_* bits.
> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
> 2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
> breaking READA and SWRITE.
> 
> This patch updates READA and SWRITE such that they match the BIO_RW_*
> bits again.  A follow up patch will update the definitions to directly
> use BIO_RW_* bits so that this kind of breakage won't happen again.
> 
> Stable: The offending commit a82afdf was released with v2.6.32, so
> this patch should be applied to all kernels since then but it must
> _NOT_ be applied to kernels earlier than that.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
> Root-caused-by: Neil Brown <neilb@suse.de>
> Cc: Jens Axobe <axboe@kernel.dk>
> Cc: stable@kernel.org
> ---
> Aieee... thanks for root causing it Neil.  That was a stupid bug.  I
> knew that READ/WRITE were hardcoded but forgot about READA.  :-(
> Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
> afraid that would cause more confusions downstream.  This patch
> updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
> in -stable releases.  The next patch will create bio_types.h and
> define all constants in terms of BIO_RW_*.
> 
> Thanks.
> 
> (resending w/ Jens' new address)
> 
>  include/linux/fs.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: work/include/linux/fs.h
> ===================================================================
> --- work.orig/include/linux/fs.h
> +++ work/include/linux/fs.h
> @@ -148,8 +148,8 @@ struct inodes_stat_t {
>  #define RWA_MASK	2

Close, but not quite there - RWA_MASK must be 16 too !!

Thanks,
NeilBrown


>  #define READ 0
>  #define WRITE 1
> -#define READA 2		/* read-ahead  - don't block if no resources */
> -#define SWRITE 3	/* for ll_rw_block() - wait for buffer lock */
> +#define READA 16	/* read-ahead  - don't block if no resources */
> +#define SWRITE 17	/* for ll_rw_block() - wait for buffer lock */
>  #define READ_SYNC	(READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
>  #define READ_META	(READ | (1 << BIO_RW_META))
>  #define WRITE_SYNC_PLUG	(WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02 21:52                     ` Neil Brown
@ 2010-08-03  9:27                       ` Tejun Heo
  0 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-03  9:27 UTC (permalink / raw)
  To: Neil Brown
  Cc: Jens Axboe, stable, Vladislav Bolkhovitin, Bryan Mesich,
	scst-devel, linux-kernel, linux-raid, dm-devel

On 08/02/2010 11:52 PM, Neil Brown wrote:
> Close, but not quite there - RWA_MASK must be 16 too !!

Indeed.  Updated.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02 20:04                     ` Jens Axboe
@ 2010-08-03  9:53                       ` Tejun Heo
  2010-08-03 11:15                         ` Jens Axboe
  2010-08-03  9:54                       ` [PATCH 2/2 block#for-2.6.36] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo
  1 sibling, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2010-08-03  9:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded RW_MASK, RWA_MASK, READ, WRITE, READA
and SWRITE as 0, 1, 2 and 3, and expected them to match with BIO_RW_*
bits.  READ/WRITE didn't change but BIO_RW_AHEAD was moved to bit 4
instead of bit 1, breaking RWA_MASK, READA and SWRITE.

This patch updates RWA_MASK, READA and SWRITE such that they match the
BIO_RW_* bits again.  A follow up patch will update the definitions to
directly use BIO_RW_* bits so that this kind of breakage won't happen
again.

Neil also spotted missing RWA_MASK conversion.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
Root-caused-by: Neil Brown <neilb@suse.de>
Cc: Jens Axobe <axboe@kernel.dk>
Cc: stable@kernel.org
---
Here's the regenerated version also w/ the missing RWA_MASK conversion
Neil spotted.

Thanks.

 include/linux/fs.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -145,12 +145,12 @@ struct inodes_stat_t {
  *
  */
 #define RW_MASK			1
-#define RWA_MASK		2
+#define RWA_MASK		16

 #define READ			0
 #define WRITE			1
-#define READA			2 /* readahead  - don't block if no resources */
-#define SWRITE			3 /* for ll_rw_block() - wait for buffer lock */
+#define READA			16 /* readahead - don't block if no resources */
+#define SWRITE			17 /* for ll_rw_block(), wait for buffer lock */

 #define READ_SYNC		(READ | REQ_SYNC | REQ_UNPLUG)
 #define READ_META		(READ | REQ_META)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/2 block#for-2.6.36] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags
  2010-08-02 20:04                     ` Jens Axboe
  2010-08-03  9:53                       ` [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, " Tejun Heo
@ 2010-08-03  9:54                       ` Tejun Heo
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-03  9:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags.  This is fragile and caused breakage during BIO_RW_* flag
rearrangement.  The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axobe <axboe@kernel.dk>
---
I renamed the file to blk_types.h instead of bio_types.h as it now
contains the REQ bits too.

 include/linux/bio.h       |  183 -------------------------------------------
 include/linux/blk_types.h |  193 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |   15 +--
 3 files changed, 204 insertions(+), 187 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
  *
@@ -28,6 +28,9 @@

 #include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in blk_types.h */
+#include <linux/blk_types.h>
+
 #define BIO_DEBUG

 #ifdef BIO_DEBUG
@@ -41,184 +44,6 @@
 #define BIO_MAX_SECTORS		(BIO_MAX_SIZE >> 9)

 /*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
-	struct page	*bv_page;
-	unsigned int	bv_len;
-	unsigned int	bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
-	sector_t		bi_sector;	/* device address in 512 byte
-						   sectors */
-	struct bio		*bi_next;	/* request queue link */
-	struct block_device	*bi_bdev;
-	unsigned long		bi_flags;	/* status, command, etc */
-	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
-						 * top bits priority
-						 */
-
-	unsigned short		bi_vcnt;	/* how many bio_vec's */
-	unsigned short		bi_idx;		/* current index into bvl_vec */
-
-	/* Number of segments in this BIO after
-	 * physical address coalescing is performed.
-	 */
-	unsigned int		bi_phys_segments;
-
-	unsigned int		bi_size;	/* residual I/O count */
-
-	/*
-	 * To keep track of the max segment size, we account for the
-	 * sizes of the first and last mergeable segments in this bio.
-	 */
-	unsigned int		bi_seg_front_size;
-	unsigned int		bi_seg_back_size;
-
-	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
-
-	unsigned int		bi_comp_cpu;	/* completion CPU */
-
-	atomic_t		bi_cnt;		/* pin count */
-
-	struct bio_vec		*bi_io_vec;	/* the actual vec list */
-
-	bio_end_io_t		*bi_end_io;
-
-	void			*bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
-	struct bio_integrity_payload *bi_integrity;  /* data integrity */
-#endif
-
-	bio_destructor_t	*bi_destructor;	/* destructor */
-
-	/*
-	 * We can inline a number of vecs at the end of the bio, to avoid
-	 * double allocations for a small number of bio_vecs. This member
-	 * MUST obviously be kept at the very end of the bio.
-	 */
-	struct bio_vec		bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE	0	/* ok after I/O completion */
-#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
-#define BIO_EOF		2	/* out-out-bounds error */
-#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
-#define BIO_CLONED	4	/* doesn't own data */
-#define BIO_BOUNCED	5	/* bio is a bounce bio */
-#define BIO_USER_MAPPED 6	/* contains user pages */
-#define BIO_EOPNOTSUPP	7	/* not supported */
-#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
-#define BIO_QUIET	11	/* Make BIO Quiet */
-#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS		(4)
-#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)	
-
-/*
- * Request flags.  For use in the cmd_flags field of struct request, and in
- * bi_rw of struct bio.  Note that some flags are only valid in either one.
- */
-enum rq_flag_bits {
-	/* common flags */
-	__REQ_WRITE,		/* not set, read. set, write */
-	__REQ_FAILFAST_DEV,	/* no driver retries of device errors */
-	__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
-	__REQ_FAILFAST_DRIVER,	/* no driver retries of driver errors */
-
-	__REQ_HARDBARRIER,	/* may not be passed by drive either */
-	__REQ_SYNC,		/* request is sync (sync write or read) */
-	__REQ_META,		/* metadata io request */
-	__REQ_DISCARD,		/* request to discard sectors */
-	__REQ_NOIDLE,		/* don't anticipate more IO after this one */
-
-	/* bio only flags */
-	__REQ_UNPLUG,		/* unplug the immediately after submission */
-	__REQ_RAHEAD,		/* read ahead, can fail anytime */
-
-	/* request only flags */
-	__REQ_SORTED,		/* elevator knows about this request */
-	__REQ_SOFTBARRIER,	/* may not be passed by ioscheduler */
-	__REQ_FUA,		/* forced unit access */
-	__REQ_NOMERGE,		/* don't touch this for merging */
-	__REQ_STARTED,		/* drive already may have started this one */
-	__REQ_DONTPREP,		/* don't call prep for this one */
-	__REQ_QUEUED,		/* uses queueing */
-	__REQ_ELVPRIV,		/* elevator private data attached */
-	__REQ_FAILED,		/* set if the request failed */
-	__REQ_QUIET,		/* don't worry about errors */
-	__REQ_PREEMPT,		/* set for "ide_preempt" requests */
-	__REQ_ORDERED_COLOR,	/* is before or after barrier */
-	__REQ_ALLOCED,		/* request came from our alloc pool */
-	__REQ_COPY_USER,	/* contains copies of user pages */
-	__REQ_INTEGRITY,	/* integrity metadata has been remapped */
-	__REQ_FLUSH,		/* request for cache flush */
-	__REQ_IO_STAT,		/* account I/O stat */
-	__REQ_MIXED_MERGE,	/* merge of different types, fail separately */
-	__REQ_NR_BITS,		/* stops here */
-};
-
-#define REQ_WRITE		(1 << __REQ_WRITE)
-#define REQ_FAILFAST_DEV	(1 << __REQ_FAILFAST_DEV)
-#define REQ_FAILFAST_TRANSPORT	(1 << __REQ_FAILFAST_TRANSPORT)
-#define REQ_FAILFAST_DRIVER	(1 << __REQ_FAILFAST_DRIVER)
-#define REQ_HARDBARRIER		(1 << __REQ_HARDBARRIER)
-#define REQ_SYNC		(1 << __REQ_SYNC)
-#define REQ_META		(1 << __REQ_META)
-#define REQ_DISCARD		(1 << __REQ_DISCARD)
-#define REQ_NOIDLE		(1 << __REQ_NOIDLE)
-
-#define REQ_FAILFAST_MASK \
-	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
-#define REQ_COMMON_MASK \
-	(REQ_WRITE | REQ_FAILFAST_MASK | REQ_HARDBARRIER | REQ_SYNC | \
-	 REQ_META| REQ_DISCARD | REQ_NOIDLE)
-
-#define REQ_UNPLUG		(1 << __REQ_UNPLUG)
-#define REQ_RAHEAD		(1 << __REQ_RAHEAD)
-
-#define REQ_SORTED		(1 << __REQ_SORTED)
-#define REQ_SOFTBARRIER		(1 << __REQ_SOFTBARRIER)
-#define REQ_FUA			(1 << __REQ_FUA)
-#define REQ_NOMERGE		(1 << __REQ_NOMERGE)
-#define REQ_STARTED		(1 << __REQ_STARTED)
-#define REQ_DONTPREP		(1 << __REQ_DONTPREP)
-#define REQ_QUEUED		(1 << __REQ_QUEUED)
-#define REQ_ELVPRIV		(1 << __REQ_ELVPRIV)
-#define REQ_FAILED		(1 << __REQ_FAILED)
-#define REQ_QUIET		(1 << __REQ_QUIET)
-#define REQ_PREEMPT		(1 << __REQ_PREEMPT)
-#define REQ_ORDERED_COLOR	(1 << __REQ_ORDERED_COLOR)
-#define REQ_ALLOCED		(1 << __REQ_ALLOCED)
-#define REQ_COPY_USER		(1 << __REQ_COPY_USER)
-#define REQ_INTEGRITY		(1 << __REQ_INTEGRITY)
-#define REQ_FLUSH		(1 << __REQ_FLUSH)
-#define REQ_IO_STAT		(1 << __REQ_IO_STAT)
-#define REQ_MIXED_MERGE		(1 << __REQ_MIXED_MERGE)
-
-/*
  * upper 16 bits of bi_rw define the io priority of this bio
  */
 #define BIO_PRIO_SHIFT	(8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

 #include <linux/limits.h>
 #include <linux/ioctl.h>
+#include <linux/blk_types.h>

 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
  *			immediately wait on this read without caring about
  *			unplugging.
  * READA		Used for read-ahead operations. Lower priority, and the
- *			 block layer could (in theory) choose to ignore this
+ *			block layer could (in theory) choose to ignore this
  *			request if it runs into resource problems.
  * WRITE		A normal async write. Device will be plugged.
  * SWRITE		Like WRITE, but a special case for ll_rw_block() that
@@ -144,13 +145,13 @@ struct inodes_stat_t {
  *			of this IO.
  *
  */
-#define RW_MASK			1
-#define RWA_MASK		16
+#define RW_MASK			REQ_WRITE
+#define RWA_MASK		REQ_RAHEAD

 #define READ			0
-#define WRITE			1
-#define READA			16 /* readahead - don't block if no resources */
-#define SWRITE			17 /* for ll_rw_block(), wait for buffer lock */
+#define WRITE			RW_MASK
+#define READA			RWA_MASK
+#define SWRITE			(WRITE | READA)

 #define READ_SYNC		(READ | REQ_SYNC | REQ_UNPLUG)
 #define READ_META		(READ | REQ_META)
@@ -2200,7 +2201,6 @@ static inline void insert_inode_hash(str
 extern void file_move(struct file *f, struct list_head *list);
 extern void file_kill(struct file *f);
 #ifdef CONFIG_BLOCK
-struct bio;
 extern void submit_bio(int, struct bio *);
 extern int bdev_read_only(struct block_device *);
 #endif
@@ -2267,7 +2267,6 @@ static inline int xip_truncate_page(stru
 #endif

 #ifdef CONFIG_BLOCK
-struct bio;
 typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
 			    loff_t file_offset);
 void dio_end_io(struct bio *bio, int error);
Index: work/include/linux/blk_types.h
===================================================================
--- /dev/null
+++ work/include/linux/blk_types.h
@@ -0,0 +1,193 @@
+/*
+ * Block data types and constants.  Directly include this file only to
+ * break include dependency loop.
+ */
+#ifndef __LINUX_BLK_TYPES_H
+#define __LINUX_BLK_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+	struct page	*bv_page;
+	unsigned int	bv_len;
+	unsigned int	bv_offset;
+};
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+	sector_t		bi_sector;	/* device address in 512 byte
+						   sectors */
+	struct bio		*bi_next;	/* request queue link */
+	struct block_device	*bi_bdev;
+	unsigned long		bi_flags;	/* status, command, etc */
+	unsigned long		bi_rw;		/* bottom bits READ/WRITE,
+						 * top bits priority
+						 */
+
+	unsigned short		bi_vcnt;	/* how many bio_vec's */
+	unsigned short		bi_idx;		/* current index into bvl_vec */
+
+	/* Number of segments in this BIO after
+	 * physical address coalescing is performed.
+	 */
+	unsigned int		bi_phys_segments;
+
+	unsigned int		bi_size;	/* residual I/O count */
+
+	/*
+	 * To keep track of the max segment size, we account for the
+	 * sizes of the first and last mergeable segments in this bio.
+	 */
+	unsigned int		bi_seg_front_size;
+	unsigned int		bi_seg_back_size;
+
+	unsigned int		bi_max_vecs;	/* max bvl_vecs we can hold */
+
+	unsigned int		bi_comp_cpu;	/* completion CPU */
+
+	atomic_t		bi_cnt;		/* pin count */
+
+	struct bio_vec		*bi_io_vec;	/* the actual vec list */
+
+	bio_end_io_t		*bi_end_io;
+
+	void			*bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+	struct bio_integrity_payload *bi_integrity;  /* data integrity */
+#endif
+
+	bio_destructor_t	*bi_destructor;	/* destructor */
+
+	/*
+	 * We can inline a number of vecs at the end of the bio, to avoid
+	 * double allocations for a small number of bio_vecs. This member
+	 * MUST obviously be kept at the very end of the bio.
+	 */
+	struct bio_vec		bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE	0	/* ok after I/O completion */
+#define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
+#define BIO_EOF		2	/* out-out-bounds error */
+#define BIO_SEG_VALID	3	/* bi_phys_segments valid */
+#define BIO_CLONED	4	/* doesn't own data */
+#define BIO_BOUNCED	5	/* bio is a bounce bio */
+#define BIO_USER_MAPPED 6	/* contains user pages */
+#define BIO_EOPNOTSUPP	7	/* not supported */
+#define BIO_CPU_AFFINE	8	/* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9	/* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10	/* fs owns integrity data, not block layer */
+#define BIO_QUIET	11	/* Make BIO Quiet */
+#define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS		(4)
+#define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * Request flags.  For use in the cmd_flags field of struct request, and in
+ * bi_rw of struct bio.  Note that some flags are only valid in either one.
+ */
+enum rq_flag_bits {
+	/* common flags */
+	__REQ_WRITE,		/* not set, read. set, write */
+	__REQ_FAILFAST_DEV,	/* no driver retries of device errors */
+	__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
+	__REQ_FAILFAST_DRIVER,	/* no driver retries of driver errors */
+
+	__REQ_HARDBARRIER,	/* may not be passed by drive either */
+	__REQ_SYNC,		/* request is sync (sync write or read) */
+	__REQ_META,		/* metadata io request */
+	__REQ_DISCARD,		/* request to discard sectors */
+	__REQ_NOIDLE,		/* don't anticipate more IO after this one */
+
+	/* bio only flags */
+	__REQ_UNPLUG,		/* unplug the immediately after submission */
+	__REQ_RAHEAD,		/* read ahead, can fail anytime */
+
+	/* request only flags */
+	__REQ_SORTED,		/* elevator knows about this request */
+	__REQ_SOFTBARRIER,	/* may not be passed by ioscheduler */
+	__REQ_FUA,		/* forced unit access */
+	__REQ_NOMERGE,		/* don't touch this for merging */
+	__REQ_STARTED,		/* drive already may have started this one */
+	__REQ_DONTPREP,		/* don't call prep for this one */
+	__REQ_QUEUED,		/* uses queueing */
+	__REQ_ELVPRIV,		/* elevator private data attached */
+	__REQ_FAILED,		/* set if the request failed */
+	__REQ_QUIET,		/* don't worry about errors */
+	__REQ_PREEMPT,		/* set for "ide_preempt" requests */
+	__REQ_ORDERED_COLOR,	/* is before or after barrier */
+	__REQ_ALLOCED,		/* request came from our alloc pool */
+	__REQ_COPY_USER,	/* contains copies of user pages */
+	__REQ_INTEGRITY,	/* integrity metadata has been remapped */
+	__REQ_FLUSH,		/* request for cache flush */
+	__REQ_IO_STAT,		/* account I/O stat */
+	__REQ_MIXED_MERGE,	/* merge of different types, fail separately */
+	__REQ_NR_BITS,		/* stops here */
+};
+
+#define REQ_WRITE		(1 << __REQ_WRITE)
+#define REQ_FAILFAST_DEV	(1 << __REQ_FAILFAST_DEV)
+#define REQ_FAILFAST_TRANSPORT	(1 << __REQ_FAILFAST_TRANSPORT)
+#define REQ_FAILFAST_DRIVER	(1 << __REQ_FAILFAST_DRIVER)
+#define REQ_HARDBARRIER		(1 << __REQ_HARDBARRIER)
+#define REQ_SYNC		(1 << __REQ_SYNC)
+#define REQ_META		(1 << __REQ_META)
+#define REQ_DISCARD		(1 << __REQ_DISCARD)
+#define REQ_NOIDLE		(1 << __REQ_NOIDLE)
+
+#define REQ_FAILFAST_MASK \
+	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
+#define REQ_COMMON_MASK \
+	(REQ_WRITE | REQ_FAILFAST_MASK | REQ_HARDBARRIER | REQ_SYNC | \
+	 REQ_META| REQ_DISCARD | REQ_NOIDLE)
+
+#define REQ_UNPLUG		(1 << __REQ_UNPLUG)
+#define REQ_RAHEAD		(1 << __REQ_RAHEAD)
+
+#define REQ_SORTED		(1 << __REQ_SORTED)
+#define REQ_SOFTBARRIER		(1 << __REQ_SOFTBARRIER)
+#define REQ_FUA			(1 << __REQ_FUA)
+#define REQ_NOMERGE		(1 << __REQ_NOMERGE)
+#define REQ_STARTED		(1 << __REQ_STARTED)
+#define REQ_DONTPREP		(1 << __REQ_DONTPREP)
+#define REQ_QUEUED		(1 << __REQ_QUEUED)
+#define REQ_ELVPRIV		(1 << __REQ_ELVPRIV)
+#define REQ_FAILED		(1 << __REQ_FAILED)
+#define REQ_QUIET		(1 << __REQ_QUIET)
+#define REQ_PREEMPT		(1 << __REQ_PREEMPT)
+#define REQ_ORDERED_COLOR	(1 << __REQ_ORDERED_COLOR)
+#define REQ_ALLOCED		(1 << __REQ_ALLOCED)
+#define REQ_COPY_USER		(1 << __REQ_COPY_USER)
+#define REQ_INTEGRITY		(1 << __REQ_INTEGRITY)
+#define REQ_FLUSH		(1 << __REQ_FLUSH)
+#define REQ_IO_STAT		(1 << __REQ_IO_STAT)
+#define REQ_MIXED_MERGE		(1 << __REQ_MIXED_MERGE)
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BLK_TYPES_H */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and  SWRITE to match the corresponding BIO_RW_* bits
  2010-08-03  9:53                       ` [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, " Tejun Heo
@ 2010-08-03 11:15                         ` Jens Axboe
  2010-08-03 11:21                           ` Jens Axboe
  0 siblings, 1 reply; 26+ messages in thread
From: Jens Axboe @ 2010-08-03 11:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

On 2010-08-03 11:53, Tejun Heo wrote:
> Commit a82afdf (block: use the same failfast bits for bio and request)
> moved BIO_RW_* bits around such that they match up with REQ_* bits.
> Unfortunately, fs.h hard coded RW_MASK, RWA_MASK, READ, WRITE, READA
> and SWRITE as 0, 1, 2 and 3, and expected them to match with BIO_RW_*
> bits.  READ/WRITE didn't change but BIO_RW_AHEAD was moved to bit 4
> instead of bit 1, breaking RWA_MASK, READA and SWRITE.
> 
> This patch updates RWA_MASK, READA and SWRITE such that they match the
> BIO_RW_* bits again.  A follow up patch will update the definitions to
> directly use BIO_RW_* bits so that this kind of breakage won't happen
> again.
> 
> Neil also spotted missing RWA_MASK conversion.
> 
> Stable: The offending commit a82afdf was released with v2.6.32, so
> this patch should be applied to all kernels since then but it must
> _NOT_ be applied to kernels earlier than that.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
> Root-caused-by: Neil Brown <neilb@suse.de>
> Cc: Jens Axobe <axboe@kernel.dk>
           ^^^^^

(Too) common typo :-)

Anyway, applied to for-2.6.36, thanks a lot.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and  SWRITE to match the corresponding BIO_RW_* bits
  2010-08-03 11:15                         ` Jens Axboe
@ 2010-08-03 11:21                           ` Jens Axboe
  2010-08-03 15:52                             ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Jens Axboe @ 2010-08-03 11:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

On 2010-08-03 13:15, Jens Axboe wrote:
> On 2010-08-03 11:53, Tejun Heo wrote:
>> Commit a82afdf (block: use the same failfast bits for bio and request)
>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>> Unfortunately, fs.h hard coded RW_MASK, RWA_MASK, READ, WRITE, READA
>> and SWRITE as 0, 1, 2 and 3, and expected them to match with BIO_RW_*
>> bits.  READ/WRITE didn't change but BIO_RW_AHEAD was moved to bit 4
>> instead of bit 1, breaking RWA_MASK, READA and SWRITE.
>>
>> This patch updates RWA_MASK, READA and SWRITE such that they match the
>> BIO_RW_* bits again.  A follow up patch will update the definitions to
>> directly use BIO_RW_* bits so that this kind of breakage won't happen
>> again.
>>
>> Neil also spotted missing RWA_MASK conversion.
>>
>> Stable: The offending commit a82afdf was released with v2.6.32, so
>> this patch should be applied to all kernels since then but it must
>> _NOT_ be applied to kernels earlier than that.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org>
>> Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
>> Root-caused-by: Neil Brown <neilb@suse.de>
>> Cc: Jens Axobe <axboe@kernel.dk>
>            ^^^^^
> 
> (Too) common typo :-)
> 
> Anyway, applied to for-2.6.36, thanks a lot.

Irk, we have an issue:

In file included from fs/coda/psdev.c:48:
include/linux/coda_psdev.h:91:1: warning: "REQ_WRITE" redefined
In file included from include/linux/fs.h:11,
                 from include/linux/proc_fs.h:5,
                 from fs/coda/psdev.c:31:
include/linux/blk_types.h:154:1: warning: this is the location of the previous definition

And from include/linux/coda_psdev.h:

#define REQ_ASYNC  0x1
#define REQ_READ   0x2
#define REQ_WRITE  0x4
#define REQ_ABORT  0x8

which unfortunately seem to not be under __KERNEL__ protection, but
there are things like wait_queue_head_t structs there as well so should
be safe to change.

-- 
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and  SWRITE to match the corresponding BIO_RW_* bits
  2010-08-03 11:21                           ` Jens Axboe
@ 2010-08-03 15:52                             ` Tejun Heo
  2010-08-03 16:02                               ` [PATCH] coda: rename REQ_* to CODA_REQ_* Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2010-08-03 15:52 UTC (permalink / raw)
  To: Jens Axboe, Jan Harkes, coda
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

(cc'ing CODA people)
Hello,

On 08/03/2010 01:21 PM, Jens Axboe wrote:
>>> Signed-off-by: Tejun Heo <tj@kernel.org>
>>> Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
>>> Root-caused-by: Neil Brown <neilb@suse.de>
>>> Cc: Jens Axobe <axboe@kernel.dk>
>>            ^^^^^
>>
>> (Too) common typo :-)
>>
>> Anyway, applied to for-2.6.36, thanks a lot.

Oops, I'm sorry.  Heh, it reminds me of misspelling Linus's last name
as Tolvards in a number of patches.  :-)

> Irk, we have an issue:
> 
> In file included from fs/coda/psdev.c:48:
> include/linux/coda_psdev.h:91:1: warning: "REQ_WRITE" redefined
> In file included from include/linux/fs.h:11,
>                  from include/linux/proc_fs.h:5,
>                  from fs/coda/psdev.c:31:
> include/linux/blk_types.h:154:1: warning: this is the location of the previous definition
> 
> And from include/linux/coda_psdev.h:
> 
> #define REQ_ASYNC  0x1
> #define REQ_READ   0x2
> #define REQ_WRITE  0x4
> #define REQ_ABORT  0x8
> 
> which unfortunately seem to not be under __KERNEL__ protection, but
> there are things like wait_queue_head_t structs there as well so should
> be safe to change.

Yeah, I hate it when symbols in non-core code doesn't have proper
prefix.  Prefixing CODA_ in front of those macros should do it.  Jan
Harkes, would that work for CODA?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH] coda: rename REQ_* to CODA_REQ_*
  2010-08-03 15:52                             ` Tejun Heo
@ 2010-08-03 16:02                               ` Tejun Heo
  2010-08-03 16:11                                 ` Jan Harkes
  2010-08-03 17:31                                 ` Jens Axboe
  0 siblings, 2 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-03 16:02 UTC (permalink / raw)
  To: Jens Axboe, Jan Harkes, coda
  Cc: stable, Vladislav Bolkhovitin, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

REQ_* constants are used for block layer requests causing inconsistent
duplicate definitions of REQ_WRITE.  Rename REQ_* used by coda to
CODA_REQ_*.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
So, something like this.  Build tested only.

Thanks.

 fs/coda/psdev.c            |   14 +++++++-------
 fs/coda/upcall.c           |   12 ++++++------
 include/linux/coda_psdev.h |    8 ++++----
 3 files changed, 17 insertions(+), 17 deletions(-)

Index: work/fs/coda/psdev.c
===================================================================
--- work.orig/fs/coda/psdev.c
+++ work/fs/coda/psdev.c
@@ -177,7 +177,7 @@ static ssize_t coda_psdev_write(struct f
 		nbytes = req->uc_outSize; /* don't have more space! */
 	}
         if (copy_from_user(req->uc_data, buf, nbytes)) {
-		req->uc_flags |= REQ_ABORT;
+		req->uc_flags |= CODA_REQ_ABORT;
 		wake_up(&req->uc_sleep);
 		retval = -EFAULT;
 		goto out;
@@ -185,7 +185,7 @@ static ssize_t coda_psdev_write(struct f

 	/* adjust outsize. is this useful ?? */
         req->uc_outSize = nbytes;	
-        req->uc_flags |= REQ_WRITE;
+        req->uc_flags |= CODA_REQ_WRITE;
 	count = nbytes;

 	/* Convert filedescriptor into a file handle */
@@ -254,8 +254,8 @@ static ssize_t coda_psdev_read(struct fi
 	        retval = -EFAULT;

 	/* If request was not a signal, enqueue and don't free */
-	if (!(req->uc_flags & REQ_ASYNC)) {
-		req->uc_flags |= REQ_READ;
+	if (!(req->uc_flags & CODA_REQ_ASYNC)) {
+		req->uc_flags |= CODA_REQ_READ;
 		list_add_tail(&(req->uc_chain), &vcp->vc_processing);
 		goto out;
 	}
@@ -315,19 +315,19 @@ static int coda_psdev_release(struct ino
 		list_del(&req->uc_chain);

 		/* Async requests need to be freed here */
-		if (req->uc_flags & REQ_ASYNC) {
+		if (req->uc_flags & CODA_REQ_ASYNC) {
 			CODA_FREE(req->uc_data, sizeof(struct coda_in_hdr));
 			kfree(req);
 			continue;
 		}
-		req->uc_flags |= REQ_ABORT;
+		req->uc_flags |= CODA_REQ_ABORT;
 		wake_up(&req->uc_sleep);
 	}

 	list_for_each_entry_safe(req, tmp, &vcp->vc_processing, uc_chain) {
 		list_del(&req->uc_chain);

-		req->uc_flags |= REQ_ABORT;
+		req->uc_flags |= CODA_REQ_ABORT;
 		wake_up(&req->uc_sleep);
 	}

Index: work/fs/coda/upcall.c
===================================================================
--- work.orig/fs/coda/upcall.c
+++ work/fs/coda/upcall.c
@@ -604,7 +604,7 @@ static void coda_unblock_signals(sigset_
 			       (((r)->uc_opcode != CODA_CLOSE && \
 				 (r)->uc_opcode != CODA_STORE && \
 				 (r)->uc_opcode != CODA_RELEASE) || \
-				(r)->uc_flags & REQ_READ))
+				(r)->uc_flags & CODA_REQ_READ))

 static inline void coda_waitfor_upcall(struct upc_req *req)
 {
@@ -624,7 +624,7 @@ static inline void coda_waitfor_upcall(s
 			set_current_state(TASK_UNINTERRUPTIBLE);

 		/* got a reply */
-		if (req->uc_flags & (REQ_WRITE | REQ_ABORT))
+		if (req->uc_flags & (CODA_REQ_WRITE | CODA_REQ_ABORT))
 			break;

 		if (blocked && time_after(jiffies, timeout) &&
@@ -708,7 +708,7 @@ static int coda_upcall(struct venus_comm
 	coda_waitfor_upcall(req);

 	/* Op went through, interrupt or not... */
-	if (req->uc_flags & REQ_WRITE) {
+	if (req->uc_flags & CODA_REQ_WRITE) {
 		out = (union outputArgs *)req->uc_data;
 		/* here we map positive Venus errors to kernel errors */
 		error = -out->oh.result;
@@ -717,13 +717,13 @@ static int coda_upcall(struct venus_comm
 	}

 	error = -EINTR;
-	if ((req->uc_flags & REQ_ABORT) || !signal_pending(current)) {
+	if ((req->uc_flags & CODA_REQ_ABORT) || !signal_pending(current)) {
 		printk(KERN_WARNING "coda: Unexpected interruption.\n");
 		goto exit;
 	}

 	/* Interrupted before venus read it. */
-	if (!(req->uc_flags & REQ_READ))
+	if (!(req->uc_flags & CODA_REQ_READ))
 		goto exit;

 	/* Venus saw the upcall, make sure we can send interrupt signal */
@@ -747,7 +747,7 @@ static int coda_upcall(struct venus_comm
 	sig_inputArgs->ih.opcode = CODA_SIGNAL;
 	sig_inputArgs->ih.unique = req->uc_unique;

-	sig_req->uc_flags = REQ_ASYNC;
+	sig_req->uc_flags = CODA_REQ_ASYNC;
 	sig_req->uc_opcode = sig_inputArgs->ih.opcode;
 	sig_req->uc_unique = sig_inputArgs->ih.unique;
 	sig_req->uc_inSize = sizeof(struct coda_in_hdr);
Index: work/include/linux/coda_psdev.h
===================================================================
--- work.orig/include/linux/coda_psdev.h
+++ work/include/linux/coda_psdev.h
@@ -86,9 +86,9 @@ struct upc_req {
 	wait_queue_head_t   uc_sleep;   /* process' wait queue */
 };

-#define REQ_ASYNC  0x1
-#define REQ_READ   0x2
-#define REQ_WRITE  0x4
-#define REQ_ABORT  0x8
+#define CODA_REQ_ASYNC		0x1
+#define CODA_REQ_READ		0x2
+#define CODA_REQ_WRITE		0x4
+#define CODA_REQ_ABORT		0x8

 #endif

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] coda: rename REQ_* to CODA_REQ_*
  2010-08-03 16:02                               ` [PATCH] coda: rename REQ_* to CODA_REQ_* Tejun Heo
@ 2010-08-03 16:11                                 ` Jan Harkes
  2010-08-03 17:31                                 ` Jens Axboe
  1 sibling, 0 replies; 26+ messages in thread
From: Jan Harkes @ 2010-08-03 16:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, stable, Vladislav Bolkhovitin, Bryan Mesich,
	scst-devel, linux-kernel, linux-raid, dm-devel, Neil Brown

On Tue, Aug 03, 2010 at 06:02:53PM +0200, Tejun Heo wrote:
> REQ_* constants are used for block layer requests causing inconsistent
> duplicate definitions of REQ_WRITE.  Rename REQ_* used by coda to
> CODA_REQ_*.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>

Looks good to me.

Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] coda: rename REQ_* to CODA_REQ_*
  2010-08-03 16:02                               ` [PATCH] coda: rename REQ_* to CODA_REQ_* Tejun Heo
  2010-08-03 16:11                                 ` Jan Harkes
@ 2010-08-03 17:31                                 ` Jens Axboe
  1 sibling, 0 replies; 26+ messages in thread
From: Jens Axboe @ 2010-08-03 17:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Harkes, coda, stable, Vladislav Bolkhovitin, Bryan Mesich,
	scst-devel, linux-kernel, linux-raid, dm-devel, Neil Brown

On 2010-08-03 18:02, Tejun Heo wrote:
> REQ_* constants are used for block layer requests causing inconsistent
> duplicate definitions of REQ_WRITE.  Rename REQ_* used by coda to
> CODA_REQ_*.

Should have been clear, I committed the fixup already:

http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=ceb1fde0d5c7611fdb9004176ac34140d27e745a

Looks byte-for-byte identical :-)

-- 
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-02 14:15                   ` [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
  2010-08-02 21:52                     ` Neil Brown
@ 2010-08-05 18:45                     ` Jeff Moyer
  2010-08-05 18:57                       ` Vladislav Bolkhovitin
  2010-08-06  4:04                       ` Tao Ma
  1 sibling, 2 replies; 26+ messages in thread
From: Jeff Moyer @ 2010-08-05 18:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, stable, Vladislav Bolkhovitin, Bryan Mesich,
	scst-devel, linux-kernel, linux-raid, dm-devel, Neil Brown

Tejun Heo <tj@kernel.org> writes:

> Commit a82afdf (block: use the same failfast bits for bio and request)
> moved BIO_RW_* bits around such that they match up with REQ_* bits.
> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
> 2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
> breaking READA and SWRITE.
>
> This patch updates READA and SWRITE such that they match the BIO_RW_*
> bits again.  A follow up patch will update the definitions to directly
> use BIO_RW_* bits so that this kind of breakage won't happen again.
>
> Stable: The offending commit a82afdf was released with v2.6.32, so
> this patch should be applied to all kernels since then but it must
> _NOT_ be applied to kernels earlier than that.

Would someone be so kind as to remind me how this problem manifests
itself?  I know I read this recently, but my memory and googling skills
are both failing me.  :(

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-05 18:45                     ` Jeff Moyer
@ 2010-08-05 18:57                       ` Vladislav Bolkhovitin
  2010-08-05 19:30                         ` Jeff Moyer
  2010-08-06  4:04                       ` Tao Ma
  1 sibling, 1 reply; 26+ messages in thread
From: Vladislav Bolkhovitin @ 2010-08-05 18:57 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Tejun Heo, Jens Axboe, stable, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown


Jeff Moyer, on 08/05/2010 10:45 PM wrote:
> Tejun Heo<tj@kernel.org>  writes:
>
>> Commit a82afdf (block: use the same failfast bits for bio and request)
>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
>> 2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
>> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
>> breaking READA and SWRITE.
>>
>> This patch updates READA and SWRITE such that they match the BIO_RW_*
>> bits again.  A follow up patch will update the definitions to directly
>> use BIO_RW_* bits so that this kind of breakage won't happen again.
>>
>> Stable: The offending commit a82afdf was released with v2.6.32, so
>> this patch should be applied to all kernels since then but it must
>> _NOT_ be applied to kernels earlier than that.
>
> Would someone be so kind as to remind me how this problem manifests
> itself?  I know I read this recently, but my memory and googling skills
> are both failing me.  :(

See http://lkml.org/lkml/2010/7/28/315 and link inside.

Vlad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-05 18:57                       ` Vladislav Bolkhovitin
@ 2010-08-05 19:30                         ` Jeff Moyer
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff Moyer @ 2010-08-05 19:30 UTC (permalink / raw)
  To: Vladislav Bolkhovitin
  Cc: Tejun Heo, Jens Axboe, stable, Bryan Mesich, scst-devel,
	linux-kernel, linux-raid, dm-devel, Neil Brown

Vladislav Bolkhovitin <vst@vlnb.net> writes:

> Jeff Moyer, on 08/05/2010 10:45 PM wrote:
>> Tejun Heo<tj@kernel.org>  writes:
>>
>>> Commit a82afdf (block: use the same failfast bits for bio and request)
>>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>>> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
>>> 2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
>>> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
>>> breaking READA and SWRITE.
>>>
>>> This patch updates READA and SWRITE such that they match the BIO_RW_*
>>> bits again.  A follow up patch will update the definitions to directly
>>> use BIO_RW_* bits so that this kind of breakage won't happen again.
>>>
>>> Stable: The offending commit a82afdf was released with v2.6.32, so
>>> this patch should be applied to all kernels since then but it must
>>> _NOT_ be applied to kernels earlier than that.
>>
>> Would someone be so kind as to remind me how this problem manifests
>> itself?  I know I read this recently, but my memory and googling skills
>> are both failing me.  :(
>
> See http://lkml.org/lkml/2010/7/28/315 and link inside.

Thanks a ton!

-Jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-05 18:45                     ` Jeff Moyer
  2010-08-05 18:57                       ` Vladislav Bolkhovitin
@ 2010-08-06  4:04                       ` Tao Ma
  2010-08-06  6:29                         ` Tejun Heo
  2010-08-19  8:38                         ` [dm-devel] " Christoph Hellwig
  1 sibling, 2 replies; 26+ messages in thread
From: Tao Ma @ 2010-08-06  4:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Moyer, Jens Axboe, stable, Vladislav Bolkhovitin,
	Bryan Mesich, scst-devel, linux-kernel, linux-raid, dm-devel,
	Neil Brown

Hi Tejun,

On 08/06/2010 02:45 AM, Jeff Moyer wrote:
> Tejun Heo<tj@kernel.org>  writes:
>
>> Commit a82afdf (block: use the same failfast bits for bio and request)
>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
>> 2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
>> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
>> breaking READA and SWRITE.
>>
>> This patch updates READA and SWRITE such that they match the BIO_RW_*
>> bits again.  A follow up patch will update the definitions to directly
>> use BIO_RW_* bits so that this kind of breakage won't happen again.
>>
>> Stable: The offending commit a82afdf was released with v2.6.32, so
>> this patch should be applied to all kernels since then but it must
>> _NOT_ be applied to kernels earlier than that.
If only this patch goes into stable, do we need to change RWA_MASK in 
this patch also(I have seen you change it the 2/2)?

-#define RWA_MASK        2
+#define RWA_MASK        16

Regards,
Tao


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-06  4:04                       ` Tao Ma
@ 2010-08-06  6:29                         ` Tejun Heo
  2010-08-19  8:38                         ` [dm-devel] " Christoph Hellwig
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2010-08-06  6:29 UTC (permalink / raw)
  To: Tao Ma
  Cc: Jeff Moyer, Jens Axboe, stable, Vladislav Bolkhovitin,
	Bryan Mesich, scst-devel, linux-kernel, linux-raid, dm-devel,
	Neil Brown

Hello,

On 08/06/2010 06:04 AM, Tao Ma wrote:
> If only this patch goes into stable, do we need to change RWA_MASK in this patch also(I have seen you change it the 2/2)?
> 
> -#define RWA_MASK        2
> +#define RWA_MASK        16

Yeah, Neil already pointed it out and updated patch has been posted
and merged.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [dm-devel] [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits
  2010-08-06  4:04                       ` Tao Ma
  2010-08-06  6:29                         ` Tejun Heo
@ 2010-08-19  8:38                         ` Christoph Hellwig
  1 sibling, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2010-08-19  8:38 UTC (permalink / raw)
  To: device-mapper development
  Cc: Tejun Heo, Vladislav Bolkhovitin, linux-kernel, Bryan Mesich,
	Jens Axboe, linux-raid, Jeff Moyer, scst-devel, stable

On Fri, Aug 06, 2010 at 12:04:43PM +0800, Tao Ma wrote:
> >>Stable: The offending commit a82afdf was released with v2.6.32, so
> >>this patch should be applied to all kernels since then but it must
> >>_NOT_ be applied to kernels earlier than that.
> If only this patch goes into stable, do we need to change RWA_MASK
> in this patch also(I have seen you change it the 2/2)?
> 
> -#define RWA_MASK        2
> +#define RWA_MASK        16

I think RW_MASK and RWA_MASK should simply go away, they don't server
any use over just masking the correct REQ_* flags.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2010-08-19  8:39 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20100628010346.GA2376@atlantis.cc.ndsu.nodak.edu>
     [not found] ` <4C28EFD6.2070203@vlnb.net>
     [not found]   ` <20100714190325.GA25148@atlantis.cc.ndsu.nodak.edu>
     [not found]     ` <4C3EF3AD.5070509@vlnb.net>
     [not found]       ` <20100723191844.GB31152@atlantis.cc.ndsu.nodak.edu>
     [not found]         ` <4C4D7DF5.9060909@vlnb.net>
     [not found]           ` <20100727220110.GF31152@atlantis.cc.ndsu.nodak.edu>
2010-07-28 18:16             ` RAID/block regression starting from 2.6.32, bisected Vladislav Bolkhovitin
2010-07-30 10:29               ` Tejun Heo
2010-08-02  0:42                 ` Neil Brown
2010-08-02 14:12                   ` [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
2010-08-02 14:13                     ` [PATCH 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo
2010-08-02 14:15                     ` [PATCH RESEND " Tejun Heo
2010-08-02 14:18                     ` Tejun Heo
2010-08-02 14:15                   ` [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits Tejun Heo
2010-08-02 21:52                     ` Neil Brown
2010-08-03  9:27                       ` Tejun Heo
2010-08-05 18:45                     ` Jeff Moyer
2010-08-05 18:57                       ` Vladislav Bolkhovitin
2010-08-05 19:30                         ` Jeff Moyer
2010-08-06  4:04                       ` Tao Ma
2010-08-06  6:29                         ` Tejun Heo
2010-08-19  8:38                         ` [dm-devel] " Christoph Hellwig
2010-08-02 14:17                   ` Tejun Heo
2010-08-02 20:04                     ` Jens Axboe
2010-08-03  9:53                       ` [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, " Tejun Heo
2010-08-03 11:15                         ` Jens Axboe
2010-08-03 11:21                           ` Jens Axboe
2010-08-03 15:52                             ` Tejun Heo
2010-08-03 16:02                               ` [PATCH] coda: rename REQ_* to CODA_REQ_* Tejun Heo
2010-08-03 16:11                                 ` Jan Harkes
2010-08-03 17:31                                 ` Jens Axboe
2010-08-03  9:54                       ` [PATCH 2/2 block#for-2.6.36] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).