* Re: MTD RAID
       [not found] <CA+qeAOpuZ0CXZP8tCWdhoVvTEKAw26gtz63-UJmQ4XLSXAd=Yg@mail.gmail.com>
@ 2016-08-19  6:49 ` Boris Brezillon
  2016-08-19  7:08   ` Dongsheng Yang
  2016-08-23  3:44   ` Dongsheng Yang
  0 siblings, 2 replies; 20+ messages in thread
From: Boris Brezillon @ 2016-08-19  6:49 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: linux-mtd, dmitry.torokhov, Dongsheng Yang, shengyong1, starvik,
	richard, linux-cris-kernel, Colin King, jschultz, Ard Biesheuvel,
	David Woodhouse, asierra, jesper.nilsson, fabf,
	dooooongsheng.yang, Brian Norris, mtownsend1973

Hi Dongsheng,

On Fri, 19 Aug 2016 14:34:54 +0800
Dongsheng Yang <dongsheng081251@gmail.com> wrote:

> Hi guys,
>     This is an email about MTD RAID.
> 
> *Code:*
>     kernel:
> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7

Just had a quick look at the code, and I see at least one major problem
in your RAID-1 implementation: you're ignoring the fact that NAND blocks
can be or become bad. What's the plan for that?

Regards,

Boris


* Re: MTD RAID
  2016-08-19  6:49 ` MTD RAID Boris Brezillon
@ 2016-08-19  7:08   ` Dongsheng Yang
  2016-08-19  7:15     ` Dongsheng Yang
  2016-08-19  8:20     ` Boris Brezillon
  2016-08-23  3:44   ` Dongsheng Yang
  1 sibling, 2 replies; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-19  7:08 UTC (permalink / raw)
  To: Boris Brezillon, Dongsheng Yang
  Cc: fabf, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, richard, dmitry.torokhov,
	dooooongsheng.yang, jschultz, starvik, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, David Woodhouse



On 08/19/2016 02:49 PM, Boris Brezillon wrote:
> Hi Dongsheng,
>
> On Fri, 19 Aug 2016 14:34:54 +0800
> Dongsheng Yang <dongsheng081251@gmail.com> wrote:
>
>> Hi guys,
>>      This is a email about MTD RAID.
>>
>> *Code:*
>>      kernel:
>> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7
> Just had a quick look at the code, and I see at least one major problem
> in your RAID-1 implementation: you're ignoring the fact that NAND blocks
> can be or become bad. What's the plan for that?

Hi Boris,
     Thanks for your quick reply.

     When you are using RAID-1, an erase erases all of the mirrored blocks.
If there is a bad block among them, mtd_raid_erase will return an error and
the userspace tool
or UBI will mark this block as bad; that means
mtd_raid_block_markbad() will mark all of the
mirrored blocks as bad, although some of them are still good.
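
A rough sketch of what I mean (the struct fields and helpers here are
illustrative, not the exact code from my tree):

     /*
      * Illustrative sketch: mark the block at this offset bad on every
      * member device, even though some of the copies may still be good.
      */
     static int mtd_raid_block_markbad(struct mtd_info *mtd, loff_t ofs)
     {
         struct mtd_raid *raid = mtd_to_raid(mtd);   /* assumed helper */
         int i, err, ret = 0;

         for (i = 0; i < raid->ncopies; i++) {
             err = mtd_block_markbad(raid->members[i], ofs);
             if (err)
                 ret = err;
         }

         return ret;
     }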

In addition, consider the case where you have data in flash with RAID-1
and one block becomes bad. For example,
mtd0 and mtd1 are used to build a RAID-1 device mtd2, and while you
are using mtd2
you find that a block has become bad. Don't worry about data
loss: the data is still
saved in the good mirror, and you can replace the bad device with
another new mtd device.

My plan for this feature lives entirely in the userspace tool:
(1). mtd_raid scan mtd2   <---- this will show the status of the RAID device
and each of its members.
(2). mtd_raid replace mtd2 --old mtd1 --new mtd3   <---- this will
replace the bad device mtd1 with mtd3.

What about this idea?

Yang
>
> Regards,
>
> Boris


* Re: MTD RAID
  2016-08-19  7:08   ` Dongsheng Yang
@ 2016-08-19  7:15     ` Dongsheng Yang
  2016-08-19  7:28       ` Dongsheng Yang
  2016-08-19  8:20     ` Boris Brezillon
  1 sibling, 1 reply; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-19  7:15 UTC (permalink / raw)
  To: Boris Brezillon, Dongsheng Yang
  Cc: starvik, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, richard, dmitry.torokhov,
	dooooongsheng.yang, jschultz, fabf, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, David Woodhouse

In addition, the current implementation actually has a retry in the read path.


     /* all copies exhausted? then give up */
     if (++i_copy >= raid->ncopies)
         goto out;

     /* otherwise retry the read from the next mirror */
     ret = mtd_raid_ctx_retry(ctx, i_copy);


That means we can read the good copy from a RAID-1 device even when one
of the member devices is bad.
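
Condensed, and with the context structure simplified for illustration,
the whole read path is roughly:

     /*
      * Simplified sketch: try each copy in turn until one reads back
      * without an uncorrectable error.
      */
     static int mtd_raid1_read(struct mtd_raid *raid, loff_t ofs,
                               size_t len, size_t *retlen, u_char *buf)
     {
         int i_copy = 0, ret;

         for (;;) {
             ret = mtd_read(raid->members[i_copy], ofs, len, retlen, buf);
             if (ret >= 0 || ret == -EUCLEAN)
                 return ret;    /* good copy, possibly with bitflips */

             if (++i_copy >= raid->ncopies)
                 return ret;    /* all copies failed */
             /* else retry from the next mirror */
         }
     }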

Yang



* Re: MTD RAID
  2016-08-19  7:15     ` Dongsheng Yang
@ 2016-08-19  7:28       ` Dongsheng Yang
  0 siblings, 0 replies; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-19  7:28 UTC (permalink / raw)
  To: Boris Brezillon, Dongsheng Yang
  Cc: fabf, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, richard, dmitry.torokhov,
	dooooongsheng.yang, jschultz, starvik, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, David Woodhouse

Okay, another idea about this: when we are writing data in UBI and we
hit a write error like the one below.



ubi_eba_write_leb():

... ...
write_error:
     if (err != -EIO || !ubi->bad_allowed) {
         ubi_ro_mode(ubi);
         leb_write_unlock(ubi, vol_id, lnum);
         ubi_free_vid_hdr(ubi, vid_hdr);
         return err;
     }

     /*
      * Fortunately, this is the first write operation to this physical
      * eraseblock, so just put it and request a new one. We assume that if
      * this physical eraseblock went bad, the erase code will handle that.
      */
     err = ubi_wl_put_peb(ubi, vol_id, lnum, pnum, 1);
     if (err || ++tries > UBI_IO_RETRIES) {
         ubi_ro_mode(ubi);
         leb_write_unlock(ubi, vol_id, lnum);
         ubi_free_vid_hdr(ubi, vid_hdr);
         return err;
     }

     vid_hdr->sqnum = cpu_to_be64(ubi_next_sqnum(ubi));
     ubi_msg(ubi, "try another PEB");
     goto retry;
}

Okay, in this case, if this is the first write to this block, that's
fortunate. But if not, we will lose our data.

But if we are using a RAID-1 device, we can improve this case in
UBI: we can migrate the data in this block first and then
mark it as bad. Because we have mirrors in RAID-1, we can still
read the data from the good copy.
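
In pseudo-C, the recovery step could look like this (the helper is
invented for illustration; only the idea matters):

     /*
      * Illustration only: before retiring a mirrored PEB that already
      * holds data, rescue its contents through the RAID-1 read path
      * (which falls back to the surviving mirror), then let the normal
      * "try another PEB" retry path re-write the rescued data.
      */
     static int raid1_salvage_peb(struct ubi_device *ubi, int pnum,
                                  void *buf, int len)
     {
         int err;

         err = ubi_io_read_data(ubi, buf, pnum, 0, len);
         if (err && err != UBI_IO_BITFLIPS)
             return err;    /* both copies gone: real data loss */

         /* data rescued, now it is safe to mark the block bad */
         return ubi_io_mark_bad(ubi, pnum);
     }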

Sounds good?

Yang



* Re: MTD RAID
  2016-08-19  7:08   ` Dongsheng Yang
  2016-08-19  7:15     ` Dongsheng Yang
@ 2016-08-19  8:20     ` Boris Brezillon
       [not found]       ` <CA+qeAOrSAi9uTHGCi-5cAJpM_O45oJUihNP-rHHa1FWL7_ZKHQ@mail.gmail.com>
       [not found]       ` <57B6CC7B.1060208@easystack.cn>
  1 sibling, 2 replies; 20+ messages in thread
From: Boris Brezillon @ 2016-08-19  8:20 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: Dongsheng Yang, fabf, jesper.nilsson, Dongsheng Yang,
	linux-cris-kernel, shengyong1, Ard Biesheuvel, richard,
	dmitry.torokhov, dooooongsheng.yang, jschultz, starvik,
	mtownsend1973, linux-mtd, Colin King, asierra, Brian Norris,
	David Woodhouse

On Fri, 19 Aug 2016 15:08:35 +0800
Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:

> On 08/19/2016 02:49 PM, Boris Brezillon wrote:
> > Hi Dongsheng,
> >
> > On Fri, 19 Aug 2016 14:34:54 +0800
> > Dongsheng Yang <dongsheng081251@gmail.com> wrote:
> >  
> >> Hi guys,
> >>      This is a email about MTD RAID.
> >>
> >> *Code:*
> >>      kernel:
> >> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7  
> > Just had a quick look at the code, and I see at least one major problem
> > in your RAID-1 implementation: you're ignoring the fact that NAND blocks
> > can be or become bad. What's the plan for that?  
> 
> Hi Boris,
>      Thanx for your quick reply.
> 
>      When you are using RAID-1, it would erase the all mirrored blockes 
> when you are erasing.
> if there is a bad block in them, mtd_raid_erase will return an error and 
> the userspace tool
> or ubi will mark this block as bad, that means, the 
> mtd_raid_block_markbad() will mark the all
>   mirrored blocks as bad, although some of it are good.
> 
> In addition, when you have data in flash with RAID-1, if one block 
> become bad. For example,
> when the mtd0 and mtd1 are used to build a RAID-1 device mtd2. When you 
> are using mtd2
> and you found there is a block become bad. Don't worry about data 
> losing, the data is still
> saved in the good one mirror. you can replace the bad one device with 
> another new mtd device.

Okay, good to see you were aware of this problem.

> 
> My plan about this feature is all on the userspace tool.
> (1). mtd_raid scan mtd2 <---- this will show the status of RAID device 
> and each member of it.
> (2). mtd_raid replace mtd2 --old mtd1 --new mtd3.   <---- this will 
> replace the bad one mtd1 with mtd3.
> 
> What about this idea?

Not sure I follow you on #2. And, IMO, you should not depend on a
userspace tool to detect and address this kind of problem.

Okay, a few more questions.

1/ What about data retention issues? Say you read from the main MTD, and
it does not show uncorrectable errors, so you keep reading on it, but,
since you're never reading from the mirror, you can't detect if there
are some uncorrectable errors or if the number of bitflips exceeds the
threshold used to trigger a data move. If suddenly a page in your main
MTD becomes unreadable, you're not guaranteed that the mirror page will
be valid :-/.

2/ How do you handle write atomicity in RAID1? I don't know exactly
how RAID1 works, but I guess there's a mechanism (a journal?) to detect
that data has been written on the main MTD but not on the mirror, so
that you can replay the operation after a power-cut. Do you handle this
case correctly?

On a general note, I don't think it's wise to place the RAID layer at
the MTD level. How about placing it at the UBI level (pick 2 ubi
volumes to create one UBI-RAID element)? This way you don't have to
bother about bad block handling (you're manipulating logical blocks
which can be anywhere on the NAND).
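
Roughly like this, using the existing in-kernel UBI API (the wrapper
itself is of course hypothetical):

     #include <linux/mtd/ubi.h>

     /*
      * Hypothetical UBI-RAID read: read a logical block from the main
      * volume and fall back to the mirror volume on failure. Bad block
      * handling is already done by UBI underneath.
      */
     static int ubiraid_leb_read(struct ubi_volume_desc *main_vol,
                                 struct ubi_volume_desc *mirror_vol,
                                 int lnum, char *buf, int offset, int len)
     {
         int err = ubi_leb_read(main_vol, lnum, buf, offset, len, 1);

         if (err)
             err = ubi_leb_read(mirror_vol, lnum, buf, offset, len, 1);

         return err;
     }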

One last question: what's the real goal of this MTD-RAID layer? If
that's about addressing the MLC/TLC NAND reliability problems, I don't
think it's such a good idea.

Regards,

Boris


* Re: MTD RAID
       [not found]       ` <CA+qeAOrSAi9uTHGCi-5cAJpM_O45oJUihNP-rHHa1FWL7_ZKHQ@mail.gmail.com>
@ 2016-08-19  9:37         ` Boris Brezillon
  2016-08-19 10:22           ` Dongsheng Yang
  0 siblings, 1 reply; 20+ messages in thread
From: Boris Brezillon @ 2016-08-19  9:37 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: Dongsheng Yang, fabf, jesper.nilsson, Dongsheng Yang,
	linux-cris-kernel, shengyong1, Ard Biesheuvel, richard,
	dmitry.torokhov, dooooongsheng.yang, jschultz, starvik,
	mtownsend1973, linux-mtd, Colin King, asierra, Brian Norris,
	David Woodhouse

On Fri, 19 Aug 2016 17:15:56 +0800
Dongsheng Yang <dongsheng081251@gmail.com> wrote:

> Hi Boris,
> 
> On Fri, Aug 19, 2016 at 4:20 PM, Boris Brezillon <
> boris.brezillon@free-electrons.com> wrote:
> 
> > On Fri, 19 Aug 2016 15:08:35 +0800
> > Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:
> >  
> > > On 08/19/2016 02:49 PM, Boris Brezillon wrote:  
> > > > Hi Dongsheng,
> > > >
> > > > On Fri, 19 Aug 2016 14:34:54 +0800
> > > > Dongsheng Yang <dongsheng081251@gmail.com> wrote:
> > > >  
> > > >> Hi guys,
> > > >>      This is a email about MTD RAID.
> > > >>
> > > >> *Code:*
> > > >>      kernel:
> > > >> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7  
> > > > Just had a quick look at the code, and I see at least one major problem
> > > > in your RAID-1 implementation: you're ignoring the fact that NAND  
> > blocks  
> > > > can be or become bad. What's the plan for that?  
> > >
> > > Hi Boris,
> > >      Thanx for your quick reply.
> > >
> > >      When you are using RAID-1, it would erase the all mirrored blockes
> > > when you are erasing.
> > > if there is a bad block in them, mtd_raid_erase will return an error and
> > > the userspace tool
> > > or ubi will mark this block as bad, that means, the
> > > mtd_raid_block_markbad() will mark the all
> > >   mirrored blocks as bad, although some of it are good.
> > >
> > > In addition, when you have data in flash with RAID-1, if one block
> > > become bad. For example,
> > > when the mtd0 and mtd1 are used to build a RAID-1 device mtd2. When you
> > > are using mtd2
> > > and you found there is a block become bad. Don't worry about data
> > > losing, the data is still
> > > saved in the good one mirror. you can replace the bad one device with
> > > another new mtd device.  
> >
> > Okay, good to see you were aware of this problem.
> >  
> > >
> > > My plan about this feature is all on the userspace tool.
> > > (1). mtd_raid scan mtd2 <---- this will show the status of RAID device
> > > and each member of it.
> > > (2). mtd_raid replace mtd2 --old mtd1 --new mtd3.   <---- this will
> > > replace the bad one mtd1 with mtd3.
> > >
> > > What about this idea?  
> >
> > Not sure I follow you on #2. And, IMO, you should not depend on a
> > userspace tool to detect address this kind of problems.
> >
> > Okay, a few more questions.
> >
> > 1/ What about data retention issues? Say you read from the main MTD, and
> > it does not show uncorrectable errors, so you keep reading on it, but,
> > since you're never reading from the mirror, you can't detect if there
> > are some uncorrectable errors or if the number of bitflips exceed the
> > threshold used to trigger a data move. If suddenly a page in your main
> > MTD becomes unreadable, you're not guaranteed that the mirror page will
> > be valid :-/.
> >  
> 
> Yes, that could happen. But that's a case where the main MTD and the mirror
> become bad at the same time. Yes, that's possible, but that's much rarer
> than just one MTD going bad, right?

Absolutely not, that's actually more likely than getting bad blocks. If
you're not regularly reading your data, it can become bad with no way
to recover from it.

> That's what RAID-1 is for. If you want
> to solve this problem, just increase the number of mirrors. Then you can make
> your data safer and safer.

Except the number of bitflips is likely to increase over time, so if
you never read your mirror blocks because the main MTD is working fine,
you may not be able to read data back when you really need it.
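
The usual cure is a background scrub that periodically re-reads every
copy. A minimal sketch, with the raid structure assumed, just to show
what I mean:

     /*
      * Re-read every copy so bitflips on the idle mirror are caught
      * while they are still correctable (-EUCLEAN) instead of being
      * discovered as -EBADMSG when the main MTD finally fails.
      */
     static void mtd_raid_scrub(struct mtd_raid *raid, u_char *buf,
                                size_t len)
     {
         size_t retlen;
         loff_t ofs;
         int i, err;

         for (i = 0; i < raid->ncopies; i++)
             for (ofs = 0; ofs < raid->members[i]->size; ofs += len) {
                 err = mtd_read(raid->members[i], ofs, len, &retlen, buf);
                 if (err == -EUCLEAN || err == -EBADMSG)
                     pr_warn("copy %d needs a refresh at %lld\n",
                             i, (long long)ofs);
             }
     }

Nothing in the code you posted does anything like this, as far as I can
see.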

> 
> >
> > 2/ How do you handle write atomicity in RAID1? I don't know exactly
> > how RAID1 works, but I guess there's a mechanism (a journal?) to detect
> > that data has been written on the main MTD but not on the mirror, so
> > that you can replay the operation after a power-cut. Do handle this
> > case correctly?
> >  
> 
> No, but the redundancy of RAID levels is designed to protect against a
> *disk* failure,
> not against a *power* failure; that's a responsibility of UBIFS. When
> UBIFS replays,
> the incomplete writes will be abandoned.

And again, you're missing one important point. UBI and UBIFS are
sitting on your RAID layer. If the mirror MTD is corrupted because of
a power-cut, but the main one is working fine, UBI and UBIFS won't
notice until you really need to use the mirror, and by then it's already
too late.

> 
> >
> > On a general note, I don't think it's wise to place the RAID layer at
> > the MTD level. How about placing it at the UBI level (pick 2 ubi
> > volumes to create one UBI-RAID element)? This way you don't have to
> > bother about bad block handling (you're manipulating logical blocks
> > which can be anywhere on the NAND).
> >  
> 
> 
> But how can we handle the multiple-chips problem? Some drivers
> combine multiple chips into one single mtd device, which is what
> mtd_concat is doing.

You can simply pick 2 UBI volumes from 2 UBI devices (each one attached
to a different MTD device).

> 
> >
> > One last question? What's the real goal of this MTD-RAID layer? If
> > that's about addressing the MLC/TLC NAND reliability problems, I don't
> > think it's such a good idea.
> >  
> 
> Oh, that's not the main problem I want to solve. RAID-1 is just a possible
> extension based on my RAID framework.
> 
> This work started with only RAID0, which is used to harness lots
> of flash to improve performance. Then I refactored it into an MTD RAID
> framework, so we can implement other RAID levels for mtd.
> 
> Example:
>     In our production, there are 40+ chips attached to one PCIe card.
> We need to present all of them as one mtd device. At the same
> time, we need to consider how to manage these chips. Finally we chose
> RAID0 mode for them, and got a great performance result.
> 
> So, the multiple-chips scenario is the original problem I want to solve, and
> then I found I could refactor it for other RAID levels.

So all you need is a way to concatenate MTD devices (are we talking
about NAND devices?). It shouldn't be too hard to define something
like an MTD-cluster aggregating several similar MTD devices to provide
a single MTD. But I'd really advise you to drop the MTD-RAID idea and
focus on your real/simple need: aggregating MTD devices.
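
For reference, the existing mtdconcat API already gives you the
skeleton for this; a minimal sketch of such an aggregate (illustrative
only, and a real MTD-cluster would still need proper NAND bad-block
handling on top):

     #include <linux/mtd/mtd.h>
     #include <linux/mtd/concat.h>

     /* Glue the member devices into one MTD and register it. */
     static struct mtd_info *mtd_cluster_create(struct mtd_info *members[],
                                                int n)
     {
         struct mtd_info *mtd = mtd_concat_create(members, n, "mtd-cluster");

         if (mtd && mtd_device_register(mtd, NULL, 0)) {
             mtd_concat_destroy(mtd);
             mtd = NULL;
         }

         return mtd;
     }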


* Re: MTD RAID
       [not found]       ` <57B6CC7B.1060208@easystack.cn>
@ 2016-08-19  9:47         ` Richard Weinberger
  2016-08-19 10:30           ` Dongsheng Yang
  2016-08-19 10:30           ` Artem Bityutskiy
  0 siblings, 2 replies; 20+ messages in thread
From: Richard Weinberger @ 2016-08-19  9:47 UTC (permalink / raw)
  To: Dongsheng Yang, Boris Brezillon
  Cc: Dongsheng Yang, fabf, jesper.nilsson, Dongsheng Yang,
	linux-cris-kernel, shengyong1, Ard Biesheuvel, dmitry.torokhov,
	dooooongsheng.yang, jschultz, starvik, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, David Woodhouse

Yang,

Sorry if I ask already-answered questions; the mail thread
grows faster than I can type. ;)

On 19.08.2016 11:08, Dongsheng Yang wrote:
> But how can we handle the multiple chips problem? Some drivers
> are combining multiple chips to one single mtd device, what the
> mtd_concat is doing.

mtd_concat is a horrid and old hack which was used to combine very
small flashes.

>> One last question? What's the real goal of this MTD-RAID layer? If
>> that's about addressing the MLC/TLC NAND reliability problems, I don't
>> think it's such a good idea.
> 
> Oh, that's not the main problem I want to solve. RAID-1 is just a possible
>  extension base on my RAID framework.
> 
> This work is started for only RAID0, which is used to take the use of lots
> of flash to improve performance. Then I refactored it to a MTD RAID
> framework. Then we can implement other raid level for mtd.
> 
> Example:
>     In our production, there are 40+ chips attached on one pcie card.
> Then we need to simulate all of them into one mtd device. At the same
> time, we need to consider how to manage these chips. Finally we chose
> a RAID0 mode for them. And got a great performance result.
> 
> So, the multiple chips scenario is the original problem I want to solve. And
> then I found I can refactor it for other RAID-levels.

Are your chips NAND flash?
Combining non-NAND is not that hard; mtd_concat already does it.
With NAND, as Boris pointed out, it becomes complicated.

So, what you propose as MTD RAID is not really RAID.
RAID is about redundant disks, i.e. you can replace them upon failure.
The problem you address is combining multiple chips,
IOW an SSD emulated in software.

I think we don't need replication on MTD. An improved mtd_concat
does not sound bad but a proper implementation has to deal with
all the nastiness of NAND.

Thanks,
//richard


* Re: MTD RAID
  2016-08-19  9:37         ` Boris Brezillon
@ 2016-08-19 10:22           ` Dongsheng Yang
  2016-08-19 11:36             ` Boris Brezillon
  0 siblings, 1 reply; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-19 10:22 UTC (permalink / raw)
  To: Boris Brezillon, Dongsheng Yang
  Cc: starvik, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, richard, dmitry.torokhov,
	dooooongsheng.yang, jschultz, fabf, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, David Woodhouse



On 08/19/2016 05:37 PM, Boris Brezillon wrote:
> On Fri, 19 Aug 2016 17:15:56 +0800
> Dongsheng Yang <dongsheng081251@gmail.com> wrote:
>
>> Hi Boris,
>>
>> On Fri, Aug 19, 2016 at 4:20 PM, Boris Brezillon <
>> boris.brezillon@free-electrons.com> wrote:
>>
>>> On Fri, 19 Aug 2016 15:08:35 +0800
>>> Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:
>>>   
>>>> On 08/19/2016 02:49 PM, Boris Brezillon wrote:
>>>>> Hi Dongsheng,
>>>>>
>>>>> On Fri, 19 Aug 2016 14:34:54 +0800
>>>>> Dongsheng Yang <dongsheng081251@gmail.com> wrote:
>>>>>   
>>>>>> Hi guys,
>>>>>>       This is a email about MTD RAID.
>>>>>>
>>>>>> *Code:*
>>>>>>       kernel:
>>>>>> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7
>>>>> Just had a quick look at the code, and I see at least one major problem
>>>>> in your RAID-1 implementation: you're ignoring the fact that NAND
>>> blocks
>>>>> can be or become bad. What's the plan for that?
>>>> Hi Boris,
>>>>       Thanx for your quick reply.
>>>>
>>>>       When you are using RAID-1, it would erase the all mirrored blockes
>>>> when you are erasing.
>>>> if there is a bad block in them, mtd_raid_erase will return an error and
>>>> the userspace tool
>>>> or ubi will mark this block as bad, that means, the
>>>> mtd_raid_block_markbad() will mark the all
>>>>    mirrored blocks as bad, although some of it are good.
>>>>
>>>> In addition, when you have data in flash with RAID-1, if one block
>>>> become bad. For example,
>>>> when the mtd0 and mtd1 are used to build a RAID-1 device mtd2. When you
>>>> are using mtd2
>>>> and you found there is a block become bad. Don't worry about data
>>>> losing, the data is still
>>>> saved in the good one mirror. you can replace the bad one device with
>>>> another new mtd device.
>>> Okay, good to see you were aware of this problem.
>>>   
>>>> My plan about this feature is all on the userspace tool.
>>>> (1). mtd_raid scan mtd2 <---- this will show the status of RAID device
>>>> and each member of it.
>>>> (2). mtd_raid replace mtd2 --old mtd1 --new mtd3.   <---- this will
>>>> replace the bad one mtd1 with mtd3.
>>>>
>>>> What about this idea?
>>> Not sure I follow you on #2. And, IMO, you should not depend on a
>>> userspace tool to detect address this kind of problems.
>>>
>>> Okay, a few more questions.
>>>
>>> 1/ What about data retention issues? Say you read from the main MTD, and
>>> it does not show uncorrectable errors, so you keep reading on it, but,
>>> since you're never reading from the mirror, you can't detect if there
>>> are some uncorrectable errors or if the number of bitflips exceed the
>>> threshold used to trigger a data move. If suddenly a page in your main
>>> MTD becomes unreadable, you're not guaranteed that the mirror page will
>>> be valid :-/.
>>>   
>> Yes, that could happen. But that's a case where main MTD and mirror bacome
>> bad at the same time. Yes, that's possible, but that's much rare than
>> pure one MTD going to bad, right?
> Absolutely not, that's actually more likely than getting bad blocks. If
> you're not regularly reading your data they can become bad with no way
> to recover from it.
>
>> That's what RAID-1 want. If you want
>> to solve this problem, just increase the number of mirror. Then you can make
>> your data safer and safer.
> Except the number of bitflips is likely to increase over time, so if
> you never read your mirror blocks because the main MTD is working fine,
> you may not be able to read data back when you really need it.

Sorry, I am afraid I did not get your point. But in general, I believe
it's safer to have two copies of data than just one. Could you explain
more? Thanks. :)
>
>>> 2/ How do you handle write atomicity in RAID1? I don't know exactly
>>> how RAID1 works, but I guess there's a mechanism (a journal?) to detect
>>> that data has been written on the main MTD but not on the mirror, so
>>> that you can replay the operation after a power-cut. Do handle this
>>> case correctly?
>>>   
>> No, but the redundancy of RAID levels is designed to protect against a
>> *disk* failure,
>> not against a *power* failure, that's a responsibility of ubifs. when the
>> ubifs replay,
>> the not completed writing will be abandoned.
> And again, you're missing one important point. UBI and UBIFS are
> sitting on your RAID layer. If the mirror MTD is corrupted because of
> a power-cut, but the main one is working fine, UBI and UBIFS won't
> notice, until you really need to use the mirror, and it's already too
> late.
Actually, there is already an answer to this question for RAID-1:

https://linas.org/linux/Software-RAID/Software-RAID-4.html

But I am glad to figure out what we can do in this case.
At this moment, I think doing a RAID check over all copies of the data
while UBIFS is recovering sounds possible.

>
>>> On a general note, I don't think it's wise to place the RAID layer at
>>> the MTD level. How about placing it at the UBI level (pick 2 ubi
>>> volumes to create one UBI-RAID element)? This way you don't have to
>>> bother about bad block handling (you're manipulating logical blocks
>>> which can be anywhere on the NAND).
>>>   
>>
>> But how can we handle the multiple chips problem? Some drivers
>> are combining multiple chips to one single mtd device, what the
>> mtd_concat is doing.
> You can either pick 2 UBI volumes from 2 UBI devices (each one attached
> to a different MTD device).

Yes, but I am afraid we don't want to expose all our chips.

Please consider this scenario: one PCIe card with attached chips, where
we only want the user
to see just one mtd device, /dev/mtd0, rather than 40+ mtd devices. So
we need to call
mtd_raid_create() in the driver for this card.
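
Schematically it is like below (the exact mtd_raid_create() signature
and the MTD_RAID0 constant are simplified here, not the real ones from
my tree):

     /*
      * PCIe card driver: combine all the chips into one RAID-0 device
      * so userspace only ever sees a single /dev/mtd0.
      */
     static int card_register_chips(struct mtd_info **chips, int nchips)
     {
         struct mtd_info *mtd;

         mtd = mtd_raid_create(chips, nchips, MTD_RAID0);
         if (IS_ERR(mtd))
             return PTR_ERR(mtd);

         return mtd_device_register(mtd, NULL, 0);
     }
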
>
>>> One last question? What's the real goal of this MTD-RAID layer? If
>>> that's about addressing the MLC/TLC NAND reliability problems, I don't
>>> think it's such a good idea.
>>>   
>> Oh, that's not the main problem I want to solve. RAID-1 is just a possible
>>   extension base on my RAID framework.
>>
>> This work is started for only RAID0, which is used to take the use of lots
>> of flash to improve performance. Then I refactored it to a MTD RAID
>> framework. Then we can implement other raid level for mtd.
>>
>> Example:
>>      In our production, there are 40+ chips attached on one pcie card.
>> Then we need to simulate all of them into one mtd device. At the same
>> time, we need to consider how to manage these chips. Finally we chose
>> a RAID0 mode for them. And got a great performance result.
>>
>> So, the multiple chips scenario is the original problem I want to solve. And
>> then I found I can refactor it for other RAID-levels.
> So all you need a way to concatenate MTD devices (are we talking
> about NAND devices?)? That shouldn't be to hard to define something
> like an MTD-cluster aggregating several similar MTD devices to provide
> a single MTD. But I'd really advise you to drop the MTD-RAID idea and
> focus on your real/simple need: aggregating MTD devices.

Yes, the original problem is to concatenate the NAND devices, and we
have to use RAID-0 to improve our performance.

Later on, I found that MTD RAID is not a bad idea for solving other problems,
so I tried to refactor it into MTD-RAID.


* Re: MTD RAID
  2016-08-19  9:47         ` Richard Weinberger
@ 2016-08-19 10:30           ` Dongsheng Yang
  2016-08-19 10:30           ` Artem Bityutskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-19 10:30 UTC (permalink / raw)
  To: Richard Weinberger, Boris Brezillon
  Cc: starvik, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, David Woodhouse, dmitry.torokhov,
	dooooongsheng.yang, jschultz, fabf, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, Dongsheng Yang

Hi Richard.

On 08/19/2016 05:47 PM, Richard Weinberger wrote:
> Yang,
>
> Sorry when I ask already answered questions, the mail thread
> grows faster than I can type. ;)
>
> On 19.08.2016 11:08, Dongsheng Yang wrote:
>> But how can we handle the multiple chips problem? Some drivers
>> are combining multiple chips to one single mtd device, what the
>> mtd_concat is doing.
> mtd_concat is a horrid and old hack which was used to combine very
> small flashes.
>
>>> One last question? What's the real goal of this MTD-RAID layer? If
>>> that's about addressing the MLC/TLC NAND reliability problems, I don't
>>> think it's such a good idea.
>> Oh, that's not the main problem I want to solve. RAID-1 is just a possible
>>   extension base on my RAID framework.
>>
>> This work is started for only RAID0, which is used to take the use of lots
>> of flash to improve performance. Then I refactored it to a MTD RAID
>> framework. Then we can implement other raid level for mtd.
>>
>> Example:
>>      In our production, there are 40+ chips attached on one pcie card.
>> Then we need to simulate all of them into one mtd device. At the same
>> time, we need to consider how to manage these chips. Finally we chose
>> a RAID0 mode for them. And got a great performance result.
>>
>> So, the multiple chips scenario is the original problem I want to solve. And
>> then I found I can refactor it for other RAID-levels.
> Are your chips NAND flash?
> Combining non-NAND is not that hard, mtd_concat does already.
> With NAND, as Boris pointed out, it will become complicated.

Yes, it's NAND, but I think this layer will work for all MTD devices.
>
> So, what you propose as MTD RAID is not really RAID.
> RAID is about redundant disks. i.e. you can replace them upon failure.
> The problem you address is combining multiple chips.
> IOW an SSD emulated in software.

Yes, the original idea came from that, but now I wish to implement
real RAID for mtd. Although we are only using RAID0 in
our production, I am glad to implement other RAID levels in the community.

And yes, it sounds not easy for NAND devices. But I will try to figure
out solutions for the problems Boris pointed out.

BTW, what other ideas are there in the community at the moment for
solving the MLC reliability problem?

Yang
>
> I think we don't need replication on MTD. An improved mtd_concat
> does not sound bad but a proper implementation has to deal with
> all the nastiness of NAND.
>
> Thanks,
> //richard


* Re: MTD RAID
  2016-08-19  9:47         ` Richard Weinberger
  2016-08-19 10:30           ` Dongsheng Yang
@ 2016-08-19 10:30           ` Artem Bityutskiy
  2016-08-19 10:38             ` Dongsheng Yang
  1 sibling, 1 reply; 20+ messages in thread
From: Artem Bityutskiy @ 2016-08-19 10:30 UTC (permalink / raw)
  To: Richard Weinberger, Dongsheng Yang, Boris Brezillon
  Cc: starvik, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, David Woodhouse, dmitry.torokhov,
	dooooongsheng.yang, jschultz, fabf, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, Dongsheng Yang

On Fri, 2016-08-19 at 11:47 +0200, Richard Weinberger wrote:
> So, what you propose as MTD RAID is not really RAID.
> RAID is about redundant disks. i.e. you can replace them upon
> failure.
> The problem you address is combining multiple chips.
> IOW an SSD emulated in software.

I think this should either be a very close analog of RAID or the word
"RAID" should not be used at all. Everyone will be confused otherwise.


* Re: MTD RAID
  2016-08-19 10:30           ` Artem Bityutskiy
@ 2016-08-19 10:38             ` Dongsheng Yang
       [not found]               ` <AL*AZwAyAecQSM1UjjjNxao0.3.1471605640762.Hmail.dongsheng.yang@easystack.cn>
  0 siblings, 1 reply; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-19 10:38 UTC (permalink / raw)
  To: dedekind1, Richard Weinberger, Boris Brezillon
  Cc: fabf, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, Dongsheng Yang, dmitry.torokhov,
	dooooongsheng.yang, jschultz, starvik, mtownsend1973, linux-mtd,
	Colin King, asierra, Brian Norris, David Woodhouse


Artem,

On 08/19/2016 06:30 PM, Artem Bityutskiy wrote:
> On Fri, 2016-08-19 at 11:47 +0200, Richard Weinberger wrote:
>> So, what you propose as MTD RAID is not really RAID.
>> RAID is about redundant disks. i.e. you can replace them upon
>> failure.

Yes, we can replace it, and we can also add/delete devices in btrfs under
different RAID levels with:

btrfs device add/delete <device>

That's software RAID, similar to the software RAID for block devices in md.

Yang
>> The problem you address is combining multiple chips.
>> IOW an SSD emulated in software.
> I think this should either be a very close analog of RAID or the word
> "RAID" should not be used at all. Everyone will be confused otherwise.


* Re: MTD RAID
  2016-08-19 10:22           ` Dongsheng Yang
@ 2016-08-19 11:36             ` Boris Brezillon
  0 siblings, 0 replies; 20+ messages in thread
From: Boris Brezillon @ 2016-08-19 11:36 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: Dongsheng Yang, starvik, jesper.nilsson, Dongsheng Yang,
	linux-cris-kernel, shengyong1, Ard Biesheuvel, richard,
	dmitry.torokhov, dooooongsheng.yang, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, Brian Norris,
	David Woodhouse

On Fri, 19 Aug 2016 18:22:25 +0800
Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:

> On 08/19/2016 05:37 PM, Boris Brezillon wrote:
> > On Fri, 19 Aug 2016 17:15:56 +0800
> > Dongsheng Yang <dongsheng081251@gmail.com> wrote:
> >  
> >> Hi Boris,
> >>
> >> On Fri, Aug 19, 2016 at 4:20 PM, Boris Brezillon <
> >> boris.brezillon@free-electrons.com> wrote:
> >>  
> >>> On Fri, 19 Aug 2016 15:08:35 +0800
> >>> Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:
> >>>     
> >>>> On 08/19/2016 02:49 PM, Boris Brezillon wrote:  
> >>>>> Hi Dongsheng,
> >>>>>
> >>>>> On Fri, 19 Aug 2016 14:34:54 +0800
> >>>>> Dongsheng Yang <dongsheng081251@gmail.com> wrote:
> >>>>>     
> >>>>>> Hi guys,
> >>>>>>       This is a email about MTD RAID.
> >>>>>>
> >>>>>> *Code:*
> >>>>>>       kernel:
> >>>>>> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7  
> >>>>> Just had a quick look at the code, and I see at least one major problem
> >>>>> in your RAID-1 implementation: you're ignoring the fact that NAND  
> >>> blocks  
> >>>>> can be or become bad. What's the plan for that?  
> >>>> Hi Boris,
> >>>>       Thanx for your quick reply.
> >>>>
> >>>>       When you are using RAID-1, it would erase the all mirrored blockes
> >>>> when you are erasing.
> >>>> if there is a bad block in them, mtd_raid_erase will return an error and
> >>>> the userspace tool
> >>>> or ubi will mark this block as bad, that means, the
> >>>> mtd_raid_block_markbad() will mark the all
> >>>>    mirrored blocks as bad, although some of it are good.
> >>>>
> >>>> In addition, when you have data in flash with RAID-1, if one block
> >>>> become bad. For example,
> >>>> when the mtd0 and mtd1 are used to build a RAID-1 device mtd2. When you
> >>>> are using mtd2
> >>>> and you found there is a block become bad. Don't worry about data
> >>>> losing, the data is still
> >>>> saved in the good one mirror. you can replace the bad one device with
> >>>> another new mtd device.  
> >>> Okay, good to see you were aware of this problem.
> >>>     
> >>>> My plan about this feature is all on the userspace tool.
> >>>> (1). mtd_raid scan mtd2 <---- this will show the status of RAID device
> >>>> and each member of it.
> >>>> (2). mtd_raid replace mtd2 --old mtd1 --new mtd3.   <---- this will
> >>>> replace the bad one mtd1 with mtd3.
> >>>>
> >>>> What about this idea?  
> >>> Not sure I follow you on #2. And, IMO, you should not depend on a
> >>> userspace tool to detect address this kind of problems.
> >>>
> >>> Okay, a few more questions.
> >>>
> >>> 1/ What about data retention issues? Say you read from the main MTD, and
> >>> it does not show uncorrectable errors, so you keep reading on it, but,
> >>> since you're never reading from the mirror, you can't detect if there
> >>> are some uncorrectable errors or if the number of bitflips exceed the
> >>> threshold used to trigger a data move. If suddenly a page in your main
> >>> MTD becomes unreadable, you're not guaranteed that the mirror page will
> >>> be valid :-/.
> >>>     
> >> Yes, that could happen. But that's a case where main MTD and mirror bacome
> >> bad at the same time. Yes, that's possible, but that's much rare than
> >> pure one MTD going to bad, right?  
> > Absolutely not, that's actually more likely than getting bad blocks. If
> > you're not regularly reading your data they can become bad with no way
> > to recover from it.
> >  
> >> That's what RAID-1 want. If you want
> >> to solve this problem, just increase the number of mirror. Then you can make
> >> your data safer and safer.  
> > Except the number of bitflips is likely to increase over time, so if
> > you never read your mirror blocks because the main MTD is working fine,
> > you may not be able to read data back when you really need it.  
> 
> Sorry, I am afraid I did not get your point. But in general, it's safer to
> have two copies of data than just one copy of it I believe. Could you 
> explain
> more , thanx. :)

It's safer in most cases, but if you don't make sure your mirror is
in a correct state, then it's just giving an illusion of safety that
is not necessarily there.

> >  
> >>> 2/ How do you handle write atomicity in RAID1? I don't know exactly
> >>> how RAID1 works, but I guess there's a mechanism (a journal?) to detect
> >>> that data has been written on the main MTD but not on the mirror, so
> >>> that you can replay the operation after a power-cut. Do handle this
> >>> case correctly?
> >>>     
> >> No, but the redundancy of RAID levels is designed to protect against a
> >> *disk* failure,
> >> not against a *power* failure, that's a responsibility of ubifs. when the
> >> ubifs replay,
> >> the not completed writing will be abandoned.  
> > And again, you're missing one important point. UBI and UBIFS are
> > sitting on your RAID layer. If the mirror MTD is corrupted because of
> > a power-cut, but the main one is working fine, UBI and UBIFS won't
> > notice, until you really need to use the mirror, and it's already too
> > late.  
> Actually there is already an answer about this question in RAID-1:
> 
> https://linas.org/linux/Software-RAID/Software-RAID-4.html
> 
> 
> But, I am glad to figure out what we can do in this case.
> At this moment, I think do a raid check for the all copies of data
> when ubifs is recoverying sounds possible.

Now you're mixing different layers. How would UBIFS/UBI inform the MTD
that it needs to take some security measures?
IMO you're heading toward something that is complex and error-prone (mainly
because of the unreliability of the NANDs).

> 
> >  
> >>> On a general note, I don't think it's wise to place the RAID layer at
> >>> the MTD level. How about placing it at the UBI level (pick 2 ubi
> >>> volumes to create one UBI-RAID element)? This way you don't have to
> >>> bother about bad block handling (you're manipulating logical blocks
> >>> which can be anywhere on the NAND).
> >>>     
> >>
> >> But how can we handle the multiple chips problem? Some drivers
> >> are combining multiple chips to one single mtd device, what the
> >> mtd_concat is doing.  
> > You can either pick 2 UBI volumes from 2 UBI devices (each one attached
> > to a different MTD device).  
> 
> Yes, but, I am afraid we don't want to expose all our chips.
> 
> Please consider this scenario, One pcie card attached chips, we only 
> want user
> to see just one mtd device /dev/mtd0, rather than 40+ mtd devices. So we 
> need to call
> mtd_raid_create() in the driver for this card.

Yes, I was only commenting on the RAID-1 implementation. For RAID-0, all
you need is an improved mtdconcat implementation.

> >  
> >>> One last question? What's the real goal of this MTD-RAID layer? If
> >>> that's about addressing the MLC/TLC NAND reliability problems, I don't
> >>> think it's such a good idea.
> >>>     
> >> Oh, that's not the main problem I want to solve. RAID-1 is just a possible
> >>   extension base on my RAID framework.
> >>
> >> This work is started for only RAID0, which is used to take the use of lots
> >> of flash to improve performance. Then I refactored it to a MTD RAID
> >> framework. Then we can implement other raid level for mtd.
> >>
> >> Example:
> >>      In our production, there are 40+ chips attached on one pcie card.
> >> Then we need to simulate all of them into one mtd device. At the same
> >> time, we need to consider how to manage these chips. Finally we chose
> >> a RAID0 mode for them. And got a great performance result.
> >>
> >> So, the multiple chips scenario is the original problem I want to solve. And
> >> then I found I can refactor it for other RAID-levels.  
> > So all you need a way to concatenate MTD devices (are we talking
> > about NAND devices?)? That shouldn't be to hard to define something
> > like an MTD-cluster aggregating several similar MTD devices to provide
> > a single MTD. But I'd really advise you to drop the MTD-RAID idea and
> > focus on your real/simple need: aggregating MTD devices.  
> 
> Yes, the original problem is to concatenate the NAND devices. And we
> have to use RAID-0 to improve our performance.
> 
> Later on, I found the MTD raid is a not bad idea to solve other problems,
> So I tried to do a refactor for MTD-RAID.

Except it's way more complicated than aggregating several MTD devices
to expose a single entity. So you'd better focus on the mtdconcat
feature instead of trying to implement a RAID layer possibly supporting
all kinds of RAID configs.


* Re: MTD RAID
       [not found]               ` <AL*AZwAyAecQSM1UjjjNxao0.3.1471605640762.Hmail.dongsheng.yang@easystack.cn>
@ 2016-08-19 11:55                 ` Boris Brezillon
  2016-08-22  4:01                   ` Dongsheng Yang
       [not found]                   ` <57BA6FFA.6000601@easystack.cn>
  0 siblings, 2 replies; 20+ messages in thread
From: Boris Brezillon @ 2016-08-19 11:55 UTC (permalink / raw)
  To: 杨东升
  Cc: dedekind1, Richard Weinberger, fabf, jesper.nilsson,
	Dongsheng Yang, linux-cris-kernel, shengyong1, Ard Biesheuvel,
	Dongsheng Yang, dmitry.torokhov, dooooongsheng.yang, jschultz,
	starvik, mtownsend1973, linux-mtd, Colin King, asierra,
	Brian Norris, David Woodhouse

On Fri, 19 Aug 2016 19:20:40 +0800 (GMT+08:00)
杨东升 <dongsheng.yang@easystack.cn> wrote:

> Hi guys, sorry, I think I did not express myself clearly. From this reference:
> 
> 
> https://linas.org/linux/Software-RAID/Software-RAID.txt
> 
> 
> we can see, RAID stands for "Redundant Array of Inexpensive Disks"
> and is meant to be a way of creating a fast and reliable
> disk-drive subsystem out of individual disks. In the PC
> world, "I" has come to stand for "Independent".
> 
> 
> There are two benefits of RAID: "fast" and "reliable".
> So I introduced the RAID framework into the MTD world, and currently
> implement 3 types of RAID.
> 
> 
> (1) single: I reuse this the same way it works in BTRFS.
> It's not a standard RAID level; it just concatenates the devices.
> 
> 
> (2) RAID0: also known as striping mode. This can make the device faster.
> From what I showed in my first email, we can get 51.1 MB/s in dd
> although the original device is only 14.0 MB/s.

Some comments on your results. It's all theoretical (based on nandsim),
and assuming your NAND chips are connected to the same NAND controller,
you would just get the same perf as in 'single' mode (accesses through
the NAND controller are currently serialized; that's something I'm
trying to change, but it's not here yet).

So yes, in an ideal world, sequential accesses would be improved, but
we're not there yet. BTW, did you run this test on real HW?

> 
> 
> (3) RAID1: also known as mirroring mode. This can make the device more reliable.
> Yes, Boris pointed out that there could be some problems if we are using NAND flash,
> but I think these are all solvable. And I don't think this is a problem of
> MTD RAID, but a problem of the special use case of NAND. I am glad to make
> MTD RAID work better on MLC and TLC.

It's not only an MLC/TLC problem, it's just that you're more likely to
see it on MLC/TLC NANDs. The fact that you're not regularly
reading/refreshing some blocks of the mirror MTD is a real problem, and
this leads to the safety illusion I was mentioning in my previous answer.

That's why I think implementing RAID on top of raw MTD devices is a bad
idea.

> 
> 
> In addition, there are some more RAID levels, such as RAID10, RAID4/5/6. All of them are
> useful for "fast" and "reliable".

I'm not saying RAID is useless, I'm just saying it's a pain to
implement on top of MTD devices.

> 
> 
> I hope this mail helps to express my idea here. 
>

I think I got the main idea, and I already explained why I think it's
not a good idea. If you still want to go down this road, then you'll have to
convince me that your implementation is safe (which is not the case
yet).

BTW, think about that: if you use an SSD in a RAID setting, the SSD's
FTL is taking care of the NAND unreliability problems. Here you're just
ignoring these problems and are assuming doing RAID on a NAND is just
as safe as doing RAID on an SSD, which is wrong.
If you want to be in a 'similar' setting, then the RAID array has to
operate on top of the FTL/WL layer (in this specific case, UBI).


* Re: MTD RAID
  2016-08-19 11:55                 ` Boris Brezillon
@ 2016-08-22  4:01                   ` Dongsheng Yang
  2016-08-22  7:09                     ` Boris Brezillon
       [not found]                   ` <57BA6FFA.6000601@easystack.cn>
  1 sibling, 1 reply; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-22  4:01 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: starvik, jesper.nilsson, Dongsheng Yang, shengyong1,
	linux-cris-kernel, dedekind1, Brian Norris, Richard Weinberger,
	Ard Biesheuvel, dooooongsheng.yang, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, dmitry.torokhov,
	Dongsheng Yang, David Woodhouse



On 08/19/2016 07:55 PM, Boris Brezillon wrote:
> On Fri, 19 Aug 2016 19:20:40 +0800 (GMT+08:00)
> 杨东升 <dongsheng.yang@easystack.cn> wrote:
>
>> Hi guys,Sorry I think i did not express myself clearly. From this reference:
>>
>>
>> https://linas.org/linux/Software-RAID/Software-RAID.txt
>>
>>
>> we can see, RAID stands for "Redundant Array of Inexpensive Disks"
>> and is meant to be a way of creating a fast and reliable
>> disk-drive subsystem out of individual disks. In the PC
>> world, "I" has come to stand for "Independent".
>>
>>
>> There are two benifets in RAID, "fast" and "reliable".
>> So I introduce the RAID framework in MTD world. and implement
>> 3 types of RAID currently.
>>
>>
>> (1) single: I reuse this work same with what it is in BTRFS.
>> It's not a standard RAID level. But just concat the devices.
>>
>>
>> (2) RAID0: also known as Striping mode. This can make device faster.
>>  From what I show in my first email, we can see we can get 51.1 MB/s in dd
>> although the original device is only 14.0 MB/s.
> Some comments on your results. It's all theoretical (based on nandsim),
> and assuming your NAND chips are connected to the same NAND controller
> you would just get the same perf as in 'single' mode (accesses through
> the NAND controller are currently serialized, that's something I'm
> trying to change but it's not here yet).
>
> So yes, in an ideal word, sequential accesses would be improved, but
> we're not here yet. BTW, did you run this test on a real HW?

http://www.fujitsu.com/global/about/resources/news/press-releases/2015/1119-01.html
>
>>
>> (3) RAID1: also known as Mirroring mode. This can make device more reliable.
>> Yes, Boris pointed out that there could be some problems if we are using NAND flash.
>> But I think these all are possible to be solved. And I don't think this is the problem of
>> MTD RAID, but the problem of the special use case of NAND. I am glad to make the
>> MTD RAID working better on MLC and TLC.
> It's not only an MLC/TLC problem, it's just that you're more likely to
> see it on MLC/TLC NANDs. The fact that you're not regularly
> reading/refreshing some blocks of the mirror MTD is a real problem, and
> this lead to the safety illusion I was mentioning in my previous answer.
>
> That's why I think implementing RAID on top of raw MTD devices is a bad
> idea.
>
>>
>> In addition, there are some more RAID levels, such as RAID10, RAID4/5/6. All of them are
>> useful for "fast" and "reliable".
> I'm not saying RAID is useless, I'm just saying it's a pain to
> implement on top of MTD devices.
>
>>
>> I hope this mail helps to express my idea here.
>>
> I think I got the main idea, and I already explained why I think it's
> not a good idea. If you still want to go this road then you'll have to
> convince me that your implementation is safe (which is not the case
> yet).
>
> BTW, think about that: if you use an SSD in a RAID setting, the SSD's
> FTL is taking care of the NAND unreliability problems. Here you're just
> ignoring these problems and are assuming doing RAID on a NAND is just
> as safe as doing RAID on an SSD, which is wrong.
> If you want to be in a 'similar' setting, then the RAID array has to
> operate on top of the FTL/WL layer (in this specific case, UBI).


* Re: MTD RAID
       [not found]                   ` <57BA6FFA.6000601@easystack.cn>
@ 2016-08-22  7:07                     ` Boris Brezillon
  2016-08-22  7:27                     ` Artem Bityutskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Boris Brezillon @ 2016-08-22  7:07 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: starvik, jesper.nilsson, Dongsheng Yang, shengyong1,
	linux-cris-kernel, dedekind1, Brian Norris, Richard Weinberger,
	Ard Biesheuvel, dooooongsheng.yang, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, dmitry.torokhov,
	Dongsheng Yang, David Woodhouse

On Mon, 22 Aug 2016 11:22:34 +0800
Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:

> On 08/19/2016 07:55 PM, Boris Brezillon wrote:
> > On Fri, 19 Aug 2016 19:20:40 +0800 (GMT+08:00)
> > 杨东升 <dongsheng.yang@easystack.cn> wrote:
> >  
> >> Hi guys,Sorry I think i did not express myself clearly. From this reference:
> >>
> >>
> >> https://linas.org/linux/Software-RAID/Software-RAID.txt
> >>
> >>
> >> we can see, RAID stands for "Redundant Array of Inexpensive Disks"
> >> and is meant to be a way of creating a fast and reliable
> >> disk-drive subsystem out of individual disks. In the PC
> >> world, "I" has come to stand for "Independent".
> >>
> >>
> >> There are two benifets in RAID, "fast" and "reliable".
> >> So I introduce the RAID framework in MTD world. and implement
> >> 3 types of RAID currently.
> >>
> >>
> >> (1) single: I reuse this work same with what it is in BTRFS.
> >> It's not a standard RAID level. But just concat the devices.
> >>
> >>
> >> (2) RAID0: also known as Striping mode. This can make device faster.
> >>  From what I show in my first email, we can see we can get 51.1 MB/s in dd
> >> although the original device is only 14.0 MB/s.  
> > Some comments on your results. It's all theoretical (based on nandsim),
> > and assuming your NAND chips are connected to the same NAND controller
> > you would just get the same perf as in 'single' mode (accesses through
> > the NAND controller are currently serialized, that's something I'm
> > trying to change but it's not here yet).
> >
> > So yes, in an ideal world, sequential accesses would be improved, but
> > we're not there yet. BTW, did you run this test on real HW?  
> 
Of course, we got great performance in our production systems. And yes,
we have 16 controller channels with 256 NAND chips.
> >  
> >>
> >> (3) RAID1: also known as Mirroring mode. This can make the device more reliable.
> >> Yes, Boris pointed out that there could be some problems if we are using NAND flash.
> >> But I think all of these can be solved. And I don't think this is a problem of
> >> MTD RAID, but a problem of this particular use case of NAND. I am glad to make
> >> MTD RAID work better on MLC and TLC.  
> > It's not only an MLC/TLC problem, it's just that you're more likely to
> > see it on MLC/TLC NANDs. The fact that you're not regularly
> > reading/refreshing some blocks of the mirror MTD is a real problem, and
> > this leads to the safety illusion I was mentioning in my previous answer.
> >
> > That's why I think implementing RAID on top of raw MTD devices is a bad
> > idea.  
> 
> Let me copy the topics from other thread here.
> 
 >> Sorry, I am afraid I did not get your point. But in general, it's
 >> safer to have two copies of data than just one copy, I believe.
 >> Could you explain more, thanx. :)
> 
 > It's safer in most cases, but if you don't make sure your mirror is
 > in a correct state, then it's just giving an illusion of safety that
 > is not necessarily there.
> 
> Actually, I would say MTD RAID works at a higher level than what you
> are worrying about. RAID-1 does not care about what problem happened in
> the MTD device. What it wants to do is just make the data safer: it
> saves more copies of the data on different MTD devices.
>
> IOW, it is totally different from ideas such as "paired pages" to solve
> the MLC problem. It protects data from device failure, but doesn't care
> what the failure is, even if one MTD device is destroyed by a bullet.
>
> So, it's really not a replacement for "paired pages" or other solutions
> for MLC reliability. It works on top of them. I think we should first
> get an agreement about what MTD RAID should/can do.

Except that the code supposed to deal with MLC constraints is placed in
UBI, so, as I said, you're putting your RAID layer on top of something
that is not and never will be reliable.

> 
> >  
> >>
> >> In addition, there are more RAID levels, such as RAID10 and RAID4/5/6. All of them
> >> are useful for "fast" and "reliable".  
> > I'm not saying RAID is useless, I'm just saying it's a pain to
> > implement on top of MTD devices.
> >  
> >>
> >> I hope this mail helps to express my idea here.
> >>  
> > I think I got the main idea, and I already explained why I think it's
> > not a good idea. If you still want to go this road then you'll have to
> > convince me that your implementation is safe (which is not the case
> > yet).
> >
> > BTW, think about this: if you use an SSD in a RAID setting, the SSD's
> > FTL is taking care of the NAND unreliability problems. Here you're just
> > ignoring these problems and are assuming doing RAID on a NAND is just
> > as safe as doing RAID on an SSD, which is wrong.
> > If you want to be in a 'similar' setting, then the RAID array has to
> > operate on top of the FTL/WL layer (in this specific case, UBI).  
> 
> As I explained above, MTD RAID is not just a solution to the
> reliability problems of MLC/TLC.
>
> Yes, the SSD solved the reliability problems of flash. But please
> think about why we still have software RAID in md. Because RAID is not
> used to solve such problems: we don't care about the exact problem of
> each device; each device can improve itself as it wants. We are working
> at a higher level, protecting data even if your device is destroyed by
> a hammer.

Exactly what I'm saying: RAID is not robust against NAND reliability
issues, so you should not try to put a SW-RAID layer on top of raw
NANDs, otherwise the RAID layer is likely to be impacted by those
problems, and UBI won't be able to do its job correctly (see my
comments on how the RAID layer will hide the mirror device).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MTD RAID
  2016-08-22  4:01                   ` Dongsheng Yang
@ 2016-08-22  7:09                     ` Boris Brezillon
  0 siblings, 0 replies; 20+ messages in thread
From: Boris Brezillon @ 2016-08-22  7:09 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: starvik, jesper.nilsson, Dongsheng Yang, shengyong1,
	linux-cris-kernel, dedekind1, Brian Norris, Richard Weinberger,
	Ard Biesheuvel, dooooongsheng.yang, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, dmitry.torokhov,
	Dongsheng Yang, David Woodhouse

On Mon, 22 Aug 2016 12:01:40 +0800
Dongsheng Yang <dongsheng.yang@easystack.cn> wrote:

> On 08/19/2016 07:55 PM, Boris Brezillon wrote:
> > On Fri, 19 Aug 2016 19:20:40 +0800 (GMT+08:00)
> > 杨东升 <dongsheng.yang@easystack.cn> wrote:
> >  
> >> Hi guys, sorry, I think I did not express myself clearly. From this reference:
> >>
> >>
> >> https://linas.org/linux/Software-RAID/Software-RAID.txt
> >>
> >>
> >> we can see, RAID stands for "Redundant Array of Inexpensive Disks"
> >> and is meant to be a way of creating a fast and reliable
> >> disk-drive subsystem out of individual disks. In the PC
> >> world, "I" has come to stand for "Independent".
> >>
> >>
> >> There are two benefits in RAID, "fast" and "reliable".
> >> So I introduce the RAID framework into the MTD world, and currently
> >> implement 3 types of RAID.
> >>
> >>
> >> (1) single: I reuse this the same way it works in BTRFS.
> >> It's not a standard RAID level, but just concatenates the devices.
> >>
> >>
> >> (2) RAID0: also known as Striping mode. This can make the device faster.
> >> From what I showed in my first email, we can get 51.1 MB/s in dd
> >> although the original device is only 14.0 MB/s.
> > Some comments on your results. It's all theoretical (based on nandsim),
> > and assuming your NAND chips are connected to the same NAND controller
> > you would just get the same perf as in 'single' mode (accesses through
> > the NAND controller are currently serialized, that's something I'm
> > trying to change but it's not here yet).
> >
> > So yes, in an ideal world, sequential accesses would be improved, but
> > we're not there yet. BTW, did you run this test on real HW?  
> 
> http://www.fujitsu.com/global/about/resources/news/press-releases/2015/1119-01.html

Is this solution really using a mainline kernel?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MTD RAID
       [not found]                   ` <57BA6FFA.6000601@easystack.cn>
  2016-08-22  7:07                     ` Boris Brezillon
@ 2016-08-22  7:27                     ` Artem Bityutskiy
       [not found]                       ` <CAAp9bSh-geStbHpA6+vYdLfNLcinWkfVLOGmX4kdRbja+d2MdA@mail.gmail.com>
  1 sibling, 1 reply; 20+ messages in thread
From: Artem Bityutskiy @ 2016-08-22  7:27 UTC (permalink / raw)
  To: Dongsheng Yang, Boris Brezillon
  Cc: starvik, jesper.nilsson, Dongsheng Yang, shengyong1,
	linux-cris-kernel, Brian Norris, Richard Weinberger,
	Ard Biesheuvel, dooooongsheng.yang, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, dmitry.torokhov,
	Dongsheng Yang, David Woodhouse

On Mon, 2016-08-22 at 11:22 +0800, Dongsheng Yang wrote:
> As I explained above, MTD RAID is not just a solution to the
> reliability problems of MLC/TLC.

Could you please answer these questions.

1. Does MTD RAID work on MLC or is it SLC-only?

2. If I am building RAID-0, I have 2 flash chips, one has every even
block bad, the other has every odd block bad. What happens?

3. Same question, but for RAID-1.

4. Suppose I have RAID-1 like in this picture:

https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_1.svg

Just assume we have flash chips, not disks, and eraseblocks, not
sectors.

Suppose eraseblock A1 goes bad. What happens next?


5. Suppose I have RAID-0 like here:

https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_0.svg

Again, we are talking about chips and eraseblocks here.

What happens if A1 goes bad?

Thanks!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MTD RAID
       [not found]                       ` <CAAp9bSh-geStbHpA6+vYdLfNLcinWkfVLOGmX4kdRbja+d2MdA@mail.gmail.com>
@ 2016-08-22 14:54                         ` Artem Bityutskiy
  2016-08-22 15:30                           ` David Woodhouse
  0 siblings, 1 reply; 20+ messages in thread
From: Artem Bityutskiy @ 2016-08-22 14:54 UTC (permalink / raw)
  To: Dongsheng Yang
  Cc: Dongsheng Yang, Boris Brezillon, starvik, jesper.nilsson,
	Dongsheng Yang, shengyong1, linux-cris-kernel, Brian Norris,
	Richard Weinberger, Ard Biesheuvel, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, dmitry.torokhov,
	Dongsheng Yang, David Woodhouse

On Mon, 2016-08-22 at 18:55 +0800, Dongsheng Yang wrote:
> 
> 
> On Mon, Aug 22, 2016 at 3:27 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> > On Mon, 2016-08-22 at 11:22 +0800, Dongsheng Yang wrote:
> > > As I explained above, MTD RAID is not just a solution to the
> > > reliability problems of MLC/TLC.
> > 
> > Could you please answer these questions.
> > 
> > 1. Does MTD RAID work on MLC or is it SLC-only?
> 
> Good question. It is not SLC-only: it is based on the MTD layer, so in
> theory it should work with any MTD device, although we are using NAND
> flash in our production.
> > 
> > 2. If I am building RAID-0, I have 2 flash chips, one has every even
> > block bad, the other has every odd block bad. What happens?
> 
> All blocks would be marked as bad. Because we combine the striped
> blocks into one larger block, if any one of them is bad, we mark the
> whole virtual large block as bad. Take the RAID0 example from my first
> email: I build a RAID0 device from 4 devices with a block size of 16K,
> so the block size of the new RAID0 device is 64K. I combine these 4
> devices and stripe across the new/larger block for better performance.
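
For illustration, the striping described above boils down to a simple
mapping from an offset in the virtual device to a (member, offset) pair.
Here is a minimal sketch of that mapping for the 4 x 16K geometry; the
names are made up for illustration and do not come from the patchset:

#include <stdint.h>

struct mtd_raid0 {
	int nmembers;		/* e.g. 4 member devices */
	uint32_t blocksize;	/* per-member eraseblock size, e.g. 16K */
};

/* Map a virtual-device offset to (member index, offset on that member). */
static void raid0_map(const struct mtd_raid0 *r, uint64_t vofs,
		      int *member, uint64_t *mofs)
{
	uint64_t vblksize = (uint64_t)r->blocksize * r->nmembers; /* 64K */
	uint64_t vblock = vofs / vblksize;	/* which virtual block */
	uint64_t in_vblk = vofs % vblksize;	/* offset inside it */

	*member = in_vblk / r->blocksize;	/* one 16K slice per member */
	*mofs = vblock * r->blocksize + in_vblk % r->blocksize;
}

With this layout, a 64K virtual block dies as soon as any of its four
16K slices is bad, which is exactly the behaviour described above.
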
> > 
> > 3. Same question, but for RAID-1.
> 
> Same result, but for a different reason. In RAID1 we don't need to
> enlarge the block size, but we must keep the blocks mirrored. If a
> block on either the master or the mirror is bad, then the corresponding
> block of the RAID1 device should be marked as bad, because we can't
> guarantee the requested number of copies of the data in this case.
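
In other words, the bad-block state of a RAID1 block is the logical OR
of its copies. A minimal sketch of that rule, again with hypothetical
helper names rather than the actual patchset code:

#include <stdbool.h>
#include <stdint.h>

struct mtd_member;
/* assumed helpers operating on one member device */
bool member_block_isbad(struct mtd_member *m, uint64_t ofs);
int  member_block_markbad(struct mtd_member *m, uint64_t ofs);

/* A RAID1 block is bad as soon as any copy of it is bad. */
static bool raid1_block_isbad(struct mtd_member **mirrors, int n, uint64_t ofs)
{
	for (int i = 0; i < n; i++)
		if (member_block_isbad(mirrors[i], ofs))
			return true;
	return false;
}

/* Marking the RAID1 block bad marks every copy, even the good ones. */
static void raid1_block_markbad(struct mtd_member **mirrors, int n, uint64_t ofs)
{
	for (int i = 0; i < n; i++)
		member_block_markbad(mirrors[i], ofs);
}
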
> > 
> > 4. Suppose I have RAID-1 like in this picture:
> > 
> > https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_1.svg
> > 
> > Just assume we have flash chips, not disks, and eraseblocks, not
> > sectors.
> > 
> > Suppose eraseblock A1 goes bad. What happens next?
> 
> If A1 in disk0 goes bad, the data will be read out from disk1, but
> there would be a warning about this problem. Then we would notice that
> our data in this section is not safe enough now. From there, there are
> different scenarios.

> (1) You are using UBI on this device. As I mentioned last week, we
> can enhance UBI to notice this kind of problem and then do a data
> migration.
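
The degraded read described here amounts to "try each copy in turn and
warn when the first one fails". A sketch under the same assumptions
(member_read() is a made-up helper, not a real MTD API):

#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct mtd_member;
/* assumed helper: returns 0 on success, negative on failure */
int member_read(struct mtd_member *m, uint64_t ofs, size_t len, uint8_t *buf);

static int raid1_read(struct mtd_member **mirrors, int n, uint64_t ofs,
		      size_t len, uint8_t *buf)
{
	for (int i = 0; i < n; i++) {
		if (member_read(mirrors[i], ofs, len, buf) == 0) {
			if (i > 0)	/* an earlier copy failed: warn */
				fprintf(stderr, "mtd_raid: 0x%llx served from copy %d\n",
					(unsigned long long)ofs, i);
			return 0;
		}
	}
	return -EIO;	/* every copy failed */
}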

So to make my data become mirrored again, MTD RAID needs help from
upper layers.

In the case of disk RAID, you do not need this kind of help. All you
need to do is change the disk when it starts getting bad sectors.

I mean, this is a bit like: here is our MTD RAID which is not a RAID
unless it works with special SW like UBI on top of it.

Are you sure you want to call this MTD RAID?

If this were on top of UBI, the RAID would probably be a closer match,
because then you could assume you are on top of a "reliable" medium.

Note, I am not insisting, just asking.

> (2) You are using some other application on this device. Then you can
> check the status of this MTD RAID device with "mtd_raid scan <dev>";
> if this command reports that some blocks are bad, you can do a
> "mtd_raid replace <>" to replace the bad one.
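
Presumably that replace would walk the surviving copy and re-mirror
every good block onto the new member, roughly like the sketch below
(hypothetical helpers in the same style as above; the real tool may
well behave differently):

#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mtd_member;
/* assumed helpers, same conventions as in the earlier sketches */
bool member_block_isbad(struct mtd_member *m, uint64_t ofs);
int  member_read(struct mtd_member *m, uint64_t ofs, size_t len, uint8_t *buf);
int  member_erase(struct mtd_member *m, uint64_t ofs, uint32_t len);
int  member_write(struct mtd_member *m, uint64_t ofs, size_t len,
		  const uint8_t *buf);

static int raid1_rebuild(struct mtd_member *good, struct mtd_member *fresh,
			 uint64_t nblocks, uint32_t blocksize, uint8_t *buf)
{
	for (uint64_t b = 0; b < nblocks; b++) {
		uint64_t ofs = b * blocksize;

		if (member_block_isbad(good, ofs))
			continue;	/* nothing valid to copy here */
		if (member_read(good, ofs, blocksize, buf))
			return -EIO;	/* the only surviving copy failed */
		if (member_erase(fresh, ofs, blocksize) ||
		    member_write(fresh, ofs, blocksize, buf))
			return -EIO;	/* new member is bad here too */
	}
	return 0;
}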

How will the replacement work? What gets replaced - the entire flash
chip? Probably not, because bad blocks are natural for raw NAND. Then
how do you replace the blocks if there is an FS on top?

Artem.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MTD RAID
  2016-08-22 14:54                         ` Artem Bityutskiy
@ 2016-08-22 15:30                           ` David Woodhouse
  0 siblings, 0 replies; 20+ messages in thread
From: David Woodhouse @ 2016-08-22 15:30 UTC (permalink / raw)
  To: dedekind1, Dongsheng Yang
  Cc: Dongsheng Yang, Boris Brezillon, starvik, jesper.nilsson,
	Dongsheng Yang, shengyong1, linux-cris-kernel, Brian Norris,
	Richard Weinberger, Ard Biesheuvel, jschultz, fabf,
	mtownsend1973, linux-mtd, Colin King, asierra, dmitry.torokhov,
	Dongsheng Yang

On Mon, 2016-08-22 at 17:54 +0300, Artem Bityutskiy wrote:
> 
> If this were on top of UBI, the RAID would probably be a closer match,
> because then you could assume you are on top of a "reliable" medium.

I think implementing RAID within UBI makes a lot of sense. Virtual
eraseblocks could then be duplicated across different physical regions
(like RAID1) without having to worry about being in precisely the same
locations on each underlying device.

Mirroring and even RAID5-like configurations could also be handled. And
the user just gets a simple UBI volume that they can use as normal.
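
As a rough sketch of what that could look like (every name below is
hypothetical; this is not the existing UBI API):

#include <errno.h>

struct ubi_dev;		/* hypothetical UBI handle */
/* assumed primitive: write len bytes at offset into a physical eraseblock */
int ubi_peb_write(struct ubi_dev *ubi, int pebnum, int offset,
		  const void *buf, int len);

/* One logical eraseblock backed by two independent physical copies,
 * which may live anywhere on the flash, or even on different chips. */
struct mirrored_leb {
	int peb[2];
};

static int mleb_write(struct ubi_dev *ubi, struct mirrored_leb *l,
		      int offset, const void *buf, int len)
{
	int ok = 0;

	for (int i = 0; i < 2; i++) {
		/* if one copy fails, UBI can transparently remap it to a
		 * spare PEB later; raw MTD RAID has no such option */
		if (ubi_peb_write(ubi, l->peb[i], offset, buf, len) == 0)
			ok++;
	}
	return ok == 2 ? 0 : (ok ? -EAGAIN : -EIO);
}

A degraded write (-EAGAIN above) could then be repaired by re-mirroring
onto a fresh PEB, and the user would still just see a plain UBI volume.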

-- 
dwmw2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MTD RAID
  2016-08-19  6:49 ` MTD RAID Boris Brezillon
  2016-08-19  7:08   ` Dongsheng Yang
@ 2016-08-23  3:44   ` Dongsheng Yang
  1 sibling, 0 replies; 20+ messages in thread
From: Dongsheng Yang @ 2016-08-23  3:44 UTC (permalink / raw)
  To: Boris Brezillon, Dongsheng Yang, richard, linux-mtd, David Woodhouse
  Cc: fabf, jesper.nilsson, Dongsheng Yang, linux-cris-kernel,
	shengyong1, Ard Biesheuvel, dmitry.torokhov, dooooongsheng.yang,
	jschultz, starvik, mtownsend1973, Colin King, asierra,
	Brian Norris

Hi David, Boris, Richard, Artem,

     Thanx for your suggestions. After a second thought, I will consider
putting the RAID layer on top of UBI; that sounds better, although we
have some specific problems if we don't build the RAID device at the
MTD layer. I will send another email about this topic later once I have
decided whether to move the RAID layer onto UBI or not.

Thanx a lot
Yang

On 08/19/2016 02:49 PM, Boris Brezillon wrote:
> Hi Dongsheng,
>
> On Fri, 19 Aug 2016 14:34:54 +0800
> Dongsheng Yang <dongsheng081251@gmail.com> wrote:
>
>> Hi guys,
>>      This is an email about MTD RAID.
>>
>> *Code:*
>>      kernel:
>> https://github.com/yangdongsheng/linux/tree/mtd_raid_v2-for-4.7
> Just had a quick look at the code, and I see at least one major problem
> in your RAID-1 implementation: you're ignoring the fact that NAND blocks
> can be or become bad. What's the plan for that?
>
> Regards,
>
> Boris
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-08-23  3:44 UTC | newest]

Thread overview: 20+ messages
     [not found] <CA+qeAOpuZ0CXZP8tCWdhoVvTEKAw26gtz63-UJmQ4XLSXAd=Yg@mail.gmail.com>
2016-08-19  6:49 ` MTD RAID Boris Brezillon
2016-08-19  7:08   ` Dongsheng Yang
2016-08-19  7:15     ` Dongsheng Yang
2016-08-19  7:28       ` Dongsheng Yang
2016-08-19  8:20     ` Boris Brezillon
     [not found]       ` <CA+qeAOrSAi9uTHGCi-5cAJpM_O45oJUihNP-rHHa1FWL7_ZKHQ@mail.gmail.com>
2016-08-19  9:37         ` Boris Brezillon
2016-08-19 10:22           ` Dongsheng Yang
2016-08-19 11:36             ` Boris Brezillon
     [not found]       ` <57B6CC7B.1060208@easystack.cn>
2016-08-19  9:47         ` Richard Weinberger
2016-08-19 10:30           ` Dongsheng Yang
2016-08-19 10:30           ` Artem Bityutskiy
2016-08-19 10:38             ` Dongsheng Yang
     [not found]               ` <AL*AZwAyAecQSM1UjjjNxao0.3.1471605640762.Hmail.dongsheng.yang@easystack.cn>
2016-08-19 11:55                 ` Boris Brezillon
2016-08-22  4:01                   ` Dongsheng Yang
2016-08-22  7:09                     ` Boris Brezillon
     [not found]                   ` <57BA6FFA.6000601@easystack.cn>
2016-08-22  7:07                     ` Boris Brezillon
2016-08-22  7:27                     ` Artem Bityutskiy
     [not found]                       ` <CAAp9bSh-geStbHpA6+vYdLfNLcinWkfVLOGmX4kdRbja+d2MdA@mail.gmail.com>
2016-08-22 14:54                         ` Artem Bityutskiy
2016-08-22 15:30                           ` David Woodhouse
2016-08-23  3:44   ` Dongsheng Yang
