* Fwd: Fw: some questions about uploading a Linux kernel driver
From: Xiaosong Ma @ 2020-04-22 12:26 UTC
  To: song, linux-raid; +Cc: ty-jiang18, Guangyan Zhang, wei-jy19

Dear Song,

This is Xiaosong Ma from Qatar Computing Research Institute. I am
writing to follow up on the questions posed by a co-author from
Tsinghua U regarding upstreaming our alternative md implementation,
which is designed to significantly reduce SSD RAID latency (both median
and tail) for large SSD pools (such as 20 disks or more).

We have read the Linux kernel upstreaming instructions and believe that
our implementation is cleanly separable from the current code
base (as a plug-and-play module with interfaces identical to md's).
Meanwhile, we wonder whether there are standard test cases or
preferred applications that we should test our system with before
cleaning up the code. Your guidance would be much appreciated.

Best regards,
Xiaosong

Dr. Xiaosong Ma
Principal Scientist
Distributed Systems

Qatar Computing Research Institute
Hamad Bin Khalifa University
HBKU – Research Complex
P.O. Box 5825
Doha, Qatar
Tel: +974 4454 6190
www.qcri.qa
<http://www.qcri.qa>



---------- Forwarded message ---------
From: 姜天洋 <ty-jiang18@mails.tsinghua.edu.cn>
Date: Tue, Apr 14, 2020 at 2:10 PM
Subject: Fw: some questions about uploading a Linux kernel driver
To: Dr. Xiaosong Ma <xma@hbku.edu.qa>, gyzh@tsinghua.edu.cn
<gyzh@tsinghua.edu.cn>, wei-jy19@mails.tsinghua.edu.cn
<wei-jy19@mails.tsinghua.edu.cn>





-----Original Message-----
From: "姜天洋" <ty-jiang18@mails.tsinghua.edu.cn>
Sent: 2020-04-08 20:34:44 (Wednesday)
To: song@kernel.org
Cc: linux-raid@vger.kernel.org
Subject: some questions about uploading a Linux kernel driver

Hello,
I am Tianyang JIANG, a PhD student from Tsinghua U. We have finished a study
that focuses on achieving consistently low latency for SSD arrays,
especially for tail latency at the RAID level. We have implemented a Linux
kernel driver called FusionRAID and are interested in upstreaming the
code to the Linux kernel.
I understand that I should separate my changes and style-check my code
before submitting. Are there any other issues I need to be aware of?
Thank you for your time.


* Re: some questions about uploading a Linux kernel driver FusionRAID
From: Paul Menzel @ 2020-04-24  8:24 UTC
  To: Xiaosong Ma, song, linux-raid; +Cc: ty-jiang18, Guangyan Zhang, wei-jy19, LKML

Dear Xiaosong, dear Tsinghua,


On 22.04.20 at 14:26, Xiaosong Ma wrote:

> This is Xiaosong Ma from Qatar Computing Research Institute. I am
> writing to follow up with the questions posed by a co-author from
> Tsinghua U, regarding upstreaming our alternative md implementation
> that is designed to significantly reduce SSD RAID latency (both median
> and tail) for large SSD pools (such as 20-disk or more).

Sorry for the late reply, and thank you for wanting to upstream the driver.

> We read the Linux kernel upstreaming instructions, and believe that
> our implementation has excellent separability from the current code
> base (as a plug-and-play module with identical interfaces as md).

Is there a chance to integrate it into the current driver, so that it
can be selected when creating the RAID?

> Meanwhile, we wonder whether there are standard test cases or
> preferred applications that we should test our system with, before
> doing code cleaning up. Your guidance is much appreciated.

[…]
> I am Tianyang JIANG, a PhD student from Tsinghua U. We finish a study
> which focuses on achieving consistent low latency for SSD arrays,
> especially timing tail latency in RAID level. We implement a Linux
> kernel driver called FusionRAID and we are interested in uploading
> codes to Linux upstream.
> I notice that I should separate my changes and style-check my codes
> before submitting. Are there any other issues I need to be aware of?
> Thank you for your time.

Is your code in some public git branch to be looked at already?

Otherwise, I believe just posting the patch series with `git send-email`
and a cover letter might be the best first step, so the developers can
comment early before you put too much time into refactoring.

Some easy-to-reproduce test scripts to verify the performance benefits
would indeed be nice, but I do not know whether they can be integrated into
some Linux kernel test infrastructure already.


Kind regards,

Paul


* Re: Fw: some questions about uploading a Linux kernel driver
From: Song Liu @ 2020-04-30  7:10 UTC
  To: Xiaosong Ma
  Cc: linux-raid, 姜天洋, Guangyan Zhang, wei-jy19

Hi Xiaosong,

On Wed, Apr 22, 2020 at 5:26 AM Xiaosong Ma <xma@qf.org.qa> wrote:
>
> Dear Song,
>
> This is Xiaosong Ma from Qatar Computing Research Institute. I am
> writing to follow up with the questions posed by a co-author from
> Tsinghua U, regarding upstreaming our alternative md implementation
> that is designed to significantly reduce SSD RAID latency (both median
> and tail) for large SSD pools (such as 20-disk or more).
>
> We read the Linux kernel upstreaming instructions, and believe that
> our implementation has excellent separability from the current code
> base (as a plug-and-play module with identical interfaces as md).

Plug-and-play is not the key criterion for upstreaming a new module. There
are other key questions to consider:

1. Why do we need it? (better performance is a good reason here).
2. What's the impact on existing users?
3. Can we improve existing code to achieve the same benefit?

> Meanwhile, we wonder whether there are standard test cases or
> preferred applications that we should test our system with, before
> doing code cleaning up. Your guidance is much appreciated.

For testing, "mdadm test" is a good starting point (if it works here).
We also need data integrity tests and stress tests.

Thanks,
Song
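
As a rough illustration of what a data integrity test of the kind Song mentions might look like, here is a minimal Python sketch. It is not part of the mdadm test suite and not the authors' method. It writes self-checksummed blocks to a target, reads them back, and reports any block that does not match. The block size, the function names, and the use of an ordinary temporary file as a stand-in for an md device are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, not taken from the thread


def make_block(index, seed=0):
    """Build a deterministic block: a 32-byte SHA-256 digest, then the payload."""
    payload_len = BLOCK_SIZE - 32
    # Derive a repeatable payload from the block index and seed.
    unit = hashlib.sha256(f"{seed}:{index}".encode()).digest()
    payload = (unit * (payload_len // len(unit) + 1))[:payload_len]
    return hashlib.sha256(payload).digest() + payload


def write_blocks(path, count, seed=0):
    """Write `count` patterned blocks to `path` and flush them to storage."""
    with open(path, "r+b") as f:
        for i in range(count):
            f.seek(i * BLOCK_SIZE)
            f.write(make_block(i, seed))
        f.flush()
        os.fsync(f.fileno())


def verify_blocks(path, count, seed=0):
    """Return the indices of blocks whose contents differ from what was written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(count):
            f.seek(i * BLOCK_SIZE)
            if f.read(BLOCK_SIZE) != make_block(i, seed):
                bad.append(i)
    return bad


if __name__ == "__main__":
    # A temporary file stands in for the array under test (hypothetical target).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        target = tmp.name
    try:
        write_blocks(target, 64)
        print("corrupt blocks:", verify_blocks(target, 64))
    finally:
        os.unlink(target)
```

A real test would point the same write/verify pair at the array's block device, typically while failing or replacing a member disk between the two phases, and would run alongside a separate stress workload.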


* Re: Fw: some questions about uploading a Linux kernel driver
From: Xiaosong Ma @ 2020-05-09 10:25 UTC
  To: Song Liu
  Cc: linux-raid, 姜天洋, Guangyan Zhang, wei-jy19, Paul Menzel

Dear Song, Paul, and other Linux-RAID kernel management members,

Thank you so much for your detailed reply, and we apologize for our
delayed response. We have faced uncertainty about school reopening in China:
to this day the Tsinghua campus is not open, which makes full testing
and debugging hard (they require physical access to our testbed).
Meanwhile, we can make plans to clean up the code and set up the mdadm
test as Song suggested.

Yes, the major advantage our module offers is performance, targeting
the larger all-flash arrays that are increasingly common today. Its
improvement comes mainly from a shorter, simpler write path, real-time
lightweight detection of SSD performance spikes, and RAID declustering.
As a result, it improves median write latency by up to several times,
and tail latency by nearly 400 times, compared to existing RAID5 with
md, on multiple storage traces and on YCSB running on RocksDB. The
declustering part of our work is described in our FAST 2018 paper:
https://www.usenix.org/node/210543.

As to the other questions:
(2)  What's the impact on existing users? and (3) Can we improve
existing code to achieve the same benefit?
Their answers are related. As our module targets larger RAID
pools, such as SSD enclosures of 20 or more drives, modifying the existing
md would not deliver benefits to users of small arrays (for sizes like 7+1
RAID5). The code base is close to 5000 lines of C, and we believe it
would work better as an alternative module that can be used by larger
arrays. Its internal workings are entirely transparent, with no new
user interfaces.

If the university opens by the end of May, we will target mid-to-late
June to finish basic testing and cleanup, and then release our code
for your review via a private GitHub repo. Is that acceptable?

Best regards,

Xiaosong, Tianyang, Guangyan, and Junyu



