All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>,
	linux-kernel@vger.kernel.org, Jens Axboe <jens.axboe@oracle.com>,
	Greg KH <gregkh@suse.de>, Neil Brown <neilb@suse.de>,
	Sam Ravnborg <sam@ravnborg.org>, Dave Jones <davej@redhat.com>,
	Nikanth Karthikesan <knikanth@suse.de>,
	Lars Marowsky-Bree <lmb@suse.de>,
	"Nicholas A. Bellinger" <nab@linux-iscsi.org>,
	Kyle Moffett <kyle@moffetthome.net>,
	Bart Van Assche <bart.vanassche@gmail.com>
Subject: Re: [PATCH 04/16] DRBD: bitmap
Date: Sat, 2 May 2009 19:28:38 +0200	[thread overview]
Message-ID: <20090502172838.GC6466@racke> (raw)
In-Reply-To: <1241278918.3639.46.camel@mulgrave.int.hansenpartnership.com>

On Sat, May 02, 2009 at 10:41:58AM -0500, James Bottomley wrote:
> On Thu, 2009-04-30 at 13:26 +0200, Philipp Reisner wrote:
> > DRBD maintains a dirty bitmap in case it has to run without peer node or
> > without local disk. Writes to the on disk dirty bitmap are minimized by the
> > activity log (=AL). Each time an extent is evicted from the AL the part of
> > the bitmap no longer covered by the AL is written to disk.
> > 
> > Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> > Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
> 
> The way the bitmap and activity log work are very similar to the way the
> md bitmap works (and are implemented for almost exactly the same
> reason).  Is there any way we could combine them?

in principle yes.
the DRBD bitmap has a granularity of 4 kB per bit,
and the "activity log" covers 4 MB per what we call "al extent".

though there is a very important difference.

in MD, when the bitmap is in use, I think the approach is:

  for each write queued to the lower level devices,
     dirty bits in memory
     for every newly dirtied bitmap page,
	flush bitmap pages to disk
	wait for these bitmap writes to complete
  then unplug the lowe level devices

  in background: periodically try to clean some pages,
	and write them to disk

the DRBD approach is:
  if target "al extent" of this write request
  is NOT in the in-memory "lru_cache" already,
	get it into the cache,
		if that means we have to kick an
		old element from the cache, and
		the associated bitmap is dirty
			write that part of the bitmap
        write an "al transaction" (synchonous single sector write)
  else
  	FAST PATH, no additional "meta data" write needed.
  
  submit to lower level device.


MD most of the time just _needs_ the additional "meta data" writes.
DRBD most of the time does not (unless you have completely random
writes, always requesting an extent not yet/anymore in the activity log.

I'm in the process of generalizing DRBDs approach to allow more than one
"al extent" to change during a "prepare" step, and cover several such changes
in one "al transaction", so the number of meta data updates can be
reduced even further.

adopting this "activity log" approach would make MD even better, IMO.

Thanks,

	Lars

  reply	other threads:[~2009-05-02 17:29 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-30 11:26 [PATCH 00/16] DRBD: a block device for HA clusters Philipp Reisner
2009-04-30 11:26 ` [PATCH 01/16] DRBD: major.h Philipp Reisner
2009-04-30 11:26   ` [PATCH 02/16] DRBD: lru_cache Philipp Reisner
2009-04-30 11:26     ` [PATCH 03/16] DRBD: activity_log Philipp Reisner
2009-04-30 11:26       ` [PATCH 04/16] DRBD: bitmap Philipp Reisner
2009-04-30 11:26         ` [PATCH 05/16] DRBD: request Philipp Reisner
2009-04-30 11:26           ` [PATCH 06/16] DRBD: userspace_interface Philipp Reisner
2009-04-30 11:26             ` [PATCH 07/16] DRBD: internal_data_structures Philipp Reisner
2009-04-30 11:26               ` [PATCH 08/16] DRBD: main Philipp Reisner
2009-04-30 11:26                 ` [PATCH 09/16] DRBD: receiver Philipp Reisner
2009-04-30 11:26                   ` [PATCH 10/16] DRBD: proc Philipp Reisner
2009-04-30 11:26                     ` [PATCH 11/16] DRBD: worker Philipp Reisner
2009-04-30 11:26                       ` [PATCH 12/16] DRBD: variable_length_integer_encoding Philipp Reisner
2009-04-30 11:26                         ` [PATCH 13/16] DRBD: misc Philipp Reisner
2009-04-30 11:26                           ` [PATCH 14/16] DRBD: tracepoint_probes Philipp Reisner
2009-04-30 11:26                             ` [PATCH 15/16] DRBD: documentation Philipp Reisner
2009-04-30 11:26                               ` [PATCH 16/16] DRBD: final Philipp Reisner
2009-05-02 15:45                         ` [PATCH 12/16] DRBD: variable_length_integer_encoding James Bottomley
2009-05-02 17:29                           ` Lars Ellenberg
2009-05-02 15:44                     ` [PATCH 10/16] DRBD: proc James Bottomley
2009-05-02 20:23                       ` Lars Ellenberg
2009-05-02 15:41         ` [PATCH 04/16] DRBD: bitmap James Bottomley
2009-05-02 17:28           ` Lars Ellenberg [this message]
2009-05-03  5:21             ` Neil Brown
2009-05-03  7:38               ` Lars Ellenberg
2009-05-05 17:48               ` Lars Marowsky-Bree
2009-05-05 17:51                 ` James Bottomley
2009-05-05 22:26                 ` Neil Brown
2009-05-01  9:01       ` [PATCH 03/16] DRBD: activity_log Andrew Morton
2009-05-02 17:00         ` Lars Ellenberg
2009-05-01  8:59     ` [PATCH 02/16] DRBD: lru_cache Andrew Morton
2009-05-02 15:26       ` Lars Ellenberg
2009-05-02 17:58         ` Andrew Morton
2009-05-02 18:13           ` Lars Ellenberg
2009-05-02 18:26             ` Andrew Morton
2009-05-02 19:39               ` Lars Ellenberg
2009-05-02 23:51     ` Kyle Moffett
2009-05-03  6:27       ` Lars Ellenberg
2009-05-03 14:06         ` Kyle Moffett
2009-05-03 22:48           ` Lars Ellenberg
2009-05-04  0:48             ` Kyle Moffett
2009-05-04  1:01               ` Kyle Moffett
2009-05-04 16:12                 ` Rik van Riel
2009-05-04 16:15                   ` Lars Ellenberg
2009-05-01  8:59   ` [PATCH 01/16] DRBD: major.h Andrew Morton
2009-05-01  8:59 ` [PATCH 00/16] DRBD: a block device for HA clusters Andrew Morton
2009-05-01 11:15   ` Lars Marowsky-Bree
2009-05-01 13:14     ` Dave Jones
2009-05-01 19:14       ` Andrew Morton
2009-05-05  4:05     ` Christian Kujau
2009-05-02  7:33   ` Bart Van Assche
2009-05-03  5:36     ` Willy Tarreau
2009-05-03  5:40       ` david
2009-05-03 14:21         ` James Bottomley
2009-05-03 14:36           ` david
2009-05-03 14:45             ` James Bottomley
2009-05-03 14:56               ` david
2009-05-03 15:09                 ` James Bottomley
2009-05-03 15:22                   ` david
2009-05-03 15:38                     ` James Bottomley
2009-05-03 15:48                       ` david
2009-05-03 16:02                         ` James Bottomley
2009-05-03 16:13                           ` david
2009-05-04  8:28               ` Philipp Reisner
2009-05-04 17:24                 ` James Bottomley
2009-05-05  8:21                   ` Philipp Reisner
2009-05-05 14:09                     ` James Bottomley
2009-05-05 15:56                       ` Philipp Reisner
2009-05-05 17:05                         ` James Bottomley
2009-05-05 21:45                           ` Philipp Reisner
2009-05-05 21:53                             ` James Bottomley
2009-05-06  8:17                               ` Philipp Reisner
2009-05-05 15:03                     ` Bart Van Assche
2009-05-05 15:57                       ` Philipp Reisner
2009-05-05 17:38                         ` Lars Marowsky-Bree
2009-05-03 10:06       ` Philipp Reisner
2009-05-03 10:15         ` Thomas Backlund
2009-05-03  5:53 ` Neil Brown
2009-05-03  6:24   ` david
2009-05-03  8:29   ` Lars Ellenberg
2009-05-03 11:00     ` Neil Brown
2009-05-03 21:32       ` Lars Ellenberg
2009-05-04 16:12         ` Lars Marowsky-Bree
2009-05-05 22:08         ` Lars Ellenberg
2009-05-15 12:10 Philipp Reisner
2009-05-15 12:10 ` [PATCH 01/16] DRBD: major.h Philipp Reisner
2009-05-15 12:10   ` [PATCH 02/16] DRBD: lru_cache Philipp Reisner
2009-05-15 12:10     ` [PATCH 03/16] DRBD: activity_log Philipp Reisner
2009-05-15 12:10       ` [PATCH 04/16] DRBD: bitmap Philipp Reisner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090502172838.GC6466@racke \
    --to=lars.ellenberg@linbit.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=bart.vanassche@gmail.com \
    --cc=davej@redhat.com \
    --cc=gregkh@suse.de \
    --cc=jens.axboe@oracle.com \
    --cc=knikanth@suse.de \
    --cc=kyle@moffetthome.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lmb@suse.de \
    --cc=nab@linux-iscsi.org \
    --cc=neilb@suse.de \
    --cc=philipp.reisner@linbit.com \
    --cc=sam@ravnborg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.