* SVN Branch Description Format
@ 2012-03-11 10:59 Andrew Sayers
  2012-03-18 23:18 ` Steven Michalske
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Andrew Sayers @ 2012-03-11 10:59 UTC (permalink / raw)
  To: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, Jonathan Nieder

(CCing everyone from the "Approaches to SVN to Git conversion" thread)

A recent discussion of the remote helper for Subversion turned into a plan for
separating out revision->commit mapper projects.  Just as content import is
split into three parts (svn-fe, fast-import protocol, git-fast-import), r2c
mapping can be split into SVN export, protocol, and git import.  I've included
an initial draft protocol below.

My approximate roadmap from here is to add test cases and a reference
implementation to the protocol, then create an SVN exporter from my existing
proof-of-concept, then finally write a git importer.  My proof-of-concept work
is at https://github.com/andrew-sayers/Proof-of-concept-History-Converter

All comments gladly welcomed, but I would particularly appreciate suggestions
about how to specify the format more clearly, example SVN use cases that can't
be readily expressed with the language as specified, and suggestions about
whether forward compatibility should be built in.

	- Andrew


SVN Branch Description Format v0.1
==================================
Andrew Sayers <andrew-svn@pileofstuff.org>

This file specifies a simple format for describing how SVN revisions
apply to branches.  The goals of this format are to be a communication
protocol for programs operating on SVN history and to provide a common
language for developers discussing unusual SVN use cases.

Overview
--------

The SVN Branch Description Format (``BDF'') is a line-based ASCII
format, where each line specifies one action in the SVN history.  An
SVN history file contains two sections: the first is the ``header''
section that provides information about the file as a whole, the
second is the ``body'' section that provides information about things
that happened in specific revisions.

Any line in a BDF file that begins with a hash or semicolon character,
or that contains only whitespace (including empty lines) is considered
to be a comment, and must be ignored.  Clients should use a leading
'#' to create ordinary comments to be read by users, and a leading ';'
for commented-out actions the client wishes to suggest to a user.
This makes it easier for a user to accept/reject actions by
search-and-replace.
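
The comment rules above could be sketched in Python roughly as follows.
This is only an illustration, not part of the format: the function name
`classify_line` and its three labels are assumptions, and it reads
"begins with" strictly, so a `#` or `;` preceded by whitespace is not
treated as a comment.

```python
def classify_line(line: str) -> str:
    """Classify one BDF line as 'comment', 'suggestion' or 'action'.

    Both 'comment' and 'suggestion' lines must be ignored when reading
    actions; the distinction only matters to tools that show suggested
    actions to a user.
    """
    if line.strip() == "":
        return "comment"       # empty or whitespace-only line
    if line.startswith("#"):
        return "comment"       # ordinary comment for human readers
    if line.startswith(";"):
        return "suggestion"    # commented-out action proposed to the user
    return "action"
```

A user accepting a suggestion would simply delete the leading `;`, after
which the same line is classified as an action.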

Here is an example file:

    # the file must begin by specifying the revision of the format being used:
    This is a version 0.1 SVN Branch Description file
    Body:
    In r1, create branch "trunk"
    In r10, create branch "branches/1.0" from "trunk" r9
    In r20, create tag "tags/version_1" from "branches/1.0" r19
    In r20, deactivate "tags/version_1"
    # User intervention is required to confirm this action:
    ; In r25, merge "trunk" up to r24 into "branches/1.0"

Header section
--------------

The header section begins with a version identifier, continues with
any number of private actions, and ends with the header-body boundary
marker.

Any unrecognised action in the header should be treated as a fatal
error.

Version identifier
~~~~~~~~~~~~~~~~~~

The first action in this section must exactly match the following:

    This is a version 0.1 SVN Branch Description file

Later versions of the format will use a different identifier here.
This might be another number (e.g. ``version 3''), a non-numeric
identifier (e.g. ``experimental version''), a different format name
(e.g. ``SVN-BDF file''), or anything else.  Clients should treat anything
other than the exact string above as a fatal error.

Private actions
~~~~~~~~~~~~~~~

Private actions begin with an opening parenthesis and end with a closing
parenthesis.  For example:

    (my-great-parser will write debugging information to "debug.log")

These actions are intended for internal use by clients using BDF as a
storage format.  Clients must begin any private action they create
with a client-specific identifier (`my-great-parser` in the above
example), and must ignore any private action that does not begin with
their identifier.  These requirements are designed to ensure that
clients do not use private actions to communicate with other clients -
please send such messages out-of-band (e.g. through arguments to a
command-line program), or propose a revision to the format so that all
clients can communicate using a standard language.
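
As a rough sketch of the ownership rule, a client might decide whether a
private action belongs to it like this.  The function name and the
requirement that the identifier be followed by a space (or be the whole
action) are assumptions made for the example; the format itself only says
the action must begin with the client's identifier.

```python
def private_action_is_ours(line: str, client_id: str) -> bool:
    """Return True if a private action starts with our client identifier.

    Raises ValueError if the line is not a private action at all.
    """
    line = line.rstrip("\r\n")
    if not (line.startswith("(") and line.endswith(")")):
        raise ValueError("not a private action: %r" % line)
    inner = line[1:-1]
    # Assumption: the identifier is delimited by a space, so that
    # "my-great-parser" does not claim "my-great-parser-2 ..." actions.
    return inner == client_id or inner.startswith(client_id + " ")
```

Actions for which this returns False are silently skipped rather than
reported, since they belong to some other client.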

Header-body boundary marker
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The last action in this section must exactly match the following:

    Body:

This marks the end of the header and the start of the body.  The
remainder of the file must be treated as the body section.
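
Putting the header rules together, a reader might split a file into
header and body along these lines.  This is a sketch under assumptions:
the names `split_header`, `VERSION_LINE` and `BODY_MARKER` are invented
for the example, and the input is taken to be a list of lines.

```python
VERSION_LINE = "This is a version 0.1 SVN Branch Description file"
BODY_MARKER = "Body:"


def _is_comment(line: str) -> bool:
    # Blank lines and lines starting with '#' or ';' are ignored.
    return line.strip() == "" or line[0] in "#;"


def split_header(lines):
    """Return (private_actions, body_lines) or raise ValueError."""
    private_actions = []
    seen_version = False
    for index, raw in enumerate(lines):
        line = raw.rstrip("\r\n")
        if _is_comment(line):
            continue
        if not seen_version:
            # The first action must be exactly the version identifier.
            if line != VERSION_LINE:
                raise ValueError("line %d: unsupported version identifier" % (index + 1))
            seen_version = True
        elif line == BODY_MARKER:
            return private_actions, lines[index + 1:]
        elif line.startswith("(") and line.endswith(")"):
            private_actions.append(line)
        else:
            raise ValueError("line %d: unrecognised header action" % (index + 1))
    raise ValueError("missing header-body boundary marker")
```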

Body section
------------

The body section contains zero or more actions as described below.
The following components are widely used in these actions:

Revision identifiers::
  These strings begin with the letter `r`, followed by a digit in the
  range 1-9, then zero or more digits in the range 0-9.  So `r1`,
  `r10` and `r999` are valid revision identifiers, but `1`, `r01` and
  `revision 999` are not.  Revision identifiers are indicated with
  `<revision>` in the definition of an action.

String identifiers::

  These strings begin with a double quote, then contain zero or more
  valid characters, then another double quote.  Valid characters
  include a backslash followed by `\`, `r`, `n`, or `"`, or any
  character other than backslash, carriage return, newline, or double
  quote.  So `"foo"`, `"foo\""` and `"foo\\"` are valid identifiers,
  but `"foo\"`, `"foo""` and `"foo\t"` are not.  Clients must unescape
  string identifiers by converting `\"`, `\\`, `\r` and `\n` to double
  quote, backslash, carriage return and newline respectively.  String
  identifiers are indicated with `<string>` in the definition of an
  action.
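
The two definitions above map fairly directly onto regular expressions.
The following Python sketch shows one possible reading; the names
`REVISION_RE`, `STRING_RE`, `parse_revision` and `unescape_string` are
assumptions made for the example.

```python
import re

# 'r', one digit 1-9, then any number of digits 0-9.
REVISION_RE = re.compile(r"r[1-9][0-9]*")

# A double quote, then escapes (\\, \r, \n, \") or any character other
# than backslash, CR, LF or double quote, then a closing double quote.
STRING_RE = re.compile(r'"(?:\\[\\rn"]|[^\\\r\n"])*"')

_ESCAPES = {'\\"': '"', "\\\\": "\\", "\\r": "\r", "\\n": "\n"}


def parse_revision(token: str) -> int:
    """Convert a revision identifier such as 'r10' to the integer 10."""
    if not REVISION_RE.fullmatch(token):
        raise ValueError("invalid revision identifier: %r" % token)
    return int(token[1:])


def unescape_string(token: str) -> str:
    """Strip the quotes from a string identifier and undo its escapes."""
    if not STRING_RE.fullmatch(token):
        raise ValueError("invalid string identifier: %r" % token)
    return re.sub(r'\\[\\rn"]', lambda m: _ESCAPES[m.group(0)], token[1:-1])
```

Escapes are replaced in a single left-to-right pass, so an escaped
backslash followed by `n` stays a backslash and an `n` rather than
becoming a newline.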

These are the valid actions in the body section:

    In <revision>, create branch <string>
    In <revision>, create branch <string> as <string>
    In <revision>, create branch <string> from <string> <revision>
    In <revision>, create branch <string> as <string> from <string> <revision>

    In <revision>, create tag <string>
    In <revision>, create tag <string> as <string>
    In <revision>, create tag <string> from <string> <revision>
    In <revision>, create tag <string> as <string> from <string> <revision>

    In <revision>, deactivate <string>
    In <revision>, delete <string>

    In <revision>, merge <string> up to <revision> into <string>
    In <revision>, cherry-pick <string> <revision> into <string>
    In <revision>, cherry-pick <string> <revision> to <revision> into <string>
    In <revision>, revert <string> <revision> from <string>
    In <revision>, revert <string> <revision> to <revision> from <string>

    In <revision>, ignore <string>
    In <revision>, amend <string>, keeping the old log message
    In <revision>, amend <string>, keeping the new log message
    In <revision>, amend <string>, keeping both log messages

All actions begin with `In <revision>,`.  This identifies the revision
the action applies to.  Each action must refer to a revision greater
than or equal to that of the previous action, except the first action,
which must refer to revision 1 or greater.  Clients should treat
revision numbers that are too low as fatal errors.  Clients may treat
revision numbers as errors if they are not present in the repository
(i.e. higher than the maximum revision or not part of the
sub-repository being examined).
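
The ordering rule amounts to checking that the `In <revision>` numbers
never decrease.  A minimal Python sketch, assuming the revision numbers
have already been extracted into a list of integers (the function name is
an invention for the example):

```python
def check_revision_order(revisions):
    """Raise ValueError if action revisions are out of order.

    `revisions` holds the `In <revision>` number of each body action,
    in file order.
    """
    previous = 1  # the first action must refer to r1 or later
    for position, revision in enumerate(revisions, 1):
        if revision < previous:
            raise ValueError(
                "action %d: r%d is lower than the minimum r%d"
                % (position, revision, previous)
            )
        previous = revision
```

Several actions may share a revision, so equal neighbours are accepted.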

Create actions
~~~~~~~~~~~~~~

The `create branch` and `create tag` actions identify a branch or tag
being created in the specified revision.  The first string identifier
indicates the SVN directory name associated with the branch.  The `as
<string>` identifier indicates the name the user intended for the
branch or tag (if unspecified, clients should assume the user intended
the branch to have the same name as the directory).  The `from
<string> <revision>` identifiers indicate the directory associated
with a previously-named branch, and the revision number used as the
parent for this branch or tag (if unspecified, clients should assume
the user intended this to be a trunk).  Here are some examples:

    # Create a trunk branch named "trunk" from directory "trunk":
    In r1, create branch "trunk"

    # Create a branch named "foo" from directory "branches/foo",
    # whose parent is the directory "trunk" as it was in revision 5:
    In r10, create branch "branches/foo" as "foo" from "trunk" r5

    # Create a tag named "1.0" from directory "tags/1.0",
    # whose parent is the directory "branches/foo" as it was in revision 15:
    In r20, create tag "tags/1.0" as "1.0" from "branches/foo" r15

Clients should treat it as a fatal error if the `from` revision is
greater than the current revision, or if the `from` branch was not
active in the specified revision (i.e. it hadn't been created, had
been deactivated or had been deleted).
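
One way to read a `create` action and apply the revision check is
sketched below in Python.  It is deliberately incomplete: string
identifiers are not unescaped, and the check that the `from` branch was
active is omitted because it needs the full branch history.  The names
`CREATE_RE` and `parse_create` are assumptions.

```python
import re

_STR = r'"((?:\\[\\rn"]|[^\\\r\n"])*)"'   # a <string>, capturing its contents
_REV = r"r([1-9][0-9]*)"                  # a <revision>, capturing its number

CREATE_RE = re.compile(
    rf"In {_REV}, create (branch|tag) {_STR}(?: as {_STR})?(?: from {_STR} {_REV})?"
)


def parse_create(line: str) -> dict:
    """Parse a create action; raise ValueError if it is malformed."""
    match = CREATE_RE.fullmatch(line)
    if match is None:
        raise ValueError("not a create action: %r" % line)
    revision = int(match.group(1))
    directory = match.group(3)
    # Without `as`, the name defaults to the directory name.
    name = match.group(4) if match.group(4) is not None else directory
    parent = None  # no `from` clause: assume a trunk
    if match.group(5) is not None:
        parent = (match.group(5), int(match.group(6)))
        if parent[1] > revision:
            raise ValueError("`from` revision is greater than the current revision")
    return {"revision": revision, "kind": match.group(2),
            "directory": directory, "name": name, "parent": parent}
```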

If the `from` branch was active but not changed in the specified
revision, clients should behave as if the user specified the last
revision in which the branch was edited prior to the specified
revision.  If the `from` revision is equal to the current revision,
clients should warn the user if the `from` branch was edited in that
revision.  These requirements make it significantly easier for users
to manually edit a BDF file.
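
The "last revision in which the branch was edited" rule can be pictured
as a lookup in a sorted list.  The sketch below assumes the client
already has, for each branch, the ascending list of revisions that
changed it; that list and the function name are hypothetical.

```python
from bisect import bisect_right


def resolve_parent_revision(changed_revisions, requested):
    """Return the latest revision <= `requested` that changed the branch.

    `changed_revisions` must be sorted in ascending order.
    """
    index = bisect_right(changed_revisions, requested)
    if index == 0:
        raise ValueError("branch was not changed at or before r%d" % requested)
    return changed_revisions[index - 1]
```

With a trunk changed in r1, r4 and r9, a user who writes `from "trunk"
r7` gets r4, which is why hand-edited files do not need exact revision
numbers.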

Clients should treat it as a fatal error if the name specified for a
branch or tag is currently in use.  A name is currently in use if it
was previously declared with one of the `create` actions, and has not
yet been deleted with the `delete` action (the `deactivate` action
must not be treated as removing a branch name).  Clients that check
for names currently in use must treat branch and tag names as being in
different namespaces - in other words, it is legal to have a branch
named `foo` at the same time as a tag named `foo`.
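
A small Python sketch of the name-in-use bookkeeping follows.  The class
and method names are assumptions for the example; the only behaviour
taken from the text is that branches and tags have separate namespaces,
that `deactivate` keeps a name reserved, and that `delete` frees it.

```python
class NameRegistry:
    """Track which branch and tag names are currently in use."""

    def __init__(self):
        self._in_use = {"branch": set(), "tag": set()}
        self._by_directory = {}

    def create(self, kind, directory, name=None):
        name = directory if name is None else name
        if name in self._in_use[kind]:
            raise ValueError("%s name %r is already in use" % (kind, name))
        self._in_use[kind].add(name)
        self._by_directory[directory] = (kind, name)

    def deactivate(self, directory):
        # Deactivation does not free the name, so nothing changes here.
        if directory not in self._by_directory:
            raise KeyError(directory)

    def delete(self, directory):
        kind, name = self._by_directory.pop(directory)
        self._in_use[kind].discard(name)
```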

Delete actions
~~~~~~~~~~~~~~

The `deactivate` and `delete` actions identify a branch or tag that
should no longer be considered active.  The string identifier
indicates the directory name associated with the branch.

The `deactivate` action indicates that a branch or tag is still of
historical interest, but should no longer be updated when changes are
made.  For example, a tag might be deactivated after it is created, to
indicate that the tag should be considered immutable even though
changes were accidentally made later on.

The `delete` action indicates that a branch or tag is no longer of any
interest, and can be ignored completely.  For example, a branch might
be deleted if it had been fully merged into another branch before the
directory was removed.

Merge actions
~~~~~~~~~~~~~

The `merge`, `cherry-pick` and `revert` actions identify changes in
the relationship between two branches.  The first string identifier
indicates the directory name for the action's source.  The last string
identifier indicates the directory name for the action's destination.
The revision identifier(s) indicate the revision(s) for the action's
source.

The `merge` action indicates that all revisions up to the specified
point have been applied from the source to the destination.  The
`cherry-pick` and `revert` actions identify an inclusive range of
source revisions.  The `merge` and `cherry-pick` actions indicate that
revisions which previously had not been merged have now been merged.
The `revert` action indicates that revisions which had previously been
merged are no longer considered merged.

If two revisions are specified, clients must treat it as a fatal
error if the second revision is less than or equal to the first.

Clients must treat it as a fatal error if the source branch was not
active in the specified revision.  If two revisions are specified,
clients must treat it as a fatal error unless the source branch was
active in both revisions.

For the `merge` action, if the source branch was active but not
changed in the specified revision, clients must behave as if the user
specified the highest revision before the specified revision in which
the branch was changed.  If the source revision is equal to the
current revision, clients should warn the user if the source branch was
changed in that revision.

For the two-revision `revert` and `cherry-pick` actions, clients must
treat the second revision as in the previous paragraph.  If the branch
was not changed in the first revision, clients must behave as if the
user specified the lowest revision after the specified revision in
which the branch was changed.

For all `revert` and `cherry-pick` actions, clients must treat it as a
fatal error if no revisions in the specified range changed the branch.
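
The range adjustment for two-revision `cherry-pick` and `revert` actions
might look like the following Python sketch.  It assumes the client has
an ascending list of the revisions that changed the source branch; that
list and the name `normalise_range` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def normalise_range(changed_revisions, first, last):
    """Snap an inclusive revision range to revisions that changed the branch.

    The first revision moves up to the lowest changing revision at or
    after it; the last moves down to the highest at or before it.
    """
    if last <= first:
        raise ValueError("second revision must be greater than the first")
    low = bisect_left(changed_revisions, first)
    high = bisect_right(changed_revisions, last)
    if low >= high:
        raise ValueError("no revision in r%d to r%d changed the branch" % (first, last))
    return changed_revisions[low], changed_revisions[high - 1]
```

For a branch changed in r3, r6, r9 and r12, the range r4 to r10 becomes
r6 to r9, while r7 to r8 is rejected because nothing in it changed the
branch.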

Note: the above requirements for altering revision numbers make it
significantly easier for users to manually edit an SVN Branch
Description file.

If a `merge` action specifies a revision less than or equal to any
earlier merge action that has not been reverted, clients must treat it
as a fatal error.  Clients should not treat it as an error if the
merge includes revisions that have previously been cherry-picked.

If a `cherry-pick` action includes the first revision in which the
branch was changed after any earlier merge action, or if the
destination branch was originally created from the source branch and
the `cherry-pick` action includes the next revision in which the
branch was changed, clients should warn the user that they probably
meant to merge.

If a `revert` action specifies a revision that has not been
cherry-picked, or is greater than any earlier merge action that has
not been reverted, clients should treat it as a fatal error.

Clients may warn about any merge actions they feel are unusual, but
should not treat anything as a fatal error unless specified above.

Note: there is no `unmerge` action.  See the `merge` examples in the
library for how unmerging is achieved.

Edit actions
~~~~~~~~~~~~

The `ignore` and `amend` actions identify a directory and revision
which should not create a new revision in the history.  For example,
an accidental `svn commit` followed immediately by an `svn commit`
reverting it might simply be ignored.  The string identifier indicates
the name of the directory to be edited.

The `ignore` action indicates that the client must act as if no
changes were made to the directory in the specified revision, whether
or not changes were actually made.  Note that this only includes
changes made in the SVN repository itself, not those indicated by
actions in the SVN history file.

The `amend` actions indicate that the state of the branch in the
current revision should overwrite the most recent revision for that
branch.  Clients may warn, but should not treat it as a fatal error,
if the branch was not actually changed in the current revision.  When
overwriting the most recent revision, clients must retain the revision
log from the previous revision if the action specifies `keeping the old
log message`, replace the revision log entirely with the log message
for the current revision if the action specifies `keeping the new log
message`, or concatenate the new log message to the end of the old one
if the action specifies `keeping both log messages`.  Clients may
reformat log messages when keeping both, but are reminded of the need
for messages to look sensible when long chains of amendments are
created.
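
The three `amend` variants differ only in how the log message is chosen.
A minimal Python sketch, in which the function name and the blank-line
separator used when keeping both messages are assumptions (the format
leaves the exact layout to the client):

```python
def amended_log(old_log: str, new_log: str, mode: str) -> str:
    """Return the log message for an amended revision."""
    if mode == "keeping the old log message":
        return old_log
    if mode == "keeping the new log message":
        return new_log
    if mode == "keeping both log messages":
        # Assumption: separate the messages with one blank line so that
        # long chains of amendments stay readable.
        return old_log.rstrip("\n") + "\n\n" + new_log
    raise ValueError("unknown amend mode: %r" % mode)
```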

Clients should treat it as a fatal error if an edit action is applied
to a branch in the same revision as the branch is created.


* Re: SVN Branch Description Format
  2012-03-11 10:59 SVN Branch Description Format Andrew Sayers
@ 2012-03-18 23:18 ` Steven Michalske
  2012-03-19  1:28   ` Andrew Sayers
  2012-03-19  1:28 ` Licensing a file format (was Re: SVN Branch Description Format) Andrew Sayers
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Steven Michalske @ 2012-03-18 23:18 UTC (permalink / raw)
  To: Andrew Sayers
  Cc: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, Jonathan Nieder

Consider .SVN_BDF or .SVN.BDF instead of .BDF

I worry about a branch or tag containing a "
Can subversion contain a "?

Steve
On Mar 11, 2012, at 3:59 AM, Andrew Sayers wrote:

> (CCing everyone from the "Approaches to SVN to Git conversion" thread)
> 
> A recent discussion of the remote helper for Subversion turned into a plan for
> separating out revision->commit mapper projects.  Just as content import is
> split into three parts (svn-fe, fast-import protocol, git-fast-import), r2c
> mapping can be split into SVN export, protocol, and git import.  I've included
> an initial draft protocol below.
> 
> My approximate roadmap from here is to add test cases and a reference
> implementation to the protocol, then create an SVN exporter from my existing
> proof-of-concept, then finally write a git importer.  My proof-of-concept work
> is at https://github.com/andrew-sayers/Proof-of-concept-History-Converter
> 
> All comments gladly welcomed, but I would particularly appreciate suggestions
> about how to specify the format more clearly, example SVN use cases that can't
> be readily expressed with the language as specified, and suggestions about
> whether forward compatibility should be built in.
> 
> 	- Andrew
> 
> 
> SVN Branch Description Format v0.1
> ==================================
> Andrew Sayers <andrew-svn@pileofstuff.org>
> 
> This file specifies a simple format for describing how SVN revisions
> apply to branches.  The goals of this format are to be a communication
> protocol for programs operating on SVN history and to provide a common
> language for developers discussing unusual SVN use cases.
> 
> Overview
> --------
> 
> The SVN Branch Description Format (``BDF'') is a line-based ASCII
> format, where each line specifies one action in the SVN history.  An
> SVN history file contains two sections: the first is the ``header''
> section that provides information about the file as a whole, the
> second is the ``body'' section that provides information about things
> that happened in specific revisions.
> 
> Any line in a BDF file that begins with a hash or semicolon character,
> or that contains only whitespace (including empty lines) is considered
> to be a comment, and must be ignored.  Clients should use a leading
> '#' to create ordinary comments to be read by users, and a leading ';'
> for commented-out actions the client wishes to suggest to a user.
> This makes it easier for a user to accept/reject actions by
> search-and-replace.
> 
> Here is an example file:
> 
>    # the file must begin by specifying the revision of the format being used:
>    This is a version 0.1 SVN Branch Description file
>    Body:
>    In r1, create branch "trunk"
>    In r10, create branch "branches/1.0" from "trunk" r9
>    In r20, create tag "tags/version_1" from "branches/1.0" r19
>    In r20, deactivate "tags/version_1"
>    # User intervention is required to confirm this action:
>    ; In r25, merge "trunk" r24 into "branches/1.0"
> 
> Header section
> --------------
> 
> The header section begins with a version identifier, continues with
> any number of private actions, and ends with the header-body boundary
> marker.
> 
> Any unrecognised action in the header should be treated as a fatal
> error.
> 
> Version identifier
> ~~~~~~~~~~~~~~~~~~
> 
> The first action in this section must exactly match the following:
> 
>    This is a version 0.1 SVN Branch Description file
> 
> Later versions of the format will use a different identifier here.
> This might be another number (e.g. ``version 3''), a non-numeric
> identifier (e.g. ``experimental version''), a different format name
> (e.g. ``SVN-BDF file''), or anything else.  Clients should treat anything
> other than the exact string above as a fatal error.
> 
> Private actions
> ~~~~~~~~~~~~~~~
> 
> Private actions begin with an open bracket and end with a close
> bracket.  For example:
> 
>    (my-great-parser will write debugging information to "debug.log")
> 
> These actions are intended for internal use by clients using BDF as a
> storage format.  Clients must begin any private action they create
> with a client-specific identifier (`my-great-parser` in the above
> example), and must ignore any private action that does not begin with
> their identifier.  These requirements are designed to ensure that
> clients do not use private actions to communicate with other clients -
> please send such messages out-of-band (e.g. through arguments to a
> command-line program), or propose a revision to the format so that all
> clients can communicate using a standard language.
> 
> Header-body boundary marker
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> The last action in this section must exactly match the following:
> 
>    Body:
> 
> This serves only to indicate that the header is finishing, and the
> body beginning.  The remainder of the file must be treated as the body
> section.
> 
> Body section
> ------------
> 
> The body section contains zero or more actions as described below.
> The following components are widely used in these actions:
> 
> Revision identifiers::
>  These strings begin with the letter `r`, followed by a number in the
>  range 1-9, then zero or more numbers in the range 0-9.  So `r1`,
>  `r10` and `r999` are valid revision identifiers, but `1`, `r01` and
>  `revision 999` are not.  Revision identifiers are indicated with
>  `<revision>` in the definition of an action.
> 
> String identifiers::
> 
>  These strings begin with a double quote, then contain zero or more
>  valid characters, then another double quote.  Valid characters
>  include a backslash followed by `\`, `r`, `n`, or `"`, or any
>  character other than backslash, carriage return, newline, or double
>  quote.  So `"foo"`, `"foo\""` and `"foo\\"` are valid identifiers,
>  but `"foo\"`, `"foo""` and `"foo\t"` are not.  Clients must unescape
>  string identifiers by converting `\"`, `\\`, `\r` and `\n` to double
>  quote, backslash, carriage return and newline respectively.  String
>  identifiers are indicated with `<string>` in the definition of an
>  action.
> 
> These are the valid actions in the body section:
> 
>    In <revision>, create branch <string>
>    In <revision>, create branch <string> as <string>
>    In <revision>, create branch <string> from <string> <revision>
>    In <revision>, create branch <string> as <string> from <string> <revision>
> 
>    In <revision>, create tag <string>
>    In <revision>, create tag <string> as <string>
>    In <revision>, create tag <string> from <string> <revision>
>    In <revision>, create tag <string> as <string> from <string> <revision>
> 
>    In <revision>, deactivate <string>
>    In <revision>, delete <string>
> 
>    In <revision>, merge <string> up to <revision> into <string>
>    In <revision>, cherry-pick <string> <revision> into <string>
>    In <revision>, cherry-pick <string> <revision> to <revision> into <string>
>    In <revision>, revert <string> <revision> from <string>
>    In <revision>, revert <string> <revision> to <revision> from <string>
> 
>    In <revision>, ignore <string>
>    In <revision>, amend <string>, keeping the old log message
>    In <revision>, amend <string>, keeping the new log message
>    In <revision>, amend <string>, keeping both log messages
> 
> All actions begin with `In <revision>,`.  This identifies the revision
> the action applies to.  Each action must refer to a revision greater
> than or equal to the previous action, except the first action which
> must be greater than or equal to revision 1.  Clients should treat
> revision numbers that are too low as fatal errors.  Clients may treat
> revision numbers as errors if they are not present in the repository
> (i.e. higher than the maximum revision or not part of the
> sub-repository being examined)
> 
> Create actions
> ~~~~~~~~~~~~~~
> 
> The `create branch` and `create tag` actions identify a branch or tag
> being created in the specified revision.  The first string identifier
> indicates the SVN directory name associated with the branch.  The `as
> <string>` identifier indicates the name the user intended for the
> branch or tag (if unspecified, clients should assume the user intended
> the branch to have the same name as the directory).  The `from
> <string> <revision>` identifiers indicate the directory associated
> with a previously-named branch, and the revision number used as the
> parent for this branch or tag (if unspecified, clients should assume
> the user intended this to be a trunk).  Here are some examples:
> 
>    # Create a trunk branch named "trunk" from directory "trunk":
>    In r1, create branch "trunk"
> 
>    # Create a branch named "branches/foo" from directory "branches/foo",
>    # whose parent is the directory "trunk" as it was in revision 5:
>    In r10, create branch "branches/foo" as "foo" from "trunk" r5
> 
>    # Create a tag named "1.0" from directory "tags/1.0",
>    # whose parent is the directory "branches/foo" as it was in revision 15:
>    In r20, create tag "tags/1.0" as "1.0" from "branches/foo" r15
> 
> Clients should treat it as a fatal error if the `from` revision is
> greater than the current revision, or if the `from` branch was not
> active in the specified revision (i.e. it hadn't been created, had
> been deactivated or had been deleted).
> 
> If the `from` branch was active but not changed in the specified
> revision, clients should behave as if the user specified the last
> revision in which the branch was edited prior to the specified
> revision.  If the `from` revision is equal to the current revision,
> clients should warn the user if the `from` branch was edited in that
> revision.  These requirements make it significantly easier for users
> to manually edit a BDF file.
> 
> Clients should treat it as a fatal error if the name specified for a
> branch or tag is currently in use.  A name is currently in use if it
> was previously declared with one of the `create` actions, and has not
> yet been deleted with the `delete` action (the `deactivate` action
> must not be treated as removing a branch name).  Clients that check
> for names currently in use must treat branch and tag names to be in
> different namespaces - in other words, it is legal to have a branch
> named `foo` at the same time as a tag named `foo`.
> 
> Delete actions
> ~~~~~~~~~~~~~~
> 
> The `deactivate` and `delete` actions identify a branch or tag that
> should no longer be considered active.  The string identifier
> indicates the directory name associated with the branch.
> 
> The `deactivate` action indicates that a branch or tag is still of
> historical interest, but should no longer be updated when changes are
> made.  For example, a tag might be deactivated after it is created, to
> indicate that the tag should be considered immutable even though
> changes were accidentally made later on.
> 
> The `delete` action indicates that a branch or tag is no longer of any
> interest, and can be ignored completely.  For example, a branch might
> be deleted if it had been fully merged into another branch before the
> directory was removed.
> 
> Merge actions
> ~~~~~~~~~~~~~
> 
> The `merge`, `cherry-pick` and `revert` actions identify changes in
> the relationship between two branches.  The first string identifier
> indicates the directory name for the action's source.  The last string
> identifier indicates the directory name for the action's destination.
> The revision identifier(s) indicates the revision(s) for the action's
> source.
> 
> The `merge` action indicates that all revisions up to the specified
> point have been applied from the source to the destination.  The
> `cherry-pick` and `revert` actions identify an inclusive set of
> revisions that have been applied from the source to the destination.  The
> `merge` and `cherry-pick` actions indicate that revisions that
> previously had not been merged now have been merged.  The `revert`
> action indicates that revisions which had previously been merged have
> no longer been merged.
> 
> If two revisions were specified, clients must treat it as a fatal
> error if the second revision is less than or equal to the first.
> 
> Clients must treat it as a fatal error if the `from` branch was not
> active in the specified revision.  If two revisions are specified,
> clients must treat it as a fatal error unless the `from` branch was
> active in both revisions.
> 
> For the `merge` action, if the `from` branch was active but not
> changed in the specified revision, clients must behave as if the user
> specified the highest revision before the specified revision in which
> the branch was changed.  If the `from` revision is equal to the
> current revision, clients should warn the user if the from branch was
> changed in that revision.
> 
> For the two-revision `revert` and `cherry-pick` actions, clients must
> treat the second revision as in the previous paragraph.  If the branch
> was not changed in the first revision, clients must behave as if the
> user specified the lowest revision after the specified revision in
> which the branch was changed.
> 
> For all `revert` and `cherry-pick` actions, clients must treat it as a
> fatal error if no revisions in the specified range changed the branch.
> 
> Note: the above requirements for altering revision numbers make it
> significantly easier for users to manually edit an SVN Branch
> Description file.
> 
> If a `merge` action specifies a revision less than or equal to any
> earlier merge action that has not been reverted, clients must treat it
> as a fatal error.  Clients should not treat it as an error if the
> merge includes revisions that have previously been cherry-picked.
> 
> If a `cherry-pick` action includes the first revision in which the
> branch was changed after any earlier merge action, or if the
> destination branch was originally created from the source branch and
> the `cherry-pick` action includes the next revision in which the
> branch was changed, clients should warn the user that they probably
> meant to merge.
> 
> If a `revert` action specifies a revision that has not been
> cherry-picked, or is greater than any earlier merge action that has
> not been reverted, clients should treat it as a fatal error.
> 
> Clients may warn about any merge actions they feel are unusual, but
> should not treat anything as a fatal error unless specified above.
> 
> Note: there is no `unmerge` action.  See the `merge` examples in the
> library for how unmerging is achieved.
> 
> Edit actions
> ~~~~~~~~~~~~
> 
> The `ignore` and `amend` actions identify a directory and revision
> which should not create a new revision in the history.  For example,
> an accidental `svn commit` followed immediately by an `svn commit`
> reverting it might simply be ignored.  The string identifier indicates
> the name of the directory to be edited.
> 
> The `ignore` action indicates that the client must act as if no
> changes were made to the directory in the specified revision, whether
> or not changes were actually made.  Note that this only includes
> changes made in the SVN repository itself, not those indicated by
> actions in the SVN history file.
> 
> The `amend` actions indicate that the state of the branch in the
> current revision should overwrite the most recent revision for that
> branch.  Clients may warn, but should not treat it as a fatal error,
> if the branch was not actually changed in the current revision.  When
> overwriting the most recent revision, clients must retain the revision
> log from the previous revision if the action specifies `keeping the old
> log message`, replace the revision log entirely with the log message
> for the current revision if the action specifies `keeping the new log
> message`, or concatenate the new log message to the end of the old one
> if the action specifies `keeping both log messages`.  Clients may
> reformat log messages when keeping both, but are reminded of the need
> for messages to look sensible when long chains of amendments are
> created.
> 
> Clients should treat it as a fatal error if an edit action is applied
> to a branch in the same revision as the branch is created.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: SVN Branch Description Format
  2012-03-18 23:18 ` Steven Michalske
@ 2012-03-19  1:28   ` Andrew Sayers
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Sayers @ 2012-03-19  1:28 UTC (permalink / raw)
  To: Steven Michalske
  Cc: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, Jonathan Nieder

On 18/03/12 23:18, Steven Michalske wrote:
> Consider .SVN_BDF or .SVN.BDF instead of .BDF
> 
> I worry about a branch or tag containing a "
> Can subversion contain a "?
> 
> Steve

I had a look into valid paths during the week, but hadn't checked quote
marks specifically.  Thankfully it seems to work:

mkdir test_dir
cd test_dir
svnadmin create repo
mkdir checkout
svn checkout "file://$(pwd)/repo/" checkout/
cd checkout/
mkdir '"'
svn add '"'
svn ci -m 'Created directory "'

I'll add this to the test suite :)

Of course, it wouldn't do for SVN to make things that simple.  [1] seems
to be the official definition of what a valid path looks like, and I've
skipped over some important requirements in the spec.  Most importantly,
redundant '/'s are allowed at the end of a path, and multiple '/'s are
collapsed down to one in SVN, so it seems prudent to import that little
eccentricity into this format.

I could be persuaded about making '.svn-bdf' the recommended extension,
but I'd also be happy to go with a more TLA-friendly name for the whole
thing.  "SVN Branch Format" would lend itself neatly to a three-letter
extension (.sbf) that doesn't appear in Wikipedia's list of file
formats[2], although it still encourages the RAS syndrome[3] I've
repeatedly stumbled over while writing the spec.  "SVN Branching
Language" might work, and unlike "BDF" or "SBF", "SBL" doesn't sound at
all like "PDF" when mumbled indistinctly.  Alternative suggestions are
welcome, with the obvious proviso that this is largely subjective and
I'm going to pick whatever sounds best to my ear :)

I'm approaching a natural break in defining the format, so I'll paste a
new version next week.  After that I'll probably pause the format and
work on the SVN exporter for a bit, so I'll have a structure to continue
building the tests and reference implementation against.

	- Andrew

[1] http://subversion.apache.org/docs/api/latest/group__svn__fs__directories.html#details
[2] http://en.wikipedia.org/wiki/List_of_file_formats
[3] http://en.wikipedia.org/wiki/RAS_syndrome

* Licensing a file format (was Re: SVN Branch Description Format)
  2012-03-11 10:59 SVN Branch Description Format Andrew Sayers
  2012-03-18 23:18 ` Steven Michalske
@ 2012-03-19  1:28 ` Andrew Sayers
  2012-03-19  1:34   ` Jonathan Nieder
  2012-03-23  0:08 ` SVN Branching Language " Andrew Sayers
  2012-03-30  4:06 ` SVN Branch Description Format Ramkumar Ramachandra
  3 siblings, 1 reply; 10+ messages in thread
From: Andrew Sayers @ 2012-03-19  1:28 UTC (permalink / raw)
  To: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, Jonathan Nieder,
	semen.vadishev

(CCing Semen Vadishev as I'd like to know if the SubGit project has any
opinion about this)

If you haven't been following the thread - this is a discussion of a new
format for describing SVN branching/merging/tagging behaviour.  Among
other things, this would let people plug different SVN exporters into
different Git importers.

I'd like advice/opinions from the community about licensing the
specification and reference implementation for this format, because it
seems like establishing an open standard is a bit different to promoting
open source.  I've always thought of copyleft as the tool I use to
promote sharing, but standards get more sharing by abandoning copyleft
and relying on the network effect - forking a standard makes your
product less valuable, unless you're not allowed to use the standard or
you have so much market share you don't have to care about standards in
the first place.

I'm planning to release the spec under a Creative Commons
Attribution-NoDerivs license (i.e. commercial use allowed, changes have
to go through me) and the reference implementation under an MIT license
(i.e. blatant theft of the exact recommended behaviour is encouraged).
This should minimise the barriers for people wanting to implement the
format as specified, and maximise the barriers for people wanting to
subvert the format.  The downside is that it makes life difficult for
everyone if I'm hit by a bus, and makes me less inclined to put some of
the more complex algorithms into the non-copyleft reference implementation.

Just to be clear, the format is one of three parts involved in getting
SVN branches/merges/tags into git.  I plan to release an SVN exporter
and git importer under the GPL, but expect to make a special case for
the format.

So the big question - would you be more inclined to use/contribute to
the SVN Branch Description Format if it had a different license?

	- Andrew

* Re: Licensing a file format (was Re: SVN Branch Description Format)
  2012-03-19  1:28 ` Licensing a file format (was Re: SVN Branch Description Format) Andrew Sayers
@ 2012-03-19  1:34   ` Jonathan Nieder
  2012-03-19 20:31     ` Andrew Sayers
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Nieder @ 2012-03-19  1:34 UTC (permalink / raw)
  To: Andrew Sayers
  Cc: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, semen.vadishev

Hi Andrew,

Andrew Sayers wrote:

> I'm planning to release the spec under a Creative Commons
> Attribution-NoDerivs license
[...]
> So the big question - would you be more inclined to use/contribute to
> the SVN Branch Description Format if it had a different license?

Yes.  By the way, I think fear of forking/discussion of potential
improvements/translation into other languages in the context of
standards is misguided.  If you would like legal protection for your
standard, that is what trademark law is for.

Kind regards,
Jonathan

* Re: Licensing a file format (was Re: SVN Branch Description Format)
  2012-03-19  1:34   ` Jonathan Nieder
@ 2012-03-19 20:31     ` Andrew Sayers
  2012-03-20 22:59       ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Sayers @ 2012-03-19 20:31 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Andrew Sayers, Git Mailing List, Sam Vilain, Stephen Bash,
	Nathan Gray, Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, semen.vadishev

On 19/03/12 01:34, Jonathan Nieder wrote:
> Hi Andrew,
> 
> Andrew Sayers wrote:
> 
>> I'm planning to release the spec under a Creative Commons
>> Attribution-NoDerivs license
> [...]
>> So the big question - would you be more inclined to use/contribute to
>> the SVN Branch Description Format if it had a different license?
> 
> Yes.  By the way, I think fear of forking/discussion of potential
> improvements/translation into other languages in the context of
> standards is misguided.  If you would like legal protection for your
> standard, that is what trademark law is for.
> 
> Kind regards,
> Jonathan

Could you expand on this?  A quick tour of the git codebase suggests
your objection is just to the "no derivatives" bit for documentation,
and not to the MIT license for code?

It sounds like you're saying that forking isn't a big real-world
problem, which I guess makes sense - it'll all work out in the end as
long as a single standard is in everybody's interests.  So the CC-BY
license is my favourite for now.

	- Andrew

* Re: Licensing a file format (was Re: SVN Branch Description Format)
  2012-03-19 20:31     ` Andrew Sayers
@ 2012-03-20 22:59       ` Jeff King
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff King @ 2012-03-20 22:59 UTC (permalink / raw)
  To: Andrew Sayers
  Cc: Jonathan Nieder, Andrew Sayers, Git Mailing List, Sam Vilain,
	Stephen Bash, Nathan Gray, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, semen.vadishev

On Mon, Mar 19, 2012 at 08:31:41PM +0000, Andrew Sayers wrote:

> > Yes.  By the way, I think fear of forking/discussion of potential
> > improvements/translation into other languages in the context of
> > standards is misguided.  If you would like legal protection for your
> > standard, that is what trademark law is for.
> [...]
> 
> Could you expand on this?  A quick tour of the git codebase suggests
> your objection is just to the "no derivatives" bit for documentation,
> and not to the MIT license for code?
> 
> It sounds like you're saying that forking isn't a big real-world
> problem, which I guess makes sense - it'll all work out in the end as
> long as a single standard is in everybody's interests.  So the CC-BY
> license is my favourite for now.

I think the problem is that there are two levels of forking. You want
people to be able to build off of your standard for a number of
legitimate reasons. Perhaps they are publishing a draft proposal of
enhancements. Perhaps they are adapting parts of the content of your
standard to a different domain. Perhaps the original author has become
unresponsive and somebody else wants to pick up maintainership. Those
are all things we do with code, and they help the ecosystem.

What _isn't_ good is somebody modifying your standard and then claiming
that their implementation is "the real SVN History Description" format.

I think Jonathan's point is that CC-BY-ND doesn't allow the legitimate
things in the top paragraph. And your real problem (in the second
paragraph) is not derivatives, but derivatives claiming to be something
they are not (the official standard). And trademarks are the legal tool
for avoiding confusion like that.

In practice, I don't think this kind of name-hijacking is a big deal.
There are many forks of git, and somebody could make a derivative git
that is buggy, interacts badly with existing repository formats, or
interacts badly with other git clients via the network protocol. But
people are usually kind enough not to call their other implementations
"git", and it just works out in practice. So you could probably get by
with just a regular source code license (but I am far from an expert, so
take the appropriate grain of salt).

-Peff

* SVN Branching Language (was Re: SVN Branch Description Format)
  2012-03-11 10:59 SVN Branch Description Format Andrew Sayers
  2012-03-18 23:18 ` Steven Michalske
  2012-03-19  1:28 ` Licensing a file format (was Re: SVN Branch Description Format) Andrew Sayers
@ 2012-03-23  0:08 ` Andrew Sayers
  2012-03-30  4:06 ` SVN Branch Description Format Ramkumar Ramachandra
  3 siblings, 0 replies; 10+ messages in thread
From: Andrew Sayers @ 2012-03-23  0:08 UTC (permalink / raw)
  To: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov,
	Ramkumar Ramachandra, David Barr, Jonathan Nieder

This is the second draft of the SVN branching language.  If you're
interested in how SVN revisions map to git commits, please read it and
let me know what you think.  You might also be interested in the
reference implementation and nascent collection of tests[1], although
these wouldn't yet withstand a thorough review.

I plan to pause language work and concentrate on SVN export for a
while now, because it will be easier to e.g. write tests once I have
some code to test.

Here's a little background for people new to the list:

I'm interested in ways to map SVN revisions to git commits, but have
found it surprisingly hard to grasp the problem.  Having trouble
grasping a problem is generally a good sign that your solution is too
small, so I split the problem in three:

1. A language to describe SVN branching and merging
2. A program to export such a file from an SVN repository
3. A program to import such a file into a git repository

This has let me concentrate on manageable parts of the problem.  For
example, Sam Vilain recently pointed out[2] so-called "piecemeal
merges" that split a merge across several revisions.  There's no way I
could have figured out a plan to solve that in one go, but between us
we were able to find a representation and a way of importing it into
git.  I still don't know how to detect and export it from Subversion,
but I can worry about that another day.

This is the second draft I've brought to the git mailing list for
discussion.  The list has a long and proud tradition of including
patches in e-mails so people can review code from their e-mail client,
but in this case the patch would be so large that it's better simply
to include the new text.  I've deliberately left the reference
implementation and tests out of this e-mail because they're not ready
for review.

	- Andrew

[1] https://github.com/andrew-sayers/SVN-Branching-Language
[2] http://article.gmane.org/gmane.comp.version-control.git/192418



SVN Branching Language v0.1
===========================
Andrew Sayers <andrew-sbl@pileofstuff.org>

This file specifies a simple language for describing how SVN revisions
relate to branches.  The goals of this language are to be a
communication protocol for programs operating on SVN history and to
provide a lingua franca for developers discussing unusual SVN use
cases.

Overview
--------

The SVN Branching Language (``SBL'') is a line-based UTF-8 format,
where each line specifies one action in the SVN history.  An SBL file
contains two sections: the first is the ``header'' section that
provides information about the file as a whole, the second is the
``body'' section that provides information about things that happened
in specific revisions.

Any line in an SBL file that begins with a hash or semicolon
character, or that contains only whitespace (including empty lines) is
considered to be a comment, and must be ignored.  Clients should use a
leading '#' to create ordinary comments to be read by users, and a
leading ';' for commented-out actions the client wishes to suggest to
a user.  This makes it easier for a user to accept/reject actions by
search-and-replace.
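
As a rough illustration of the comment rule (not part of the
specification), a client might classify lines as follows.  The function
name `classify_line` and the label `suggestion` are hypothetical, and
the sketch assumes that "begins with" means the very first character of
the line, so leading whitespace before a '#' is not tolerated.

```python
def classify_line(line: str) -> str:
    """Classify one SBL line as 'blank', 'comment', 'suggestion' or 'action'.

    All labels except 'action' are comments as far as the language is
    concerned; 'suggestion' is a hypothetical label for ';' lines, which
    carry commented-out actions proposed to the user.
    """
    text = line.rstrip("\r\n")
    if text.strip() == "":
        return "blank"        # empty or whitespace-only lines are comments
    if text.startswith("#"):
        return "comment"      # ordinary comment for human readers
    if text.startswith(";"):
        return "suggestion"   # commented-out action awaiting confirmation
    return "action"
```

A user accepting a suggestion would simply delete the leading '; ' so
that the line is classified as an action on the next read.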

Here is an example file:

    # the file must begin by specifying the revision of the language being used:
    This is a version 0.1 SVN Branching Language file
    Body:
    In r1, create branch "trunk"
    In r10, create branch "branches/1.0" from "trunk" r9
    In r20, create tag "tags/version_1" from "branches/1.0" r19
    In r20, deactivate "tags/version_1"
    # User intervention is required to confirm this action:
    ; In r25, merge "trunk" up to r24 into "branches/1.0"

Terminology
-----------

The following words should be interpreted as specified below:

Directory::
  A string representing a directory in the SVN tree.  A directory can
  have several flows (i.e. it can be created and deleted multiple
  times), some of which may have names and others not.  The exact
  rules for valid directory names are discussed below, but should be
  as close as possible to the rules for a valid SVN directory name.

Name::
  A string representing a branch or tag.  This is the conventional
  name for a branch or tag as described by users, and is not represented
  anywhere in SVN.  For example, a directory "branches/v1.x/v1.0"
  might have the name "v1.0".  Note that although tags and branches
  are functionally identical, clients must treat tag names and branch
  names as being in different namespaces - for example, a branch name
  "v1.0" and a tag name "v1.0" can both exist at the same time.

Exist::
  A directory is said to exist in a given revision if the directory
  would appear in an `svn ls` command for that revision.  Clients can
  fully implement this language without tracking whether directories
  exist - the term is defined here purely to make discussion easier.

Active::
  A directory is said to be "active" if its most recent "create"
  action occurred more recently than its most recent "delete" or
  "deactivate" action.  A branch will usually be active if it exists
  and has a name, but there are special cases - for example, an
  existent directory associated with a tag might be deactivated as a
  precautionary measure in case of accidental commits.  Note that only
  directories are said to be "active" - branches and tags are
  instead "accessible" (under a slightly different set of
  circumstances).

Accessible::
  A branch or tag is said to be "accessible" if it has been created
  more recently than it has been deleted (but not deactivated).  Note
  that only branches and tags are said to be "accessible".
  Directories are instead "active" (under a slightly different set of
  circumstances).

Trunk::
  A directory, branch or tag is said to be a "trunk" if it has no
  parent directory, branch or tag.  Trunk tags are possible but
  exceedingly rare.

Header section
--------------

The header section begins with a version identifier, continues with
any number of private actions, and ends with the header-body boundary
marker.

Any unrecognised action in the header should be treated as a fatal
error.

Version identifier
~~~~~~~~~~~~~~~~~~

The first action in this section (not including comments) must exactly
match the following:

    This is a version 0.1 SVN Branching Language file

Later versions of the language will use a different identifier here.
This might be another number (e.g. ``version 3''), a non-numeric
identifier (e.g. ``experimental version''), a different name
(e.g. ``SBL file''), or anything else.  Clients must treat anything
other than the exact string above as a fatal error.

Private actions
~~~~~~~~~~~~~~~

Private actions begin with an open bracket and end with a close
bracket.  For example:

    (my-great-parser will write debugging information to "debug.log")

These actions are intended for internal use by clients using SBL as a
storage format.  Clients must begin any private action they create
with a client-specific identifier followed by a space
(`my-great-parser` in the above example), and must ignore any private
action that does not begin with their identifier.  Client-specific
identifiers must contain one or more characters and must not include
the space, carriage return or newline characters.  These requirements
are designed to ensure that clients do not use private actions to
communicate with other clients - please send such messages out-of-band
(e.g. through arguments to a command-line program), or propose a
revision to the language so that all clients can communicate using a
standard language.
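
A minimal sketch of how a client might pick out its own private
actions and ignore everyone else's.  The helper name `private_payload`
is hypothetical; the only rules taken from the text above are the
surrounding brackets and the identifier followed by a space.

```python
from typing import Optional


def private_payload(line: str, client_id: str) -> Optional[str]:
    """Return the text of a private action addressed to client_id.

    Returns None for anything else, including private actions that
    belong to other clients (which must be ignored).
    """
    if not (line.startswith("(") and line.endswith(")")):
        return None                       # not a private action at all
    inner = line[1:-1]
    identifier, separator, payload = inner.partition(" ")
    if separator == "" or identifier != client_id:
        return None                       # someone else's action: ignore it
    return payload
```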

Header-body boundary marker
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The last action in this section must exactly match the following:

    Body:

This serves only to indicate that the header is finishing, and the
body beginning.  The remainder of the file must be treated as the body
section.  Note that this action significantly reduces the complexity
of writing an SBL parser, because clients can use this to switch
between ``header'' and ``body'' parsing modes.
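
To illustrate the two parsing modes, here is a sketch of a function
that checks the version identifier, collects private actions and splits
off the body at the boundary marker.  The names `SBLError`,
`is_comment` and `split_sections` are hypothetical and are not taken
from the reference implementation.

```python
from typing import Iterable, List, Tuple

VERSION_LINE = "This is a version 0.1 SVN Branching Language file"
BOUNDARY_LINE = "Body:"


class SBLError(Exception):
    """Hypothetical exception type used here for fatal errors."""


def is_comment(line: str) -> bool:
    # Whitespace-only lines and lines starting with '#' or ';' are comments.
    return line.strip() == "" or line[0] in "#;"


def split_sections(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (private_actions, body_actions) with comments removed."""
    stripped = (line.rstrip("\r\n") for line in lines)
    actions = [line for line in stripped if not is_comment(line)]
    if not actions or actions[0] != VERSION_LINE:
        raise SBLError("missing or unsupported version identifier")
    private: List[str] = []
    for index, action in enumerate(actions[1:], start=1):
        if action == BOUNDARY_LINE:
            # Everything after the marker belongs to the body section.
            return private, actions[index + 1:]
        if action.startswith("(") and action.endswith(")"):
            private.append(action)
        else:
            raise SBLError(f"unrecognised header action: {action!r}")
    raise SBLError("header-body boundary marker not found")
```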

Body section
------------

The body section contains zero or more actions as described below.
Any unrecognised action in the body should be treated as a fatal
error.

The following components are widely used in body actions:

Revision identifiers::
  These strings begin with the letter `r`, followed by a digit in the
  range 1-9, then zero or more digits in the range 0-9.  So `r1`,
  `r10` and `r999` are valid revision identifiers, but `1`, `r01` and
  `revision 999` are not.  Revision identifiers are indicated with
  `<revision>` in the definition of an action.

String identifiers::

  These strings begin with a double quote, then contain zero or more
  valid characters, then another double quote.  Valid characters
  include a backslash followed by `\`, `r`, `n`, or `"`, or any
  character other than backslash, carriage return, newline, double
  quote, or the null character (U+0000).  So `"foo"`, `"foo\""` and
  `"foo\\"` are valid identifiers, but `"foo\"`, `"foo""` and
  `"foo\t"` are not.  Clients must unescape string identifiers by
  removing the leading/trailing quotes and converting `\"`, `\\`, `\r`
  and `\n` to double quote, backslash, carriage return and newline
  respectively.

Directory identifiers::
  These string identifiers indicate a directory.  As well as the
  requirements of a string identifier, valid directory identifiers
  must be valid directories when unescaped.  A valid directory must be
  a sequence of zero or more directory entry names, separated by slash
  characters `/`, and possibly ending with slash characters.  Each
  directory entry name can contain any Unicode character except the
  null character and the slash character `/`. No directory entry may
  be named `.`, `..`, or the empty string.  When unescaping a
  directory, clients must first do the unescaping for a string
  identifier, then convert the identifier to Unicode canonical
  decomposition (NFD) form, then collapse every series of '/'
  characters to a single character (e.g. `foo//bar` should become
  `foo/bar`), and finally remove any trailing `/` character.
  Directory identifiers are indicated with `<directory>` in the
  definition of an action.
  Note that although this definition is based on SVN's own
  documentation, SBL clients must follow the rules above instead of
  the SVN rules in the event of any contradiction between the two.
  SVN's documentation is available from their source repository:
  http://subversion.apache.org/docs/api/latest/group__svn__fs__directories.html#details

Name identifiers::
  These string identifiers indicate a branch or tag name.  As well as
  the requirements for a string identifier, valid name identifiers
  must be valid names when unescaped (see the "terminology" section
  for details).  In addition, the empty string is not a valid name
  identifier.
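
The identifier rules above can be sketched with a couple of regular
expressions.  This is an illustration under stated assumptions rather
than the reference implementation: the function names are hypothetical,
and rejecting a leading slash in a directory is a guess, because the
text does not address that case explicitly.

```python
import re
import unicodedata

REVISION_RE = re.compile(r"r[1-9][0-9]*")
# One valid character is either an escape (\\, \r, \n, \") or anything
# other than backslash, CR, LF, double quote and NUL.
STRING_RE = re.compile(r'"(?:\\[\\rn"]|[^\\\r\n"\x00])*"')
ESCAPES = {"\\": "\\", "r": "\r", "n": "\n", '"': '"'}


def parse_revision(token: str) -> int:
    if not REVISION_RE.fullmatch(token):
        raise ValueError(f"invalid revision identifier: {token!r}")
    return int(token[1:])


def unescape_string(token: str) -> str:
    if not STRING_RE.fullmatch(token):
        raise ValueError(f"invalid string identifier: {token!r}")
    # Drop the surrounding quotes, then replace each two-character escape.
    return re.sub(r"\\(.)", lambda m: ESCAPES[m.group(1)], token[1:-1])


def unescape_directory(token: str) -> str:
    # Order follows the text: unescape, NFD, collapse slashes, trim the end.
    path = unicodedata.normalize("NFD", unescape_string(token))
    path = re.sub(r"/+", "/", path).rstrip("/")
    if path:  # the empty string is the root directory and is valid
        for entry in path.split("/"):
            # An empty entry can only come from a leading slash here
            # (assumption: treated as invalid).
            if entry in ("", ".", ".."):
                raise ValueError(f"invalid directory entry in {token!r}")
    return path
```

A name identifier could reuse `unescape_string` and additionally reject
the empty string.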

These are the valid actions in the body section:

    In <revision>, create branch <directory>
    In <revision>, create branch <directory> as <name>
    In <revision>, create branch <directory> from <directory> <revision>
    In <revision>, create branch <directory> as <name> from <directory> <revision>

    In <revision>, create tag <directory>
    In <revision>, create tag <directory> as <name>
    In <revision>, create tag <directory> from <directory> <revision>
    In <revision>, create tag <directory> as <name> from <directory> <revision>

    In <revision>, deactivate <directory>
    In <revision>, delete <directory>
    In <revision>, delete branch <name>
    In <revision>, delete tag <name>

    In <revision>, merge <directory> up to <revision> into <directory>
    In <revision>, cherry-pick <directory> <revision> into <directory>
    In <revision>, cherry-pick <directory> <revision> to <revision> into <directory>
    In <revision>, revert <directory> <revision> from <directory>
    In <revision>, revert <directory> <revision> to <revision> from <directory>
    
    In <revision>, ignore <directory>
    In <revision>, amend <directory>, keeping the old log message
    In <revision>, amend <directory>, keeping the new log message
    In <revision>, amend <directory>, keeping both log messages

All actions begin with `In <revision>,`.  This identifies the revision
the action applies to.  Each action must refer to a revision greater
than or equal to that of the previous action, except the first action,
which must refer to a revision greater than or equal to 1.  Clients
should treat revision numbers that are too low as fatal errors.
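
The ordering rule is cheap to check while reading the body.  A sketch,
assuming comments have already been stripped; the function name
`action_revisions` is hypothetical:

```python
import re
from typing import Iterable, List

ACTION_REVISION_RE = re.compile(r"In r([1-9][0-9]*), ")


def action_revisions(body_actions: Iterable[str]) -> List[int]:
    """Return the revision of each action, checking they never decrease."""
    previous = 1                      # the first action must be >= r1
    revisions: List[int] = []
    for action in body_actions:
        match = ACTION_REVISION_RE.match(action)
        if match is None:
            raise ValueError(f"action does not begin with 'In <revision>,': {action!r}")
        revision = int(match.group(1))
        if revision < previous:
            raise ValueError(f"r{revision} is lower than r{previous}")
        revisions.append(revision)
        previous = revision
    return revisions
```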

Create actions
~~~~~~~~~~~~~~

The `create branch` and `create tag` actions identify a directory
being made active and a branch/tag being made accessible in the
specified revision.  The first string identifier indicates the SVN
directory name associated with the branch/tag.  The `as <name>`
identifier indicates the name the user intended for the branch/tag.
If the `as <name>` identifier is not specified, clients must use the
directory name as the identifier, unless the directory name is the
empty string (root directory), in which case they should treat it as a
fatal error (because the empty string is a valid directory but not a
valid name).  The `from <directory> <revision>` identifiers indicate
the directory and revision number to be used as the parent for this
branch/tag (if unspecified, clients should assume the user intended
this to be a trunk).  Here are some examples:

    # Create a trunk branch named "trunk" from directory "trunk":
    In r1, create branch "trunk"

    # Create a branch named "foo" from directory "branches/foo",
    # whose parent is the directory "trunk" as it was in revision 5:
    In r10, create branch "branches/foo" as "foo" from "trunk" r5

    # Create a tag named "1.0" from directory "tags/1.0",
    # whose parent is the directory "branches/foo" as it was in revision 15:
    In r20, create tag "tags/1.0" as "1.0" from "branches/foo" r15

Clients should treat it as a fatal error if the directory was already
active in the current revision.

Clients should treat it as a fatal error if the name was already
accessible in the current revision (remember that branches and tags
must be treated as being in different namespaces).

Clients should treat it as a fatal error if the `from` revision is
greater than the current revision, or if the `from` directory was not
active in the specified revision.

If the `from` directory was active but not changed in the specified
revision, clients should behave as if the user specified the last
revision in which the directory was changed prior to the specified
revision.  If the `from` revision is equal to the current revision,
and the `from` directory was changed in the current revision, then
clients should warn the user, but should not treat it as a fatal
error.  These requirements make it significantly easier for users to
manually edit SBL files.
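
As an illustration of the create syntax and the default-name rule, the
following sketch parses a single create action.  The names
`CreateAction` and `parse_create` are hypothetical, directory
normalisation is left out for brevity, and checks that need repository
state (whether a directory is already active, for instance) are not
attempted.

```python
import re
from typing import NamedTuple, Optional, Tuple

STRING = r'"(?:\\[\\rn"]|[^\\\r\n"\x00])*"'
REVISION = r"r([1-9][0-9]*)"
CREATE_RE = re.compile(
    rf"In {REVISION}, create (branch|tag) ({STRING})"
    rf"(?: as ({STRING}))?"
    rf"(?: from ({STRING}) {REVISION})?"
)
ESCAPES = {"\\": "\\", "r": "\r", "n": "\n", '"': '"'}


class CreateAction(NamedTuple):
    revision: int
    kind: str                            # "branch" or "tag"
    directory: str
    name: str
    parent: Optional[Tuple[str, int]]    # None means a trunk


def _unescape(token: str) -> str:
    return re.sub(r"\\(.)", lambda m: ESCAPES[m.group(1)], token[1:-1])


def parse_create(action: str) -> CreateAction:
    match = CREATE_RE.fullmatch(action)
    if match is None:
        raise ValueError(f"not a create action: {action!r}")
    revision = int(match.group(1))
    directory = _unescape(match.group(3))
    # Without `as <name>`, the directory name doubles as the name.
    name = _unescape(match.group(4)) if match.group(4) is not None else directory
    if name == "":
        raise ValueError("the empty string is not a valid name")
    parent = None
    if match.group(5) is not None:
        parent = (_unescape(match.group(5)), int(match.group(6)))
        if parent[1] > revision:
            raise ValueError("`from` revision is greater than the current revision")
    return CreateAction(revision, match.group(2), directory, name, parent)
```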

Delete actions
~~~~~~~~~~~~~~

The `deactivate` and `delete` actions identify a name that should
become inaccessible, and/or a directory that should become inactive.
The string identifier indicates the directory or name to be modified.

The `deactivate` action indicates that a directory should become
inactive, but that the associated name should remain accessible.  This
is commonly used for a branch or tag that is still of historical
interest, but should no longer be updated when changes are made.  For
example, a tag might be deactivated after it is created, to indicate
that the tag should be considered immutable even though changes were
accidentally committed to the directory later on.

The `delete` action indicates that a directory should become inactive,
and the associated name should become inaccessible.  This is commonly
used when a branch or tag is no longer of any interest, and can be
ignored completely.  For example, a branch might be deleted if it had
been fully merged into another branch before the directory was
removed.

The `delete branch` and `delete tag` actions indicate that a name
should become inaccessible, and the associated directory should become
inactive if it is still active.  For example, the branch associated
with a deactivated directory might be deleted when the user wants to
create a new branch with the same name.

Clients should only generate `delete branch` and `delete tag` actions
if it would be unsafe to generate a `delete` action because the
directory has already been deactivated, but should expect users
to manually edit files and add any type of action.

Clients should treat it as a fatal error if an action that deactivates
a directory is applied to a directory that is not currently active.
This includes the `delete` action, which behaves in ways users would
not expect if the directory is renamed or replaced in SVN before the
action occurs.

Actions that make a name inaccessible should treat it as a fatal error
if the name is not currently accessible.
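
The interplay between "active" directories and "accessible" names can
be sketched as a small state tracker.  The class and method names below
are hypothetical, and the sketch only models the rules in the create
and delete sections, nothing about content or merges.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]   # (kind, name), so branches and tags never collide


class SBLError(Exception):
    """Hypothetical exception type used here for fatal errors."""


class State:
    def __init__(self) -> None:
        self.active: Dict[str, Key] = {}       # directory -> (kind, name)
        self.accessible: Dict[Key, str] = {}   # (kind, name) -> directory

    def create(self, kind: str, directory: str, name: str) -> None:
        if directory in self.active:
            raise SBLError(f"directory {directory!r} is already active")
        if (kind, name) in self.accessible:
            raise SBLError(f"{kind} {name!r} is already accessible")
        self.active[directory] = (kind, name)
        self.accessible[(kind, name)] = directory

    def deactivate(self, directory: str) -> None:
        if directory not in self.active:
            raise SBLError(f"directory {directory!r} is not active")
        del self.active[directory]             # the name stays accessible

    def delete(self, directory: str) -> None:
        if directory not in self.active:
            raise SBLError(f"directory {directory!r} is not active")
        del self.accessible[self.active.pop(directory)]

    def delete_name(self, kind: str, name: str) -> None:
        if (kind, name) not in self.accessible:
            raise SBLError(f"{kind} {name!r} is not accessible")
        directory = self.accessible.pop((kind, name))
        # Only deactivate the directory if it still belongs to this name.
        if self.active.get(directory) == (kind, name):
            del self.active[directory]
```

Note how `delete` fails on a deactivated directory, which is exactly
the situation where `delete branch` or `delete tag` is needed instead.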

Merge actions
~~~~~~~~~~~~~

The `merge`, `cherry-pick` and `revert` actions identify changes in
the relationship between two directories.  The first string identifier
indicates the directory name for the action's source.  The last string
identifier indicates the directory name for the action's destination.
The revision identifier(s) indicates the revision(s) for the action's
source.

The `merge` action indicates that all revisions up to the specified
point have been applied from the source to the destination.  The
`cherry-pick` and `revert` actions identify an inclusive set of
revisions that have been applied from the source to the destination.
The `merge` and `cherry-pick` actions indicate that revisions that
previously had not been applied now have been applied.  The `revert`
action indicates that revisions which had previously been applied are
no longer applied.

If two revisions were specified, clients must treat it as a fatal
error if the first revision is greater than the second.

Clients must treat it as a fatal error if the `from` directory was not
active in the specified revision.  If two revisions are specified,
clients must treat it as a fatal error if the `from` directory was
inactive in either revision, or was deactivated after the first
revision and reactivated before the last revision.

For the `merge` action, if the `from` directory was active but not
changed in the specified revision, clients must behave as if the user
specified the highest revision in which the directory was changed
before the revision they actually specified.  If the `from` revision
is equal to the current revision, clients should warn the user if the
from directory was changed in that revision.

For the two-revision `revert` and `cherry-pick` actions, clients must
treat the second revision as in the previous paragraph.  If the from
directory was not changed in the first revision, clients must behave
as if the user specified the lowest revision in which the directory
was changed after the specified revision.

For all `revert` and `cherry-pick` actions, clients must treat it as a
fatal error if no revisions in the specified range changed the
directory.

Note: the above requirements for altering revision numbers make it
significantly easier for users to manually edit an SBL file.
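
The revision-adjustment rules amount to snapping a revision down (for
`merge` and the end of a range) or up (for the start of a range) to the
nearest revision in which the source directory changed.  A sketch,
assuming the client already has a sorted list of those revisions; the
function names and the `changed` list are hypothetical:

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


def snap_down(changed: List[int], revision: int) -> int:
    """Highest revision <= `revision` in which the directory changed."""
    index = bisect_right(changed, revision)
    if index == 0:
        raise ValueError(f"directory was not changed at or before r{revision}")
    return changed[index - 1]


def snap_up(changed: List[int], revision: int) -> int:
    """Lowest revision >= `revision` in which the directory changed."""
    index = bisect_left(changed, revision)
    if index == len(changed):
        raise ValueError(f"directory was not changed at or after r{revision}")
    return changed[index]


def snap_range(changed: List[int], first: int, last: int) -> Tuple[int, int]:
    """Adjust both ends of a two-revision cherry-pick or revert."""
    if first > last:
        raise ValueError("first revision is greater than the second")
    low, high = snap_up(changed, first), snap_down(changed, last)
    if low > high:
        raise ValueError("no revision in the range changed the directory")
    return low, high
```

For example, with changes in r3, r7 and r12, a user who writes
`r4 to r11` by hand ends up with the single revision r7.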

If a `merge` action specifies a revision less than or equal to any
earlier merge action that has not been reverted, clients should treat
it as a fatal error.  Clients should not treat it as a fatal error if
the merge includes revisions that have previously been cherry-picked.

If a `cherry-pick` action includes the first revision in which the
directory was changed after any earlier merge action, or if the
destination directory was originally created from the source directory
and the `cherry-pick` action includes the next revision in which the
directory was changed, clients should warn the user that they probably
meant to merge.

If a `revert` action specifies a revision that has not been
cherry-picked, or is greater than any earlier merge action that has
not been reverted, clients should treat it as a fatal error.

Clients may warn about any merge actions they feel are unusual, but
should not treat anything as a fatal error unless specified above.

Note: there is no `unmerge` action.  See `t/basic/unmerge.sh` for
how unmerging is achieved.

Edit actions
~~~~~~~~~~~~

The `ignore` and `amend` actions identify a directory and revision
which should not create a new revision in the history.  For example,
an accidental `svn commit` followed immediately by an `svn commit`
reverting it might simply be ignored.  The string identifier indicates
the name of the directory to be edited.

The `ignore` action indicates that the client must act as if no
changes were made to the directory in the specified revision, whether
or not changes were actually made.  Note that this only includes
changes made in the SVN repository itself, not those indicated by
actions in the SBL file.

The `amend` actions indicate that the state of the directory in the
current revision should overwrite the most recent revision for that
directory.  When overwriting the most recent revision, clients must
retain the revision log from the previous revision if the action
specifies `keeping the old log message`, replace the revision log
entirely with the log message for the current revision if the action
specifies `keeping the new log message`, or concatenate the new log
message to the end of the old one if the action specifies `keeping
both log messages`.  Clients may reformat log messages when keeping
both, but are reminded of the need for messages to look sensible when
long chains of amendments are created.
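
The three log-handling rules might look like this in a client.  This is
a sketch only: the `keep` values "old", "new" and "both" are hypothetical
shorthand for the three phrases above, and the blank-line separator is
one possible reformatting choice rather than anything the language
requires.

```python
def amended_log(old_log: str, new_log: str, keep: str) -> str:
    """Return the log message for a revision overwritten by `amend`."""
    if keep == "old":
        return old_log
    if keep == "new":
        return new_log
    if keep == "both":
        # Append the new message after the old one, separated by a single
        # blank line so that long chains of amendments stay readable.
        return old_log.rstrip("\n") + "\n\n" + new_log
    raise ValueError("unknown log-keeping mode: %r" % keep)
```

Because the result of keeping both messages has the same shape as an
ordinary log message, it can be passed back in as `old_log` when a
further amendment arrives.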

Clients may warn, but should not treat it as a fatal error, if an edit
action is specified for a directory that was not changed in the
current revision.

Clients should treat it as a fatal error if an edit action is applied
to a directory in the same revision as the directory becomes active.

Limitations
-----------

SBL only allows a directory to be associated with a maximum of one
branch or one tag at once.  There is no clear use case for allowing a
user to e.g. associate a tag with a directory that is already a
branch, and disallowing it enables better detection of user errors.
Future versions of this language might allow multiple branches/tags
per directory if a use case is found.

Although not explicitly forbidden, clients are not required to support
recursive branches.  For example, if ``trunk'' is a branch, clients
may assume that ``trunk/foo'' is neither a branch nor a tag.

See `t/advanced/subproject_branch.sh` for a common edge case for
parsers that disallow recursive branches.
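
A client that chooses not to support recursive branches could detect
them with a check along these lines.  This is a sketch: the function
name `enclosing_branch` is hypothetical, and it assumes the client holds
the set of directories currently associated with a branch or tag.

```python
from typing import Optional, Set


def enclosing_branch(path: str, branches: Set[str]) -> Optional[str]:
    """Return the branch or tag directory that strictly contains `path`.

    Returns None if no proper ancestor of `path` is in `branches`.
    """
    parts = path.strip("/").split("/")
    # Test every proper ancestor, shortest first.  Comparing whole path
    # components avoids treating "trunkfoo" as being inside "trunk".
    for n in range(1, len(parts)):
        prefix = "/".join(parts[:n])
        if prefix in branches:
            return prefix
    return None
```

With ``trunk'' registered as a branch, `enclosing_branch("trunk/foo",
{"trunk"})` returns "trunk", so the client can reject or warn about an
attempt to make ``trunk/foo'' a branch.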

License
-------

SVN Branching Language by Andrew Sayers <andrew-sbl@pileofstuff.org>
is licensed under a Creative Commons Attribution 3.0 Unported License.

The full license is available here: http://creativecommons.org/licenses/by/3.0/legalcode
A human-readable summary is available here: http://creativecommons.org/licenses/by/3.0/

The following explanation must have no bearing on the meaning of the
above license:

My primary goal in drafting this language was to provide a single
format that everyone could use to describe the behaviour of SVN
repositories.  I felt this goal was best served by allowing commercial
use of the language, and by allowing other dialects to be formed while
relying on the network effect of a common language to discourage them.


* Re: SVN Branch Description Format
  2012-03-11 10:59 SVN Branch Description Format Andrew Sayers
                   ` (2 preceding siblings ...)
  2012-03-23  0:08 ` SVN Branching Language " Andrew Sayers
@ 2012-03-30  4:06 ` Ramkumar Ramachandra
  2012-03-31  1:27   ` Andrew Sayers
  3 siblings, 1 reply; 10+ messages in thread
From: Ramkumar Ramachandra @ 2012-03-30  4:06 UTC (permalink / raw)
  To: Andrew Sayers
  Cc: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov, David Barr,
	Jonathan Nieder

Hi,

Andrew Sayers wrote:
> SVN Branch Description Format v0.1

I found this pretty interesting.  Doesn't it duplicate some of the
functionality of reposurgeon [1] though?

[1]: http://esr.ibiblio.org/?p=4071

    Ram


* Re: SVN Branch Description Format
  2012-03-30  4:06 ` SVN Branch Description Format Ramkumar Ramachandra
@ 2012-03-31  1:27   ` Andrew Sayers
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Sayers @ 2012-03-31  1:27 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: Git Mailing List, Sam Vilain, Stephen Bash, Nathan Gray,
	Jeff King, Sverre Rabbelier, Dmitry Ivankov, David Barr,
	Jonathan Nieder

On 30/03/12 05:06, Ramkumar Ramachandra wrote:
> Hi,
> 
> Andrew Sayers wrote:
>> SVN Branch Description Format v0.1
> 
> I found this pretty interesting.  Doesn't it duplicate some of the
> functionality of reposurgeon [1] though?
> 
> [1]: http://esr.ibiblio.org/?p=4071

Yes, I've been procrastinating all week instead of reading up on
reposurgeon and contacting ESR about possible collaboration.

I think you need something a bit more expressive than reposurgeon's
format to do SVN<->Git conversion well, and I think you need something a
bit more accessible in order to document SVN edge cases.  For example, I
don't see how reposurgeon could represent all the madness around SVN
cherry-picks that become merges when you manually add information from
revision logs, then become cherry-picks again when you find a revert
coming in from another branch.  Having said that, a (lossy) conversion
between SBL and reposurgeon format would probably be useful and not that
hard.

The link above put it very well that most people leave an embarrassed
“to be done” comment and disappear when they realise how much of a
nightmare the mapping is.  What it doesn't mention is that everyone
experiences a slightly different part of the nightmare, and that we can
only really tackle the problem by getting everyone's freaky edge cases
written up in one language in one place.  The test suite[1] isn't that
impressive right now, but in the long-term I'm really keen to get
implementers to pool their knowledge so we can all benefit.  SBL is
designed to let people open the relevant test without reading the spec
and say "oh right I understand what a piecemeal merge is now.  I'll go
implement that in my project".

I'm currently working on code to read an SVN dump and write to SBL.
This will definitely overlap with reposurgeon's SVN export
functionality, but without seeing the final code I can't say how much.
That's fine though - as I say, the only way to get a good solution is
for multiple implementers to investigate the problem and share the edge
cases they find.

	- Andrew

[1] https://github.com/andrew-sayers/SVN-Branching-Language/tree/master/t

