<?xml version="1.0"?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:dcterms="http://purl.org/dc/terms/" >
<channel>
<title>pages tagged git</title>
<link>http://sam.vilain.net//tags/git.html</link>
<description>samv.blog</description>
<item>

	<title>past synthesis</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/git/gittorrent/past_synthesis.html</guid>

	<link>http://sam.vilain.net//comp/git/gittorrent/past_synthesis.html</link>


	<category>git</category>

	<category>gittorrent</category>


	<pubDate>Sun, 06 Mar 2011 19:18:00 +0000</pubDate>
	<dcterms:modified>2011-03-13T16:00:31Z</dcterms:modified>

	<description>&lt;h1&gt;GitTorrent: a synthesis of past efforts&lt;/h1&gt;

&lt;p&gt;If you read &lt;a href=&quot;http://git.661346.n2.nabble.com/Re-Resumable-clone-Gittorrent-again-stable-packs-tp5894379p5908685.html&quot;&gt;this list post&lt;/a&gt; (&lt;a href=&quot;http://thread.gmane.org/gmane.comp.version-control.git/164569/focus=164897&quot;&gt;gmane archive&lt;/a&gt;), then you will probably not see much new here.  I include it as a backdrop for the subsequent articles.&lt;/p&gt;

&lt;h2&gt;GitTorrent concept: torrent the pack files&lt;/h2&gt;

&lt;p&gt;The idea of applying the straight BitTorrent protocol to the pack
files was the starting point for GitTorrent.  However, this turns out
not to be useful, as the pack files are not deterministic.  It is
only under a very strict set of precarious circumstances that any two
nodes computing a pack for a given set of git objects will produce the
same binary content.  Fluke, if you will.&lt;/p&gt;

&lt;p&gt;Therefore, it seemed to add little to the idea of using unmodified
BitTorrent, perhaps distributing a pack file or a git bundle; for
instance, no peer could participate in the swarm - even with a
complete clone of the repository - without downloading the exact pack
file that the repository was serving.&lt;/p&gt;

&lt;p&gt;So, over the period of several months, Jonas and I revised the RFC
principally to express it in terms of stable object manifests, with
the goal that nodes could participate with .  You can get a flavour
for the exchange by glancing at &lt;a href=&quot;https://github.com/samv/gittorrent/commits/master?page=3&quot;&gt;the RFC source
history&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://utsl.gen.nz/gittorrent/rfc.html&quot;&gt;resultant RFC&lt;/a&gt; invents
terms such as &quot;Commit Reel&quot;, defined by a sorting algorithm for
objects, similar to the order returned by:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;git rev-list --date-order --objects
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The above ordering is for all intents and purposes stable, with only a
very minor edge case where no strict order exists.&lt;/p&gt;
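
&lt;p&gt;As a rough illustration of why a stable ordering matters, here is a minimal Python sketch (not the RFC&#39;s algorithm) that takes a listing in the style of &lt;code&gt;git rev-list --date-order --objects&lt;/code&gt; and cuts it into fixed-size slices that any two nodes holding the same objects would number identically.  The function names, the sample object IDs and the slice size are all made up for the example; the real Commit Reel is defined by the RFC, not by this code.&lt;/p&gt;

```python
def parse_object_listing(listing):
    """Turn rev-list style output into an ordered list of (object_id, path).

    Each line is an object ID, optionally followed by a space and a path;
    commits are listed without a path.
    """
    entries = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        object_id, _, path = line.partition(" ")
        entries.append((object_id, path))
    return entries


def slice_manifest(entries, slice_size):
    """Cut the ordered manifest into consecutive slices of slice_size."""
    return [
        entries[start:start + slice_size]
        for start in range(0, len(entries), slice_size)
    ]


# Hypothetical listing; the IDs are shortened placeholders, not real objects.
SAMPLE = """\
c0ffee01
7ee00001
b10b0001 README
c0ffee02
b10b0002 Makefile
"""

manifest = parse_object_listing(SAMPLE)
slices = slice_manifest(manifest, 2)
```

&lt;p&gt;In principle, if both ends derive the same slices from the same ordering, a peer can ask for a numbered slice without either side having to agree on a particular pack file first.&lt;/p&gt;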

&lt;h2&gt;GitTorrent Summer of Code project&lt;/h2&gt;

&lt;p&gt;There is &lt;a href=&quot;http://github.com/samv/VCS-Git-Torrent&quot;&gt;prototype code&lt;/a&gt; from
a 2008 Google Summer of Code project.  While this project was not
considered successful, some key concepts can be demonstrated with it
and so I will make that the starting point of the next post in this
series, and use it to illustrate the design of the protocol.&lt;/p&gt;

&lt;p&gt;One of the practical discoveries was that the code base could not
quickly generate the object indexes required for efficiently answering
GitTorrent messages.&lt;/p&gt;

&lt;h2&gt;Related project: git rev-cache&lt;/h2&gt;

&lt;p&gt;This project was aimed at being a generic cache for git revision tree
walking.  The idea is that while git&#39;s &lt;a href=&quot;http://en.wikipedia.org/wiki/Graph_coloring&quot;&gt;graph
colouring&lt;/a&gt; algorithm is
fast enough for most things that are important to a user, such as
good interactive performance, it is not sufficient for a gittorrent
server, or even for the &#39;initial git clone&#39; case:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Computing the results involves a huge amount of &lt;em&gt;pointer chasing&lt;/em&gt; that requires that the cache be &lt;em&gt;hot&lt;/em&gt;.  If the cache is not hot, such as on a busy server, it can take &lt;em&gt;minutes&lt;/em&gt; just to calculate the amount of work to do.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you want to take a large number of objects and retrieve a particular sub-section of them, then you still have to do all the above work.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;So, the revision cache helps by keeping just the important data in a binary, sequential file: all of the important information necessary for graph traversal can be retrieved quickly and computed quickly, too.  I will dedicate at least one post to this project, where I will try to merge it with the latest git and show it in action.&lt;/p&gt;

&lt;h2&gt;GitTorrent distilled: mirror-sync&lt;/h2&gt;

&lt;p&gt;One of the challenges with GitTorrent was the amount of infrastructure
that was required just to get to the point where the core algorithms
could be designed.  By using Perl, there were already off-the-shelf
packages available for things like Bencoding, etc - but it was still
quite a drag.&lt;/p&gt;

&lt;p&gt;After some reflection on this, and from having read the BitTorrent
protocol, I decided that the BitTorrent protocol itself is all cruft
and that trying to cut it down to be useful was a waste of time.&lt;/p&gt;

&lt;p&gt;The idea of &quot;automatic mirroring&quot; came from this.  With Automatic
Mirroring, the two main functions of P2P operation - peer discovery
and partial transfer - are broken into discrete features.&lt;/p&gt;

&lt;p&gt;I presented this idea at &lt;a href=&quot;https://git.wiki.kernel.org/index.php/GitTogether&quot;&gt;GitTogether&lt;/a&gt; 2009, and produced &lt;a href=&quot;http://thread.gmane.org/gmane.comp.version-control.git/133626/focus=133628&quot;&gt;a patch series&lt;/a&gt; called &quot;client-side mirroring&quot; that was intended as a first step towards this goal.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://code.google.com/p/gittorrent/wiki/MirrorSync&quot;&gt;design of
Mirror-Sync&lt;/a&gt; is
simple enough to be expressed on a single page, making it a vast
improvement over GitTorrent already.  Additionally, it would fit
within the existing git protocol, allowing existing git servers to
smoothly get the benefits from peer to peer technology.&lt;/p&gt;

&lt;p&gt;If you want to follow this series, you can subscribe to &lt;a href=&quot;http://sam.vilain.net/tags/gittorrent.html&quot;&gt;the
gittorrent tag&lt;/a&gt;, &lt;a href=&quot;http://sam.vilain.net/comp/git.html&quot;&gt;my git
section&lt;/a&gt;, &lt;a href=&quot;http://sam.vilain.net/comp.html&quot;&gt;my comp section&lt;/a&gt; or even &lt;a href=&quot;http://sam.vilain.net/blog.html&quot;&gt;my
entire blog&lt;/a&gt;.&lt;/p&gt;
</description>


	<comments>/comp/git/gittorrent/past_synthesis.html#comments</comments>

</item>
<item>

	<title>svn in review</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/git/svn_in_review.html</guid>

	<link>http://sam.vilain.net//comp/git/svn_in_review.html</link>


	<category>git</category>

	<category>svn</category>


	<pubDate>Sun, 06 Apr 2008 20:00:00 +0100</pubDate>
	<dcterms:modified>2011-02-26T06:41:11Z</dcterms:modified>

	<description>&lt;h1&gt;Subversion review&lt;/h1&gt;

&lt;p&gt;The design roots of Subversion can be traced back to the first very simplistic attempts at version control, such as SCCS and RCS.  The design of it has steamrolled on from the 70&#39;s with little consideration of stable internet development methods practiced since at least the mid-eighties.&lt;/p&gt;

&lt;p&gt;The claim is made that Subversion &quot;just fixes CVS&quot;.  And while Subversion is generally more robust and versatile than CVS, some still see it as a step backwards.  Unlike CVS, SVN is hard to fix when it goes wrong - there are no user-serviceable parts inside.  Branches and even tags are denied first class recognition by the system, no doubt borrowing some design from Perforce but missing the important bit that made it work (p4&#39;s integration - only now being added with &quot;merge tracking&quot;).  CVS fixed?  Hardly - CVS re-engineered as a cripple.  (For a true &quot;drop-in&quot; replacement for CVS that fixes the most important bugs of CVS and doesn&#39;t remove features, try git-cvsserver)&lt;/p&gt;

&lt;p&gt;Don&#39;t buy the &quot;svn 1.5 will fix merging&quot; snake-oil; the new design is still vastly deficient compared with the real A-class tools out there today, such as Git, Bazaar-NG and Mercurial.  It might be almost as good as Perforce, hooray you&#39;ve caught up to 10 years ago.  That&#39;s if we ever see a release - since last November, the Subversion team have managed 6 minor releases, compared to git&#39;s 1 major release, 3 minor releases, 30 stable releases and 26 stable release candidates.  There really is no comparison.&lt;/p&gt;

&lt;p&gt;As for the speed, after using Git or Mercurial for a while, you go back to SVN and you seriously start to think it&#39;s broken or hung - then you realise no, it&#39;s just slow.  Especially if you are trying to treat your code as a revision data warehouse, for techniques such as code annotation or bisection.&lt;/p&gt;

&lt;p&gt;As far as &quot;using HTTP infrastructure&quot; - this is an oversold benefit - note that Subversion is actually using HTTP+WebDAV as a horrific delivery mechanism for its XML-RPC messages.  There&#39;s nothing standard about it at all - and that&#39;s ignoring the fact that WebDAV required us all to upgrade our webservers.  Some users were forced into an upgrade treadmill to install the specific, alpha version of Apache that was required.&lt;/p&gt;

&lt;p&gt;By my own observation, virtually every proponent of Subversion left either has a significant stake in it, or has simply never tried any other system.  They are in another world - a world where removing the ability to do sane branching, merging and tagging was construed as a feature.  The net effect is that the open source community is now left with a legacy of useless history for the 5 years or so that the SVN fad has taken the world by storm.  This legacy is not caused by the difficulty in conversion - not at all - but more from the dreadful development practices its idiotic design promotes.  The buzz word of &quot;commit bit&quot; disguises a widespread practice of skimping on code review.  Sure, it might be possible to figure out what the individual changes are in that repository, but who can dig them out from the mess of commits?  And with sufficiently few eyeballs to review changes, all code bases are buggy.&lt;/p&gt;

&lt;p&gt;And buggy it certainly is.  Virtually every project I encountered that tried to use its API - assuming they could figure out its crazy system of batons and allocation pools and callbacks - was mired in random segfaults and difficult-to-track-down core bugs.&lt;/p&gt;

&lt;p&gt;Subversion has already become a modern relic; it&#39;s a zombie project unable to make stable releases or effectively manage its spaghetti codebase.  Abandon ship now.&lt;/p&gt;

&lt;p&gt;(NOTE: not that I don&#39;t have some good things to say about it, see for instance &lt;a href=&quot;http://use.perl.org/~mugwumpjism/journal/30574&quot;&gt;use perl article on subversion&lt;/a&gt;, and also &lt;a href=&quot;http://utsl.gen.nz/talks/git-svn/intro.html#sux&quot;&gt;this section in an article I wrote&lt;/a&gt; )&lt;/p&gt;
</description>


	<comments>/comp/git/svn_in_review.html#comments</comments>

</item>
<item>

	<title>free providers</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/git/free_providers.html</guid>

	<link>http://sam.vilain.net//comp/git/free_providers.html</link>


	<category>git</category>


	<pubDate>Sun, 30 Mar 2008 14:00:00 +0100</pubDate>
	<dcterms:modified>2011-02-26T06:41:11Z</dcterms:modified>

	<description>&lt;h1&gt;Public Access Git Repositories&lt;/h1&gt;

&lt;p&gt;It seems that a lot of sites have cropped up that offer free Git hosting:

&lt;ul&gt;
&lt;li&gt;First there was &lt;a href=&quot;http://repo.or.cz&quot;&gt;repo&lt;/a&gt; by Petr Baudis of cogito fame.  A service running from Prague, based on a few simple CGIs, themselves published.
&lt;li&gt;Then I think &lt;a href=&quot;http://gitorious.org&quot;&gt;gitorious&lt;/a&gt; came along, and also &lt;a href=&quot;http://github.com&quot;&gt;GitHub&lt;/a&gt; - both Ruby implementations and some adding services
&lt;/ul&gt;

&lt;p&gt;I have a hunch that people are writing these things as they cotton on to the benefits of distributed version control, and none of the centralised sites out there (eg, SourceForge, etc) were really coming to the table quickly enough.

&lt;p&gt;These sites use CTAN / CPAN rules - ie, first come, first served when it comes to project names.  However, unlike CPAN, these systems will allow forks without requiring a package name change.  This is an idea which was specified for Perl 6, and a problem space that I debated extensively with Mark Overmeer, the result being the &lt;a href=&quot;http://cpan6.org/papers/cpan6-design.pdf&quot;&gt;early design documents for CPAN6&lt;/a&gt;.

&lt;p&gt;It&#39;s funny how Canonical&#39;s Launchpad never really achieved this ball of motion, despite bzr being technically just as capable as git - though missing some &lt;a href=&quot;http://utsl.gen.nz/talks/git-svn/intro.html#yay-mirroring&quot;&gt;Ferrari Features&lt;/a&gt;.   My hunch is that it is because no other version control system got as many people excited as git did when it hit the scene.

&lt;p&gt;I have managed to mirror repo to a &lt;a href=&quot;http://planet.catalyst.net.nz/&quot;&gt;Catalyst&lt;/a&gt;-hosted machine - currently browsable under http at &lt;a href=&quot;http://git.utsl.gen.nz/mirror/repo.or.cz/&quot;&gt;git.utsl.gen.nz&lt;/a&gt; and updating every 8 hours.  I hope to also talk to the other major providers of git hosting, and see if I can pull together some kind of co-ordination of effort, so that mirroring and searching these git hosting sites can be &lt;em&gt;easy&lt;/em&gt;.

&lt;p&gt;It&#39;s funny, in lots of ways I keep feeling that I&#39;m going around in circles.  The CPAN6 thing first came up ... was it two years ago?  I feel so disappointed in the results of that process; however I think there were interesting lessons to be learned about avoiding the Big Design Up Front thing.

&lt;p&gt;This time, it&#39;s the pragmatist&#39;s approach - get a huge amount of mirroring going, get the &lt;a href=&quot;http://groups.google.com/group/gitorious/msg/cbf425dc9e205a5d&quot;&gt;minimum indexing going&lt;/a&gt; that will allow for a peer-to-peer cloud to push to each other, and leave it at that.  Based on the earlier findings, I have no reason to believe that the key goals of the CPAN6 design would not cleanly fit on top of these open Git repository mirrors, with a very thin veneer - the veneer itself not even requiring any support from the hosting providers.

&lt;p&gt;The upshots of this should be vastly reduced barriers to entry of people&#39;s code into free software projects.  No longer will you have to convince software authors that your feature is worthwhile - you can just make a feature branch of your own, and upload to the git hosting cloud.  So, &quot;hit and run&quot; patches are more likely to be created, and works in progress shared more easily.  Especially in light of the very interesting cross-distro effort spearheaded by madduck - &lt;a href=&quot;http://vcs-pkg.org/&quot;&gt;vcs-pkg&lt;/a&gt;.  Just imagine - if a mainstream distro such as FreeBSD&#39;s Ports, or Debian&#39;s source archive were to &quot;piggy back&quot; on one of the other sites, or more likely start their own, then it should be possible to distribute all of these different FLOSS systems using the same bandwidth and mirror space.  These sites could easily let you create a fork, with a lightweight fork for just submitting a patch being a very simple case of that.

&lt;p&gt;This might seem like &quot;git madness&quot; - but to those who understand that git is a different class of software to the other VCS systems out there, this sort of thing is what the excitement was about all along.



</description>


	<comments>/comp/git/free_providers.html#comments</comments>

</item>
<item>

	<title>svn ohloh review</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/git/svn_ohloh_review.html</guid>

	<link>http://sam.vilain.net//comp/git/svn_ohloh_review.html</link>


	<category>git</category>

	<category>svn</category>


	<pubDate>Mon, 07 Jan 2008 14:00:00 +0000</pubDate>
	<dcterms:modified>2011-02-26T06:41:11Z</dcterms:modified>

	<description>&lt;h1&gt;Scathing review of Subversion on OHLOH being ++&#39;d&lt;/h1&gt;

&lt;p&gt;Ok, so I wrote a &lt;a href=&quot;http://www.ohloh.net/projects/1/reviews&quot;&gt;review&lt;/a&gt; (see my other post &lt;a href=&quot;http://sam.vilain.net/comp/git/svn_in_review.html&quot;&gt;Why are you still using Subversion&lt;/a&gt;) of Subversion which bordered on ranting.  I even advertised it on #git, but relatively few people marked it as &quot;useful&quot;.  While it was sitting on &quot;3 of 8 found this useful&quot;, I was going to delete it, but revisiting it, I found that it&#39;s been consistently getting marked as &quot;useful&quot; by people.  It&#39;s now sitting on &quot;7 of 13&quot;.  With a few more upvotes, it could hit the front page of Subversion&#39;s user reviews!  :-&gt;

&lt;p&gt;&lt;b&gt;Update:&lt;/b&gt; LOL!  &lt;img src=&quot;http://sam.vilain.net/files/lolsvn.png&quot;&gt;

&lt;p&gt;&lt;b&gt;More:&lt;/b&gt; &lt;a href=&quot;http://blogs.open.collab.net/svn/2007/11/branching-strat.html&quot;&gt;This post&lt;/a&gt; on the Subversion blog makes me think they&#39;re at least catching the drift - I wrote about something like this in my &lt;a href=&quot;http://utsl.gen.nz/talks/git-svn/intro.html#darcs-rulz&quot;&gt;svn departer&#39;s guide&lt;/a&gt;.



</description>


	<comments>/comp/git/svn_ohloh_review.html#comments</comments>

</item>
<item>

	<title>new p4 importer</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/perl/new_p4_importer.html</guid>

	<link>http://sam.vilain.net//comp/perl/new_p4_importer.html</link>


	<category>git</category>


	<pubDate>Tue, 18 Dec 2007 16:00:00 +0000</pubDate>
	<dcterms:modified>2011-02-24T01:54:12Z</dcterms:modified>

	<description>&lt;h1&gt;My new Perforce Importer is Unstoppable!!&lt;/h1&gt;

&lt;p&gt;(with apologies to &lt;a href=&quot;http://mnftiu.cc/&quot;&gt;mnftiu&lt;/a&gt;)

&lt;p&gt;Ok, so I&#39;ve been working on this &lt;a href=&quot;http://utsl.gen.nz/gitweb/?p=git-p4raw&quot;&gt;program&lt;/a&gt; to convert the Perl history from Perforce, with the view to tack it on top of the &lt;a href=&quot;http://use.perl.org/~mugwumpjism/journal/34159&quot;&gt;previous conversion&lt;/a&gt; I worked on.

&lt;div style=&quot;text-align: center; float:right; max-width: 40%; padding: 0.5em 1em 0.5em 0.3em&quot;&gt;&lt;a href=&quot;http://sam.vilain.net/files/hist-screenshots/tidy-cross-merging.png&quot; target=&quot;gitk&quot;&gt;&lt;img border=&quot;0&quot; style=&quot;max-width: 90%&quot; src=&quot;http://sam.vilain.net/files/hist-screenshots/tidy-cross-merging.png&quot; /&gt;&lt;/a&gt;&lt;p&gt;&lt;em&gt;&#39;gitk&#39; looking at some of the converted perforce history&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;Today, I had my first successful run that includes converting the integration data.  It&#39;s a huge milestone for this project - which I initially agreed with Nicholas Clark to undertake way way back in August 2006 over a pint on the banks of the river Thames.  I was on my way back from my &lt;a href=&quot;http://planet.catalyst.net.nz/blog/taxonomy/term/17&quot;&gt;Catalyst sponsored visit to Birmingham for YAPC::Europe&lt;/a&gt;.  I certainly didn&#39;t think I&#39;d be still working on it a good 16 months later. 

&lt;p&gt;This latest conversion is based on reverse engineering Perforce&#39;s back-end.  In addition to its binary data files, it keeps a nice, simple journal file and periodically checkpoints all the data in the database, and uses RCS files for holding actual file data.  This makes importing a &quot;simple&quot; matter of loading its checkpoint/journal files into a database, and writing a few queries.  Getting all the file images out was easy enough - and I used the &lt;a href=&quot;http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html&quot;&gt;git fast-import&lt;/a&gt; interface, and within a few days of working on it came up with a &lt;a href=&quot;http://utsl.gen.nz/gitweb/?p=git-p4raw;a=shortlog;h=take1-peril&quot;&gt;version&lt;/a&gt; that would export all of the files for the 30k-odd revisions in the Perl repository (over 225k distinct images in about 20k RCS files) in about 5 minutes on my laptop.

&lt;div style=&quot;text-align: center; float:right; max-width: 40%; padding: 0.5em 1em 0.5em 0.3em&quot;&gt;&lt;a href=&quot;http://sam.vilain.net/files/hist-screenshots/pretty.png&quot; target=&quot;gitk&quot;&gt;&lt;img border=&quot;0&quot; style=&quot;max-width: 90%&quot; src=&quot;http://sam.vilain.net/files/hist-screenshots/pretty.png&quot; /&gt;&lt;/a&gt;&lt;p&gt;&lt;em&gt;Perforce has supported branching and merging for over a decade&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;Perforce has a lot of model similarities with Subversion - principally, that it tracks branches in the same namespace used by the filesystem of your project (this &quot;branching is just copying&quot; snake-oil), so that finding the list of branches is not a simple operation.  In fact, it may require complicated history analysis.  Another similarity is that they both store file history, rather than say tree history like git or a teetering pile of patches like darcs.  I&#39;d call Perforce&#39;s engineering quality a lot higher than Subversion, though - the design is quite elegant, it just lacks ... well, the underlying simplicity of git.  To compare, git has 5 important object types, if you include references, with very few fields.  Perforce has 38 tables, and though only a handful of those are really required (my script loads 7 of them), the overall complexity of the schema is far higher.

&lt;p&gt;As a result, managing branched development has been an art known well only to an elite minority of users - which I think do include Perforce users.  It&#39;s quite obvious looking through this history that the integration facilities have been there since the repository was imported, and the pumpkings knew how to use them.

&lt;h2&gt;The challenge&lt;/h2&gt;

&lt;p&gt;The principal task at hand was to take the literally &lt;em&gt;hundreds&lt;/em&gt; of integration records
held in Perforce, and try to convert them into git native format while not throwing &lt;em&gt;any&lt;/em&gt; information away that might be important.

&lt;div style=&quot;text-align: center; float:left; max-width: 40%; padding: 0.5em 0.3em 0.5em 1em&quot;&gt;&lt;a href=&quot;http://sam.vilain.net/files/hist-screenshots/integrates-clean.png&quot; target=&quot;gitk&quot;&gt;&lt;img border=&quot;0&quot; style=&quot;max-width: 90%&quot; src=&quot;http://sam.vilain.net/files/hist-screenshots/integrates-clean.png&quot; /&gt;&lt;/a&gt;&lt;p&gt;&lt;em&gt;In many instances, the conversion is quite clean&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;The logic was this - if you see a change in the history, such as
 &quot;integrate changes from mainline&quot;, and every file that has changed on the branch you are merging from since you last merged has an integration record, then you mark that commit as a merge.  If this single piece of information agrees with hundreds of per-file records, then you have just simplified the repository, without losing any information.  I thought I had this early on with &lt;a href=&quot;http://utsl.gen.nz/gitweb/?p=git-p4raw;a=blob;h=bf632569;hb=1c9c44069;f=git-p4raw#l688&quot;&gt;this monster query&lt;/a&gt;, but I ran into several problems - the question &quot;which files are outstanding for merging?&quot; is quite difficult to answer.  The &lt;a href=&quot;http://utsl.gen.nz/gitweb/?p=git-p4raw;a=commit;h=2e0af01a6&quot;&gt;first answer I came up with&lt;/a&gt;, to just do an index diff between the two trees, only worked for the first integration.  I needed to write a &lt;a href=&quot;http://utsl.gen.nz/gitweb/?p=git-p4raw;a=commitdiff;h=0e17fa95;hp=8f2963c9&quot;&gt;merge base function&lt;/a&gt; capable of inspecting the history graph.  It comes up with a list of changed files since the merge base, and uses that to decide what integration records &lt;em&gt;should&lt;/em&gt; be present.
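
&lt;p&gt;The rule above can be sketched in a few lines.  This is an illustrative Python sketch rather than the actual query used by git-p4raw; the function name, the return values and the sample file lists are invented for the example.&lt;/p&gt;

```python
def classify_integration(changed_since_merge_base, integrated_files):
    """Decide whether a Perforce change can be recorded as a git merge.

    changed_since_merge_base: files changed on the source branch since the
        merge base.
    integrated_files: files on that branch with an integration record in
        this change.
    Returns ("merge", []) when every changed file was integrated, otherwise
    ("partial", missing) with the sorted list of files lacking a record.
    """
    missing = sorted(set(changed_since_merge_base) - set(integrated_files))
    if missing:
        return ("partial", missing)
    return ("merge", [])


# Hypothetical data: two files changed on the source branch since the base.
changed = ["perl.c", "sv.c"]

full = classify_integration(changed, ["perl.c", "sv.c"])
cherry = classify_integration(changed, ["sv.c"])
```

&lt;p&gt;Here &lt;code&gt;full&lt;/code&gt; comes out as a merge, while &lt;code&gt;cherry&lt;/code&gt; is the case where the commit cannot simply be marked as a merge and the per-file records still carry information of their own.&lt;/p&gt;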

&lt;p&gt;The good news is that it appeared to work very well - a huge portion of the integration information in the history corresponds to complete cross-merges between branches.  

&lt;h2&gt;The impedance mismatch&lt;/h2&gt;

&lt;p&gt;One of the biggest concerns is the so-called &quot;impedance mismatch&quot;, that is, the presence of important information that simply cannot be represented by git&#39;s model.

&lt;div style=&quot;text-align: center; float:left; max-width: 40%; padding: 0.5em 0.3em 0.5em 1em&quot;&gt;&lt;a href=&quot;http://sam.vilain.net/files/hist-screenshots/integration-impedence-mismatch.png&quot; target=&quot;gitk&quot;&gt;&lt;img border=&quot;0&quot; style=&quot;max-width: 90%&quot; src=&quot;http://sam.vilain.net/files/hist-screenshots/integration-impedence-mismatch.png&quot; /&gt;&lt;/a&gt;&lt;p&gt;&lt;em&gt;Integration FROM an unstable branch&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;In particular, one thing that you can&#39;t do with just tree tracking is to detect cherry picking, where those cherry picks changed the files along the way.  If the cherry picking happened without changes this is easily detectable without any metadata - but if it did change, you either need some kind of fuzzy logic, or you need to record out-of-band what happened.

&lt;p&gt;Git and Perforce couldn&#39;t be more chalk and cheese about this.  Perl&#39;s development model is one that a colleague dubbed the &lt;em&gt;ghetto merge model&lt;/em&gt;, where you have some kind of &quot;hood&quot; where the &quot;flyist features go to battle it out&quot;.  The features left standing are subsequently moved into &lt;em&gt;trailer parks&lt;/em&gt; (aka &lt;em&gt;integration branches&lt;/em&gt;), where they can make a new life for themselves and prove their stability.

&lt;div style=&quot;text-align: center; float:right; max-width: 33%; padding: 0.5em 1em 0.5em 0.3em&quot;&gt;&lt;a href=&quot;http://sam.vilain.net/files/hist-screenshots/integrates-missing.png&quot; target=&quot;gitk&quot;&gt;&lt;img border=&quot;0&quot; style=&quot;max-width: 90%&quot; src=&quot;http://sam.vilain.net/files/hist-screenshots/integrates-missing.png&quot; /&gt;&lt;/a&gt;&lt;p&gt;&lt;em&gt;Perforce seemed to be happy with this merge.  But why did the integration records not mention some mergable changes?&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;Git does support that model - for instance, Junio Hamano&#39;s &lt;a href=&quot;http://repo.or.cz/w/git.git?a=shortlog;h=pu&quot;&gt;proposed updates&lt;/a&gt; branch of git could be considered the &quot;unstable&quot; branch, and changes can even be completely dropped from that one.  There is a branch called &quot;next&quot; for features which are considered going into the &quot;next&quot; minor release, from which changes are not removed.  There is a &quot;maint&quot; branch which bugfixes for existing versions go onto, etc.  The key difference here is that changes go in earliest-first, rather than newest-first, with variations between the revisions of them showing up as the stable branches are re-merged into the newer branches.  Git&#39;s development model has already proven itself to stimulate innovation and experimentation, as well as the core team being able to produce a very stable product, though of course it is no magic bullet to make it all things to all people.

&lt;div style=&quot;text-align: center; float:left; max-width: 40%; padding: 0.5em 0.3em 0.5em 1em&quot;&gt;&lt;a href=&quot;http://sam.vilain.net/files/hist-screenshots/integration-cherry-merging.png&quot; target=&quot;gitk&quot;&gt;&lt;img border=&quot;0&quot; style=&quot;max-width: 90%&quot; src=&quot;http://sam.vilain.net/files/hist-screenshots/integration-cherry-merging.png&quot; /&gt;&lt;/a&gt;&lt;p&gt;&lt;em&gt;I call this &quot;Cherry merging&quot; - cherry picking almost every change in series&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;So, in the face of all of these things, I simply made the program display as much information as it could figure out at each point and ask the presumably infinitely more enlightened user for advice, and as I got bored of answering the questions, built a bit of fuzzy logic into it to make answers based on the rules of thumb that I&#39;d figured out.

&lt;h2&gt;Remaining tasks&lt;/h2&gt;

&lt;p&gt;I&#39;m very grateful to now have direct rsync access to the raw Perforce repository, courtesy of Robrt.  With a little refactoring, this &quot;raw&quot; Perforce importer could be generally useful to other people stuck with Perforce who might find themselves completely unable to find a suitable replacement product.

&lt;p&gt;There are always remaining tasks when it comes to Software Archaeology - more information that can be put in etc.  For instance, I have not even looked in this conversion at doing the p5p archive scanning that I did for Chip&#39;s pumpking series, which allowed me to represent the difference between a patch submitted and a patch applied.  But at some point you have to draw the line and cut a release.

&lt;p&gt;I&#39;ve made some history releases before - but this one is good enough that I know I have to be careful - as it is already highly functional and capable of being used by people who want to track Perl development, or by people with proposed features to develop them on their own feature branches for future pumpkings to manage.  Already there are several of these repositories out there with different histories - and so, I&#39;d like to make sure that the next release I make has as much correct author attribution information in there as possible, so that for instance people&#39;s OHLOH stats are correct.

&lt;p&gt;There is still one large chunk of history which while having dozens of releases is only represented by a handful of commits, and will need the scripts used to import Chip Salzenberg&#39;s 5.003 to 5.004 series patches extended to cover the slight variations in release style used by Tim Bunce.  That I&#39;d also like to have complete first.  It would be nice to have this all finished by the looming 5.10.0 Perl release, but I don&#39;t really want to compromise on correct attributions to do so.

&lt;p&gt;Other than that, I do want to make sure that whenever changes are referenced in commit messages, or the metadata that appears in the commit message, that the corresponding git commit ID is placed in the message - so that you can easily bounce around them in gitk and gitweb.

&lt;p&gt;And then, I think I&#39;ll have achieved what might be the most complex Perforce to Git conversion in the Free Software world to date, as well as liberating the source code for a project which is dear to many people.



</description>


	<comments>/comp/perl/new_p4_importer.html#comments</comments>

</item>
<item>

	<title>microbranching</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/git/microbranching.html</guid>

	<link>http://sam.vilain.net//comp/git/microbranching.html</link>


	<category>git</category>

	<category>perforce</category>

	<category>perl</category>


	<pubDate>Fri, 24 Aug 2007 14:00:00 +0100</pubDate>
	<dcterms:modified>2011-03-01T02:13:52Z</dcterms:modified>

	<description>&lt;h1&gt;History is not linear - the case for micro-branching&lt;/h1&gt;

&lt;p&gt;During the &lt;a href=&quot;http://git.catalyst.net.nz/gitweb/?p=perl.git&quot;&gt;Perl history conversion&lt;/a&gt;, I have found very few patches from the p5p archives which I couldn&#39;t apply.  However, sometimes a pumpking will have integrated a patch which was posted relative to an older version of Perl.  How am I representing that?

&lt;p&gt;&lt;img src=&quot;http://sam.vilain.net/files/perl-5.003_21-closeup.png&quot; alt=&quot;view of gitk on perl.git&quot;&gt;

&lt;p&gt;The patch &quot;Forbid ++ and -- on readonly values&quot; was relative to 5.003_08, and would not cleanly apply to any closer version than that.  So, I made a new microbranch, applied the patch there, and included it as a parent to the perl-5.003_21 release.  There is also a series of patches, &quot;Fix for anon-lists with tied entries coredump&quot; .. &quot;Full documentation generation patch&quot;.  These were &lt;em&gt;all&lt;/em&gt; successfully applied based on messages from the p5p archives.  The result is a whole bunch of changes, followed by a mega-commit which is a &quot;merge&quot; of the loose ends.  It&#39;s also a very good representation of what really happened.  You can also see little &quot;diamonds&quot; in the history (eg, &quot;Re: MakeMaker and &#39;make uninstall&#39;&quot;), where one micro-branch has the version posted to p5p, and the other the version included in the final release (if they resulted in the same thing, no diamond was formed).

&lt;p&gt;With SVN, how would I do all that?  Well, I&#39;d make a new branch, pulling a name out of thin air.  Apply the change on that branch, then merge back to trunk using the experimental svnmerge tool, and then delete the branch.  What a PITA.  Well, unless &lt;a href=&quot;http://utsl.gen.nz/git/git-svnserver.txt&quot;&gt;git-svnserver&lt;/a&gt; were to do it automatically from a git master...




</description>


	<comments>/comp/git/microbranching.html#comments</comments>

</item>
<item>

	<title>git svn intro</title>


	<guid isPermaLink="false">http://sam.vilain.net//comp/git/git_svn_intro.html</guid>

	<link>http://sam.vilain.net//comp/git/git_svn_intro.html</link>


	<category>git</category>

	<category>svn</category>


	<pubDate>Wed, 28 Feb 2007 18:00:00 +0000</pubDate>
	<dcterms:modified>2011-03-13T16:00:31Z</dcterms:modified>

	<description>&lt;h1&gt;An introduction to git-svn for Subversion/SVK users and deserters&lt;/h1&gt;

&lt;p&gt;&lt;i&gt;[note Feb 2011: this is the first version of the successful
&lt;a href=&quot;http://utsl.gen.nz/talks/git-svn/intro.html&quot;&gt;introduction to git-svn for Subversion/SVK users and
deserters&lt;/a&gt; which saw
substantial revision from this relatively unstructured rant.  I found
it in my blogpile and thought it might be worth preserving.]&lt;/i&gt;

&lt;p&gt;This article is aimed at people who want to contribute to projects that are using Subversion as their code-wiki.  It is particularly targeted at SVK users, who are already used to the disconnected operation work-flow.

&lt;p&gt;People who are responsible for Subversion servers and are converting them to git in order to lay them down to die are advised to consider the one-off &lt;tt&gt;git-svnimport&lt;/tt&gt;, which is useful for bespoke conversions where you don&#39;t necessarily want to leave SVN/CVS/etc breadcrumbs behind.

&lt;h2&gt;Step 1. track the upstream repository&lt;/h2&gt;

&lt;p&gt;There are lots of options here.  We can just check out the head, we can do a full import from the master server, or we can convert our existing SVK mirror paths. 

&lt;h3&gt;Quickest, Easiest - find a git-svn conversion&lt;/h3&gt;

&lt;p&gt;You can check out the entire history of the project with:

&lt;pre&gt;$ &lt;b&gt;git-clone git://utsl.gen.nz/parrot&lt;/b&gt;
Initialized empty Git repository in /home/samv/tmp/parrot/.git/
remote: Generating pack...
remote: Done counting 152636 objects.
remote: Deltifying 152636 objects.
remote:  100% (152636/152636) done
Indexing 152636 objects.
remote: Total 152636, written 152636 (delta 102789), reused 152478 (delta 102789)
 100% (152636/152636) done
Resolving 102789 deltas.
 100% (102789/102789) done
Checking files out...
 100% (2990/2990) done&lt;/pre&gt;

&lt;p&gt;Great!  Didn&#39;t take long, and that was the same as the whole &lt;tt&gt;svk co&lt;/tt&gt; sequence - add mirror, sync revisions, and checkout.  If you are close enough on the network to &lt;tt&gt;utsl.gen.nz&lt;/tt&gt;, that may have taken less than a minute.  You can proceed to &lt;a href=&quot;http://sam.vilain.net//tags/git.html#using&quot;&gt;using your git-svn git repository&lt;/a&gt;, below, if you just want to play with it and not worry about the painful migration part.

&lt;p&gt;Of course if svn.perl.org is down and you don&#39;t have an SVK mirror then this is your only option.

&lt;h3&gt;Building &lt;tt&gt;git&lt;/tt&gt;&lt;/h3&gt;

&lt;p&gt;Though you don&#39;t need it to follow most of this tutorial, a version of &lt;tt&gt;git-core&lt;/tt&gt; more recent than the one provided by your distribution is almost certainly going to give you fewer issues in the long run.

&lt;p&gt;Get yourself a tarball release of git - &lt;a href=&quot;http://repo.or.cz/w/git.git?a=snapshot;h=maint&quot;&gt;repo.or.cz&lt;/a&gt; is probably a good place to go.  git itself is fairly simple to build (apart from the docs, which have a dependency called &lt;tt&gt;asciidoc&lt;/tt&gt; - but just read the &lt;tt&gt;.txt&lt;/tt&gt; files under &lt;tt&gt;Documentation/&lt;/tt&gt; in lieu of getting that if you have difficulty).  You&#39;ll also need to install the Subversion SWIG bindings to get &lt;tt&gt;git-svn&lt;/tt&gt; to work, which of course is a world of pain that I won&#39;t go into here.  You&#39;ll also want Tk installed for one of the most important GUIs.

&lt;h3&gt;Checking out trunk from SVN&lt;/h3&gt;

&lt;p&gt;It is probably second fastest to just check out the SVN head using &lt;tt&gt;git-svn&lt;/tt&gt;; this is a bit like setting up a mirror path with &lt;tt&gt;svk mirror&lt;/tt&gt;, then syncing only to the head revision using &lt;tt&gt;svk sync -s NNN&lt;/tt&gt; (where &lt;tt&gt;NNN&lt;/tt&gt; is the head revision, found below using &lt;tt&gt;svn log&lt;/tt&gt;):

&lt;pre&gt;$ svn log https://svn.perl.org/parrot/trunk|head
------------------------------------------------------------------------
r17048 | bernhard | 2007-02-19 07:32:13 +1300 (Mon, 19 Feb 2007) | 3 lines

Remove the PIR.pg and bc.pg examples as they are
now covered by languages/abc and languages/PIR.

------------------------------------------------------------------------
r17047 | bernhard | 2007-02-19 07:09:00 +1300 (Mon, 19 Feb 2007) | 5 lines

[languages/PIR]
$ &lt;b&gt;mkdir parrot&lt;/b&gt;
$ &lt;b&gt;cd parrot&lt;/b&gt;
$ &lt;b&gt;git-svn init https://svn.perl.org/parrot/trunk&lt;/b&gt;
Initialized empty Git repository in .git/
git-svn Using higher level of URL: https://svn.perl.org/parrot/trunk =&amp;gt; https://svn.perl.org/parrot
$ &lt;b&gt;git-svn fetch -r17048&lt;/b&gt;
        A       DEPRECATED.pod
        A       debian/libparrot-dev.install
        A       debian/parrot-doc.install
...
        A       examples/streams/ParrotIO.pir
        A       examples/streams/Include.pir
        A       examples/streams/Filter.pir
r17048 = a57c09abef48d73f3c74c6a307793301b5956bfd (git-svn)
Checking files out...
 100% (2959/2959) done
Checked out HEAD:
  https://svn.perl.org/parrot/trunk r17048

$&lt;/pre&gt;

&lt;p&gt;Well, that was almost as quick - under 2 minutes for a head checkout; it had to download about as much as a release tarball.  If you like, from here you can proceed to &lt;a href=&quot;http://sam.vilain.net//tags/git.html#using&quot;&gt;using your git-svn git repository&lt;/a&gt;.

&lt;p&gt;But people who use git are used to treating their repositories as a &lt;em&gt;revision data warehouse&lt;/em&gt; which they use to &lt;em&gt;mine useful information&lt;/em&gt; when they are trying to understand a codebase. 

&lt;p&gt;We can&#39;t do that, but once your &lt;em&gt;git-fu&lt;/em&gt; is strong, you can see it is easy to graft on the earlier history if you want to, using &lt;em&gt;history rewriting&lt;/em&gt;.  I&#39;ll briefly mention &lt;a href=&quot;http://sam.vilain.net//tags/git.html#grafting&quot;&gt;grafting&lt;/a&gt; (and its drawbacks) later on.

&lt;h3&gt;Convert your SVK depot&#39;s mirror path&lt;/h3&gt;

&lt;p&gt;So, it is better to have the complete project history converted, but you probably won&#39;t want to wait the day or two it can take to replay a moderately sized Subversion repository using SVK (can &lt;em&gt;anyone&lt;/em&gt; mirror the 48GB KDE Subversion repository?).

&lt;p&gt;The support for this isn&#39;t yet in a released git, so until it gets merged you&#39;ll need to clone &lt;tt&gt;git://git.bogomips.org/git-svn.git&lt;/tt&gt; and build and install that.  Look for &lt;tt&gt;--useSvmProps&lt;/tt&gt; in &lt;tt&gt;git-svn init -h&lt;/tt&gt; to see if your &lt;tt&gt;git-svn&lt;/tt&gt; is new enough.

&lt;p&gt;First, &lt;tt&gt;svk mi -l&lt;/tt&gt; will tell us where the mirror paths are.

&lt;pre&gt;$ &lt;b&gt;svk mi -l | grep parrot&lt;/b&gt;
/parrot/master               https://svn.perl.org/parrot
$ &lt;/pre&gt;

&lt;p&gt;That&#39;s everything we need to get started.  Now we just need to convert &lt;tt&gt;/parrot/master&lt;/tt&gt; to an SVN URL; the &lt;em&gt;depot&lt;/em&gt; is everything up to the second &quot;&lt;tt&gt;/&lt;/tt&gt;&quot;, and most SVK users will just be using a single depot with an empty name, &lt;tt&gt;//&lt;/tt&gt;.

&lt;pre&gt;$ &lt;b&gt;svk depotmap -l | grep &#39;/parrot/&#39;&lt;/b&gt;
/parrot/                /home/samv/.svk/parrot
$ &lt;/pre&gt;

&lt;p&gt;So, if I take the depot path and add on the rest of the mirror path, I should be able to look at the path using plain &lt;tt&gt;svn&lt;/tt&gt;:

&lt;pre&gt;$ &lt;b&gt;svn pl file:///home/samv/.svk/parrot/master&lt;/b&gt;
Properties on &#39;file:///home/samv/.svk/parrot/master&#39;:
  svm:source
  svm:uuid
  svk:merge
$ &lt;b&gt;svn ls file:///home/samv/.svk/parrot/master&lt;/b&gt;
branches/
tags/
trunk/
$ &lt;/pre&gt;
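
&lt;p&gt;The path juggling above can be sketched in plain shell.  This is only an illustration, assuming the mirror path and depot directory printed in the output above; the variable names are made up for the example and nothing here is part of SVK or &lt;tt&gt;git-svn&lt;/tt&gt;.

```shell
# Values copied from the `svk mi -l` and `svk depotmap -l` output above.
mirror_path="/parrot/master"
depot_dir="/home/samv/.svk/parrot"

# The depot name sits between the first and the second "/".
rest="${mirror_path#/}"      # strip the leading slash
depot_name="${rest%%/*}"     # text before the next slash (empty for "//")
in_depot="${rest#*/}"        # whatever follows the depot name

# Join the on-disk depot location and the path inside the depot.
svn_url="file://${depot_dir}/${in_depot}"
echo "depot=/${depot_name}/ url=${svn_url}"
```

&lt;p&gt;With the values above this prints the same &lt;tt&gt;file:///home/samv/.svk/parrot/master&lt;/tt&gt; URL that was handed to &lt;tt&gt;svn&lt;/tt&gt;.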

&lt;p&gt;Great!  The &lt;tt&gt;pl&lt;/tt&gt; (&lt;tt&gt;proplist&lt;/tt&gt;) command was important - the properties there, particularly &lt;tt&gt;svm:source&lt;/tt&gt; and &lt;tt&gt;svm:uuid&lt;/tt&gt;, must be there for &lt;tt&gt;git-svn&lt;/tt&gt; to convert this repository correctly.  We use the &lt;tt&gt;--useSvmProps&lt;/tt&gt; option to set up the repository rewriting.

&lt;p&gt;Set up the fetch using &lt;tt&gt;git-svn init&lt;/tt&gt;:

&lt;pre&gt;$ &lt;b&gt;git-svn init -t tags -b branches -T trunk \
          --useSvmProps file:///home/samv/.svk/parrot/master&lt;/b&gt;
Initialized empty Git repository in .git/
Using higher level of URL: file:///home/samv/.svk/parrot/master =&amp;gt; file:///home/samv/.svk/parrot
$ &lt;/pre&gt;

&lt;p&gt;&lt;tt&gt;git-svn&lt;/tt&gt; is quite capable of tracking multiple Subversion repositories that hold mirrors of the same project, though of course probably most people actually doing that are SVK users, and the &quot;other repository&quot; is your local depot.  The above command set up a git-svn remote with the default name of &quot;&lt;tt&gt;svn&lt;/tt&gt;&quot;.  Take a look at what was configured by running &lt;tt&gt;cat .git/config&lt;/tt&gt;.

&lt;pre&gt;$ &lt;b&gt;cat .git/config&lt;/b&gt;
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[svn-remote &quot;svn&quot;]
        url = file:///home/samv/.svk/parrot
        fetch = trunk:refs/remotes/trunk
        branches = branches/*:refs/remotes/*
        tags = tags/*:refs/remotes/tags/*
$ &lt;/pre&gt;

&lt;p&gt;All look good?  So, 

&lt;pre&gt;$ &lt;b&gt;git-svn fetch --repack 1000 --useSvmProps&lt;/b&gt;
        A       README
r2 = 5c2dbc76df3fc7569d0b779841427d5ddf406e9d (trunk)
        M       README
r3 = 9aa2f03a26ed9617cf7002bbe4acae5d3d24dadf (trunk)
  ...

$ &lt;/pre&gt;

&lt;p&gt;So once that&#39;s all complete what did we win so far?

&lt;pre&gt;$ &lt;b&gt;du -sk //home/samv/.svk/parrot .git&lt;/b&gt;
353576  //home/samv/.svk/parrot
155245  .git
$ &lt;/pre&gt;

&lt;p&gt;Well, that&#39;s a bit of savings.  git saved half the space compared to Subversion fsfs.  But it turns out that a lot of it is just &lt;tt&gt;git-svn&lt;/tt&gt; metadata.  And we can compress it more; I&#39;ve got CPU to burn so I ran this command:

&lt;pre&gt;$ &lt;b&gt;git-repack -a -d -f --window 100&lt;/b&gt;
Generating pack...
Done counting 131402 objects.
Deltifying 131402 objects.
 100% (131402/131402) done
Writing 131402 objects.
 100% (131402/131402) done
Total 131402 (delta 99440), reused 31385 (delta 0)
Pack pack-079a95f55810fc1eea600bc89c911a2bf85c1add created.
$ &lt;b&gt;ls -l .git/objects/pack/&lt;/b&gt;
total 33745
-r--r--r-- 1 samv samv  3154712 2007-02-20 16:00 pack-079a95f55810fc1eea600bc89c911a2bf85c1add.idx
-r--r--r-- 1 samv samv 31360284 2007-02-20 16:00 pack-079a95f55810fc1eea600bc89c911a2bf85c1add.pack
$ &lt;/pre&gt;

&lt;p&gt;You may be wondering, &quot;353MB of Subversion repository squeezed into 31MB of git pack?  That&#39;s smaller than an SVN head checkout!  Have not all the revisions been copied?  Did something get missed?&quot;

&lt;p&gt;It turns out that &lt;tt&gt;git&lt;/tt&gt; is just being incredibly space-efficient.  More incredible stories about shrunken repositories can be found all over the internet.  Talk to the GCC, Mozilla and KDE folk for the most impressive ones.
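
&lt;p&gt;As a rough check of the figures quoted above, here is the arithmetic in shell.  The numbers are copied from the &lt;tt&gt;du&lt;/tt&gt; and &lt;tt&gt;ls -l&lt;/tt&gt; output above; the variable names are just for illustration, and integer division means every result is rounded down.

```shell
svn_kb=353576          # du -sk of the SVK depot
git_kb=155245          # du -sk of .git before repacking
pack_bytes=31360284    # size of the repacked .pack file

pack_kb=$(( pack_bytes / 1024 ))
before_pct=$(( git_kb * 100 / svn_kb ))
after_pct=$(( pack_kb * 100 / svn_kb ))
times_smaller=$(( svn_kb / pack_kb ))

echo "before repack: ${before_pct}% of the Subversion size"
echo "after repack:  ${after_pct}% of the Subversion size"
echo "the pack is about ${times_smaller}x smaller than the depot"
```

&lt;p&gt;That works out to 43% before the repack and 8% after it, or roughly an eleven-fold reduction (ignoring the pack index).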

&lt;p&gt;Now, in theory, we could keep using SVK to mirror revisions, and keep using &lt;tt&gt;git-svn fetch&lt;/tt&gt; to copy them into the git repository.  But we want some more space on our laptop to hold more MP3s, so we&#39;ll eventually delete it.  Ideally we also want to convert our local branches - getting this working is still on my TODO list, but the intention is that &lt;tt&gt;git-svn&lt;/tt&gt; will be extended to perform this functionality.

&lt;h3&gt;Converting the upstream repository from SVN, with branches&lt;/h3&gt;

&lt;p&gt;This procedure is the same as the SVK one above, but we can just use the published repository URL.

&lt;pre&gt;$ mkdir parrot
$ cd parrot
$ git-svn init -t tags -b branches -T trunk https://svn.perl.org/parrot
Initialized empty Git repository in .git/
$ git-svn fetch
  ... &lt;/pre&gt;

&lt;p&gt;I didn&#39;t test this one - I have already waited the many hours it took to sync the first time.  Doing this for FAI took &lt;em&gt;days&lt;/em&gt;.  And the repository had the sheer indecency to end up &lt;a href=&quot;http://git.catalyst.net.nz/fai.git/objects/pack/&quot;&gt;tiny&lt;/a&gt;.

&lt;a name=&quot;using&quot;&gt;&lt;/a&gt;&lt;h2&gt;Using your git-svn checkout&lt;/h2&gt;

&lt;p&gt;What I&#39;ll do now is go through the SVK tutorials and convert them to &lt;tt&gt;git-svn&lt;/tt&gt;, then introduce some examples of stuff that you can do with git that is difficult to get right using SVK.  Actually why not get onto some cool stuff first.

&lt;h3&gt;Visualisation&lt;/h3&gt;

&lt;p&gt;This is your all-seeing eye.  You can crank open &lt;tt&gt;gitk&lt;/tt&gt; on it and click on commits, see their patches and the state of the tree at that point in time.

&lt;pre&gt;$ &lt;b&gt;gitk --all&lt;/b&gt;&lt;/pre&gt;

&lt;p&gt;&lt;tt&gt;gitk&lt;/tt&gt; does some really cool things but is most useful when looking at projects that have cottoned onto feature branches (see &lt;a href=&quot;http://sam.vilain.net//tags/git.html#feature-branches&quot;&gt;feature branches&lt;/a&gt;, below).  If you&#39;re looking at a project where everyone commits largely unrelated changes to one branch it just ends up a straight line, and not very interesting.

&lt;h3&gt;The &quot;depot map&quot; is gone&lt;/h3&gt;

&lt;p&gt;So far we&#39;ve got as far as the equivalent of &lt;tt&gt;svk mirror&lt;/tt&gt; and &lt;tt&gt;svk sync&lt;/tt&gt;.  Didn&#39;t we miss &lt;tt&gt;svk depotmap&lt;/tt&gt;?

&lt;p&gt;Git normally stores its repository information under &lt;tt&gt;.git&lt;/tt&gt; at the top level of your checkout.  But everything&#39;s compressed and the filenames don&#39;t resemble the files in your checkout so &lt;tt&gt;grep -r&lt;/tt&gt; and &lt;tt&gt;find&lt;/tt&gt; etc don&#39;t hate you.  You can set &lt;tt&gt;GIT_DIR&lt;/tt&gt; to get all the tools to look somewhere else if you really care, but for most people this system works very well.  &lt;tt&gt;GIT_DIR&lt;/tt&gt; doesn&#39;t work the same as &lt;tt&gt;SVKROOT&lt;/tt&gt; in SVK, it&#39;s a per-checkout path, not pointing to a central place.

&lt;p&gt;I don&#39;t know about you but I was always running into situations where my &lt;tt&gt;~/.svk/config&lt;/tt&gt; didn&#39;t match reality, and there were no breadcrumbs left in the checkout to do anything with it.  I much prefer these floating repositories and I hear that they have been recently added to SVK.

&lt;h3&gt;Making a &#39;local&#39; branch&lt;/h3&gt;

&lt;p&gt;One of the nice things about git (and darcs and bzr and ...) is that to make branches is simple.  Say you want to take a directory, and work on it somewhere else in a different direction, you can just make a copy.  Contrast this with Subversion, where you have to do some crap with the &lt;tt&gt;branches/&lt;/tt&gt; paths and &lt;tt&gt;svn cp&lt;/tt&gt;, &lt;tt&gt;svn switch&lt;/tt&gt;, etc, and worry about whether you branch on the mirror path or the local path and what effect that would have, etc.  &lt;em&gt;And&lt;/em&gt; put up with Subversion followers saying that was natural and easy.  Whatever.

&lt;pre&gt;$ &lt;b&gt;cp -a parrot parrot.my-branch&lt;/b&gt;
$ &lt;/pre&gt;

&lt;p&gt;Each of those copies is fully independent, as if you gave them to someone else.  You can easily push and pull changes between them without tearing your hair out.  But that was too slow and heavy.  We want to create new branches at the drop of a hat (trust me on that for now, yer just &lt;em&gt;do&lt;/em&gt;, OK?).  Maybe you don&#39;t want to copy the actual repository, just make another checkout.  We can use &lt;tt&gt;git-clone&lt;/tt&gt; again;

&lt;pre&gt;$ &lt;b&gt;git-clone -l parrot parrot.my-branch&lt;/b&gt;
Initialized empty Git repository in /home/samv/.svk/parrot.clone/.git/
0 blocks
Checking files out...
 100% (2815/2815) done
$ &lt;/pre&gt;

&lt;p&gt;The &lt;tt&gt;-l&lt;/tt&gt; option to &lt;tt&gt;git-clone&lt;/tt&gt; told &lt;tt&gt;git&lt;/tt&gt; to hardlink the objects together, so not only do these two share the same object files on disk but &lt;em&gt;they can still be moved around independently&lt;/em&gt;.  Cool.

&lt;p&gt;But all that&#39;s a lot of work and most of the time I don&#39;t care to create lots of different directories for all my branches.  I can just make a new branch and switch to it immediately with &lt;tt&gt;git-checkout&lt;/tt&gt;:

&lt;pre&gt;$ &lt;b&gt;git-checkout -b localbranch remotes/trunk&lt;/b&gt;
$ &lt;/pre&gt;

&lt;p&gt;But wait, you say, don&#39;t I have to enter a commit message for this new branch?

&lt;p&gt;Well, a branch in git is just a pointer to a commit.  If you look at &quot;gitk&quot; now, you&#39;ll see a new green label on the same commit as &quot;&lt;tt&gt;remotes/trunk&lt;/tt&gt;&quot; called &quot;&lt;tt&gt;localbranch&lt;/tt&gt;&quot;.  They&#39;re like little &quot;post-it&quot; notes - with a new enough &lt;tt&gt;gitk&lt;/tt&gt; you can pepper your history with them wherever you like with a click and then typing the name in.  They generally don&#39;t form a part of the permanent history - it&#39;s the actual commits, the changes to the code, that are the history.
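
&lt;p&gt;If you want to convince yourself that a branch really is just a pointer, here is a minimal sketch in a throwaway repository.  It assumes a git new enough to use the space-separated &lt;tt&gt;git checkout&lt;/tt&gt; spelling; the temporary directory and the &quot;demo&quot; identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Throwaway identity so the example commit works on any machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
git commit -q --allow-empty -m "first commit"

before=$(git rev-list --all --count)
git checkout -q -b localbranch
after=$(git rev-list --all --count)

# Both names resolve to the same commit ID...
git rev-parse HEAD localbranch
# ...and making the branch did not add a commit to the history.
echo "commits before=$before after=$after"
```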

&lt;h3&gt;Making changes to your local branch&lt;/h3&gt;

&lt;p&gt;Once you have some edits you want to commit, you can use &lt;tt&gt;git-commit&lt;/tt&gt; to commit them.  Nothing (not even file changes) gets committed by default; you&#39;ll probably find yourself using &lt;tt&gt;git-commit -a&lt;/tt&gt; to get similar semantics to &lt;tt&gt;svn commit&lt;/tt&gt;.

&lt;p&gt;This is because git has a powerful concept called the &lt;em&gt;staging area&lt;/em&gt; (old name: &quot;index&quot;, hence commands like &lt;tt&gt;git-update-index&lt;/tt&gt;), which is where you can prepare your changes before you actually save the commit.

&lt;pre&gt;$ &lt;b&gt;vi CREDITS&lt;/b&gt;
$ &lt;b&gt;git-commit -a&lt;/b&gt;
committed tree 6b513546099f01826c5cc7bc25042d00bc2560b0
$ &lt;/pre&gt;
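
&lt;p&gt;To see the staging area at work, here is a small sketch in a scratch repository: two files are edited, but only the one that was explicitly added goes into the first commit.  The file contents and the &quot;demo&quot; identity are invented for the example, and it uses the space-separated command spelling of newer git releases.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# tee writes each file (and echoes the text as it does so).
echo "credits v1" | tee CREDITS
echo "readme v1" | tee README
git add CREDITS README
git commit -q -m "initial"

# Edit both tracked files, but stage only one of them.
echo "credits v2, with one more name" | tee CREDITS
echo "readme v2, with one more line" | tee README
git add CREDITS
git commit -q -m "update CREDITS only"

# The README edit was never staged, so it is still pending.
left=$(git status --short)
echo "still uncommitted: $left"

# With -a, every edit to a tracked file is staged and committed.
git commit -q -a -m "update README too"
git status --short
```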

&lt;p&gt;Interactive commit is not really there, unless you use Cogito&#39;s &lt;tt&gt;cg-commit -p&lt;/tt&gt; with &lt;a href=&quot;http://utsl.gen.nz/gitweb/?p=cogito;a=commitdiff;h=3be47bdb&quot;&gt;this patch&lt;/a&gt; (which fixes the way that editing a patch for a change before it gets committed works).  Normally people just use the staging area functionality, though.  This is certainly one area where SVK&#39;s UI is better than &lt;tt&gt;git-core&lt;/tt&gt;.  But UI sophistication is usually only a temporary problem, especially for a project with as much energy pouring into it as git.

&lt;b&gt;Update:&lt;/b&gt; oh, dear, I&#39;m just &lt;a href=&quot;http://repo.or.cz/w/git.git?a=commit;h=5cde71d64aff03d305099b4d239552679ecfaab6&quot;&gt;so out of touch&lt;/a&gt;.

&lt;h3&gt;Correcting changes in your local branch&lt;/h3&gt;

&lt;p&gt;Did you mess up a change?  Commit something poorly?  Well, no worries, there are lots of ways to fix it.

&lt;p&gt;Again, we&#39;re diverging from things that SVK supports well, but I think they&#39;re important to get a taste for how things are different. According to one source, lack of support in SVK for this is a &quot;philosophical&quot; stance.  I really don&#39;t understand this - I make mistakes all the time and it&#39;s better that I correct the ones I catch early so other people don&#39;t waste their time on them.

&lt;p&gt;If it&#39;s the top commit, you can just add &lt;tt&gt;--amend&lt;/tt&gt; to your regular &lt;tt&gt;git-commit&lt;/tt&gt; command to, well, amend the last commit.

&lt;p&gt;You can also &lt;em&gt;uncommit&lt;/em&gt;.  It&#39;s such a crude thing to do that there isn&#39;t a command for it (if you add the cogito wrappers, you can use &lt;tt&gt;cg-admin-uncommit&lt;/tt&gt;).

&lt;pre&gt;$ &lt;b&gt;git-update-ref refs/heads/localbranch HEAD~1&lt;/b&gt;
$ &lt;/pre&gt;

&lt;p&gt;&lt;tt&gt;HEAD~1&lt;/tt&gt; is a special syntax that means &quot;one commit before the reference called &lt;tt&gt;HEAD&lt;/tt&gt;&quot;.  I could have also put a complete revision number, a partial (non-ambiguous) revision number, or something like &lt;tt&gt;remotes/trunk&lt;/tt&gt;.  See &lt;cite&gt;git-rev-parse(1)&lt;/cite&gt; for the full list of ways in which you can specify revisions.

&lt;p&gt;And just like that, your most recent commit was unlinked.  If it really was garbage, that was what you wanted.  Actually, it isn&#39;t completely gone;

&lt;pre&gt;$ &lt;b&gt;git-fsck&lt;/b&gt;
dangling commit 2ef718cf5434eeb8fdec74e69968f64fadd28761
$ &lt;/pre&gt;

&lt;p&gt;If you wanted, you could see it with, eg, &lt;tt&gt;gitk 2ef718&lt;/tt&gt;.  I sometimes write commands like `&lt;tt&gt;gitk --all `git-fsck | awk &#39;/dangling commit/ {print $3}&#39;`&lt;/tt&gt;&#39; to see all the commits in the repository, not just the ones with &quot;post-it notes&quot; (aka references) stuck to them.

&lt;p&gt;But that aside, uncommitting really is a primitive mode of operation, and you&#39;d probably end up getting confused by the fact that &lt;tt&gt;git-update-ref&lt;/tt&gt; didn&#39;t change the staging area.  This is because &lt;tt&gt;git-update-ref&lt;/tt&gt; is a &lt;em&gt;plumbing&lt;/em&gt; command; it does one thing, and does it quickly and well.  Commands like &lt;tt&gt;git-commit&lt;/tt&gt; are considered &lt;em&gt;porcelain&lt;/em&gt; - that is, designed for user interface.  So, the technical name for the above dangling commit is &lt;em&gt;spillage&lt;/em&gt;.  This analogy doesn&#39;t seem to extend far enough to make &lt;tt&gt;git-prune&lt;/tt&gt; (which would delete that commit) called something like &lt;tt&gt;git-flush&lt;/tt&gt; or &lt;tt&gt;git-pull-chain&lt;/tt&gt;, however.
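
&lt;p&gt;The confusing part is easy to reproduce.  The sketch below, in a scratch repository with an invented file and identity, moves the branch back with &lt;tt&gt;git update-ref&lt;/tt&gt; and then shows that the staging area still holds the content of the commit that was just unlinked.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

echo "one" | tee foobar.c
git add foobar.c
git commit -q -m "first"
git checkout -q -b localbranch
echo "two, a longer line" | tee foobar.c
git commit -q -a -m "second"
second=$(git rev-parse HEAD)

# Plumbing: move the branch back one commit and touch nothing else.
git update-ref refs/heads/localbranch HEAD~1

# Only "first" is left on the branch...
git --no-pager log --oneline
# ...but the index still differs from it, so foobar.c shows up as staged.
git diff --cached --name-only

# The unlinked commit stays in the object store until it is pruned.
git cat-file -t "$second"
```

&lt;p&gt;The porcelain equivalent in current git is &lt;tt&gt;git reset&lt;/tt&gt;, which can move the branch and bring the staging area along with it.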

&lt;p&gt;Git&#39;s more of a toolkit for writing VCS than a VCS in its own right, and if you&#39;re the sort of person who doesn&#39;t like too many commands, then try Cogito - its simplified feature set is much easier for beginners.  However, I&#39;m not one of those people - git is like that little engine that one day you realised you could take apart completely yourself and understand what each part does (at least in principle).  But I still prefer Cogito commands much of the time.

&lt;p&gt;So, anyway, there are other tools for revising commits, and the king of patch revisioning is &lt;a href=&quot;http://www.procode.org/stgit/&quot;&gt;Stacked Git&lt;/a&gt;.

&lt;p&gt;Say I discover a change that I actually wanted to apply three commits ago.  Assuming that I haven&#39;t sent the patches out yet, then I can just go ahead and change them; no-one need know.  In fact I can anyway, it&#39;s just that the further back the commits you change, the more antisocial the behaviour becomes.  In this scenario, we&#39;ll assume that what I&#39;m currently working on isn&#39;t finished, either - and I don&#39;t want to have to finish it first.

&lt;pre&gt;$ &lt;b&gt;stg init&lt;/b&gt;
branch &#39;localbranch&#39; initialised
$ &lt;b&gt;stg new -m &quot;WIP.&quot; new-commit&lt;/b&gt;
...
$ &lt;b&gt;stg uncommit -n 3&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;Now, &lt;tt&gt;stg uncommit&lt;/tt&gt; didn&#39;t do the same thing as &lt;tt&gt;cg-admin-uncommit&lt;/tt&gt;; specifically, it didn&#39;t unlink any patches.  They&#39;ve just moved onto the &lt;em&gt;patch stack&lt;/em&gt;, which I can jump around with using &lt;tt&gt;stg&lt;/tt&gt; commands.  First I&#39;ll extract the current patch with &lt;tt&gt;stg diff&lt;/tt&gt;, edit it, then apply it a few revisions up.

&lt;pre&gt;$ &lt;b&gt;stg diff -r /bottom &amp;gt; this_commit.patch&lt;/b&gt;
$ &lt;b&gt;vi this_commit.patch&lt;/b&gt;
$ &lt;b&gt;stg pop -n 2&lt;/b&gt;
now at patch &#39;do_something_interesting&#39;
$ &lt;b&gt;patch -p1 &amp;lt; this_commit.patch&lt;/b&gt;
patching file foobar.c
$ &lt;b&gt;stg refresh&lt;/b&gt;
$ &lt;b&gt;stg push -n 2&lt;/b&gt;
now at patch &#39;do_something_else_interesting&#39;
$ &lt;b&gt;stg commit&lt;/b&gt;
$ &lt;b&gt;stg push&lt;/b&gt;
now at patch &#39;new-commit&#39;
$ &lt;b&gt;vi foo.c&lt;/b&gt;
$ &lt;b&gt;stg refresh -e&lt;/b&gt;
$ &lt;b&gt;stg commit&lt;/b&gt;
$ &lt;b&gt;stg clean&lt;/b&gt;
No patches applied
$ &lt;/pre&gt;

&lt;p&gt;But this isn&#39;t a tutorial on stacked git.  See the Stacked Git homepage for that.

&lt;p&gt;&quot;Another&quot; way to revise commits is to make a branch from the point a few commits ago, then make a new series of commits that is revised in the way that you want.  This is the same scenario as before.

&lt;pre&gt;$ &lt;b&gt;git-commit -a -m &quot;WIP.&quot;&lt;/b&gt;
committed tree 5ef9339c5b5bc6572b69ff61cdb1dd4af4603f0b
$ &lt;b&gt;git-checkout -b tempbranch HEAD~4&lt;/b&gt;
$ &lt;b&gt;git-cherry-pick --no-commit -r localbranch~3&lt;/b&gt;
...
$ &lt;b&gt;vi foobar.c&lt;/b&gt;
$ &lt;b&gt;git-commit -a&lt;/b&gt;
$ &lt;b&gt;git-cherry-pick -r localbranch~2&lt;/b&gt;
...
$ &lt;b&gt;git-cherry-pick -r localbranch~1&lt;/b&gt;
...
$ &lt;b&gt;git-cherry-pick --no-commit -r localbranch&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;This technique is called &lt;em&gt;rebasing&lt;/em&gt; commits.

&lt;p&gt;There are many, many ways to skin this cat.  To tell the truth a lot of them don&#39;t play well together, for example you&#39;d better remember to use &quot;&lt;tt&gt;stg clean&lt;/tt&gt;&quot; before committing with something else.  It&#39;s the old Cathedral vs. Bazaar thing.  Using Git opens the door to a bazaar of VCS tools rather than sacrificing your projects at the altar of one.  That said, these situations are usually easy enough to recover from in practice, especially by asking for help in &lt;tt&gt;#git&lt;/tt&gt; on freenode.  

&lt;h3&gt;Tracking updates to the upstream Subversion server&lt;/h3&gt;

&lt;p&gt;If you pulled from my source, you can fetch the latest Subversion revisions I&#39;ve put there using the native &lt;tt&gt;git&lt;/tt&gt; command:

&lt;pre&gt;$ &lt;b&gt;git-fetch&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;This command completes very quickly even when pulling thousands of new revisions, modulo bugs for obscure corner cases like repositories with a huge number of non-overlapping revisions.

&lt;p&gt;On the other hand, if you pulled from the Subversion Server - the slowest option above - or you are continuing to use SVK to do the real fetching (and have just run &lt;tt&gt;svk sync&lt;/tt&gt;), you can just use:

&lt;pre&gt;$ &lt;b&gt;git-svn fetch&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;If you converted the repository from your SVK depot, and you don&#39;t want to continue using SVK, then the safest thing to do is first clean out the &lt;tt&gt;git-svn&lt;/tt&gt; metadata; but look out for &lt;tt&gt;git-svn&lt;/tt&gt; updates that do this in a smarter way.

&lt;pre&gt;$ &lt;b&gt;rm -r .git/svn&lt;/b&gt;
$ &lt;b&gt;vi .git/config&lt;/b&gt;
$ &lt;/pre&gt;

&lt;p&gt;If you copied the repository somewhere else (eg, from me) via &lt;tt&gt;git-clone&lt;/tt&gt;, then you won&#39;t have any SVN metadata - just commits.  In that case, you need to rebuild your SVN metadata, using the same command as in the earlier section, but with the upstream URL:

&lt;pre&gt;$ &lt;b&gt;git-svn init -t tags -b branches -T trunk \
          https://svn.perl.org/parrot&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;You should see a stream of messages saying &quot;&lt;tt&gt;r1234 = e79d0a84830becb10f6f6d24a9e0b7e3663c2921&lt;/tt&gt;&quot; (etc) as &lt;tt&gt;git-svn&lt;/tt&gt; scans through the commits and makes its index.  After you&#39;ve rebuilt the index, the above &lt;tt&gt;git-svn fetch&lt;/tt&gt; command should do the trick.

&lt;h3&gt;Keeping your local branch up to date with Subversion updates&lt;/h3&gt;

&lt;p&gt;The recommended way to do this for people familiar with Subversion is to use &lt;tt&gt;git-svn rebase&lt;/tt&gt;.  You actually don&#39;t need to use &lt;tt&gt;git-svn fetch&lt;/tt&gt; separately; it will automatically fetch new revisions first.

&lt;pre&gt;$ &lt;b&gt;git-svn rebase&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;This command is doing something similar to the above commands that used &lt;tt&gt;git-cherry-pick&lt;/tt&gt;; it&#39;s copying the changes from one point on the revision tree to another, just like &lt;tt&gt;svk sm -Il&lt;/tt&gt;.
 
&lt;h3&gt;Pushing back to Subversion&lt;/h3&gt;

&lt;p&gt;The command to use is &lt;tt&gt;git-svn dcommit&lt;/tt&gt;.  The &lt;tt&gt;d&lt;/tt&gt; stands for delta (there used to be a &lt;tt&gt;git-svn commit&lt;/tt&gt; command that has since been renamed to &lt;tt&gt;git-svn set-tree&lt;/tt&gt; because its behaviour was considered a little surprising for first-time users).

&lt;p&gt;&lt;tt&gt;git-svn&lt;/tt&gt; won&#39;t let the server merge revisions on the fly; if there were updates since you fetched / rebased, you&#39;ll have to do that again.  People are not used to this, thinking somehow that if somebody commits something to file A, then somebody else commits something to file B, both changes should survive despite none of the people committing having a local copy with both changes.

&lt;h3&gt;Sending patches to mailing lists or RT instances&lt;/h3&gt;

&lt;p&gt;Again there are lots of ways to do this.  Let&#39;s say we&#39;ve made some changes and want to make patch files for all of the ones since trunk:

&lt;pre&gt;$ &lt;b&gt;git-format-patch remotes/trunk&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;A command like &lt;tt&gt;git-log remotes/trunk..HEAD&lt;/tt&gt; would show you the commits that this involves.  You can then take those patch files and attach them to e-mails or whatever.
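
&lt;p&gt;Here is a self-contained sketch of that workflow in a scratch repository.  A local branch called &lt;tt&gt;trunk&lt;/tt&gt; stands in for &lt;tt&gt;remotes/trunk&lt;/tt&gt;, and the file names, commit messages and identity are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git commit -q --allow-empty -m "upstream r1"
git branch trunk                  # stand-in for remotes/trunk
git checkout -q -b localbranch

echo "fix" | tee fix.txt
git add fix.txt
git commit -q -m "Fix a thing"
echo "more" | tee more.txt
git add more.txt
git commit -q -m "Add more"

# The commits that are on this branch but not on trunk...
git --no-pager log --oneline trunk..HEAD
# ...and one numbered patch file for each of them.
git format-patch trunk
```

&lt;p&gt;The last command writes &lt;tt&gt;0001-Fix-a-thing.patch&lt;/tt&gt; and &lt;tt&gt;0002-Add-more.patch&lt;/tt&gt; into the current directory.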

&lt;p&gt;If the project uses the kernel patch submission policy, which strangely enough is very similar to best practices for sending patches to usenet etc since &#39;patch&#39; was invented, then you probably don&#39;t want to use &lt;tt&gt;--attach&lt;/tt&gt;.

&lt;p&gt;If the upstream applies your patch without changes, then if you later merge, the changes shouldn&#39;t need to be re-merged.  git will notice that an identical change has been applied since the &quot;merge base&quot; and realise it has already been done.
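
&lt;p&gt;One way to check this for yourself before merging is &lt;tt&gt;git cherry&lt;/tt&gt;, which compares patches by content rather than by commit ID (so it is a different mechanism from the merge itself).  What follows is only a sketch in a throwaway repository: the branch names &lt;tt&gt;trunk&lt;/tt&gt; and &lt;tt&gt;feature&lt;/tt&gt; and the file &lt;tt&gt;hello.txt&lt;/tt&gt; are made up for the example, and the cherry-pick merely stands in for upstream applying your mailed patch unchanged.

&lt;pre&gt;set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/trunk   # call the initial branch "trunk"
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m base

# our local change, on its own branch
git checkout -q -b feature
echo hello > hello.txt
git add hello.txt
git commit -q -m add-hello

# upstream moves on, then applies our patch without changes
git checkout -q trunk
git commit -q --allow-empty -m upstream-work
git cherry-pick feature

# a leading "-" marks a commit whose change is already in trunk
git cherry trunk feature
&lt;/pre&gt;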

&lt;a name=&quot;benefits&quot;&gt;&lt;/a&gt;&lt;h2&gt;The real tangible benefits of using Git&lt;/h2&gt;

&lt;p&gt;So far we&#39;ve seen that &lt;tt&gt;git-svn&lt;/tt&gt; can do everything that we expected of SVK in order to have a much better Subversion client.  What else did we win?

&lt;p&gt;I&#39;ve already talked a bit about the fact that git is a toolkit for writing version control systems.  As a result, one huge benefit is flexibility and a wide range of tools to choose from.  Writing a tool to do something that you want is often quite a simple matter of plugging together a few core commands.  The git repository model is also simple enough that there are even alternate git implementations you can draw upon.

&lt;p&gt;I&#39;ve also talked about patch revising using stacked git, touched on rebasing, and I&#39;m sure you can read between the lines that dropping commits is also possible.

&lt;p&gt;Then there was the repository efficiency, which affects everything - the virtual memory footprint while mining information from the repository, how much data needs to be transferred during &quot;push&quot; and &quot;pull&quot; operations, and so on.

&lt;p&gt;But really, what does git win you?  For a start...

&lt;h3&gt;Publishing your changes for others to pull&lt;/h3&gt;

&lt;p&gt;You can easily publish your changes for others who are switched on to git to pull.  At a stretch, you can just throw the &lt;tt&gt;.git&lt;/tt&gt; directory on an HTTP server somewhere and publish the path.  You don&#39;t need any silly WebDAV extensions built into the web server just to share revisions.
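
&lt;p&gt;One detail worth knowing: a plain HTTP server can&#39;t list refs or packs by itself, so the repository needs the small index files that &lt;tt&gt;git update-server-info&lt;/tt&gt; writes.  A rough sketch, assuming a scratch project called &lt;tt&gt;project&lt;/tt&gt; (the names here are invented); the resulting &lt;tt&gt;project.git&lt;/tt&gt; directory is what you would copy to the web server:

&lt;pre&gt;set -e
cd "$(mktemp -d)"
git init -q project
cd project
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m base
cd ..

# a bare copy has no working tree, just the repository itself
git clone -q --bare project project.git
cd project.git

# writes info/refs and objects/info/packs for "dumb" HTTP clients
git update-server-info
ls info/refs
&lt;/pre&gt;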

&lt;p&gt;There are also sites like &lt;a href=&quot;http://repo.or.cz/&quot;&gt;repo.or.cz&lt;/a&gt; which will let anyone start a new project (or publish their fork of an existing project).

&lt;p&gt;There&#39;s also the &lt;tt&gt;git-daemon&lt;/tt&gt; for more efficient serving of repositories (at least, in terms of network use), and &lt;tt&gt;gitweb.cgi&lt;/tt&gt; to provide a visualisation of a git repository.

&lt;p&gt;This means you can...

&lt;h3&gt;Break free from the &quot;star&quot; pattern&lt;/h3&gt;

&lt;p&gt;With Subversion, everyone has to commit their changes back to the central wiki, I mean repository, to share them.  SVK claims to be &lt;em&gt;distributed&lt;/em&gt;, but this, at best, demonstrates a misunderstanding of what being distributed means.  By almost all definitions SVK merely offers &lt;em&gt;disconnected operation&lt;/em&gt;.  If I meet you in the middle of a cruise and we both have a copy of a Subversion repository, I can&#39;t easily share my local branch with you if we&#39;re both on SVK.  Doing this has come up on the SVK list before to a resounding &quot;dunno, never tried, might work...&quot;

&lt;p&gt;With Git (actually this is just as true of other distributed systems), it&#39;s &lt;em&gt;trivial&lt;/em&gt; to push and pull changes between each other.  If what you&#39;re pulling has common history then git will just pull the differences.

&lt;p&gt;So I&#39;d just copy my repository to a USB key, stick it into the target machine, then run:

&lt;pre&gt;$ &lt;b&gt;git-pull /media/usbdisk/project.git&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;Sure, a USB stick isn&#39;t as gimmicky as a peer to peer wireless protocol featuring autodiscovery.  But frankly I&#39;ll put up with that for sane branching support in the first place.

&lt;p&gt;If the person publishes their repository as described above, using &lt;cite&gt;git-daemon(1)&lt;/cite&gt;, &lt;tt&gt;http&lt;/tt&gt; or anything else that you can get your kernel to map to its VFS, then you can set it up as a &quot;remote&quot; and pull from it:

&lt;pre&gt;$ &lt;b&gt;cat &gt; .git/remotes/friend &amp;lt;&amp;lt;EOF
URL: file:///net/friend/git/project
Pull: refs/heads/*:refs/remotes/friend/*
EOF&lt;/b&gt;
$ &lt;b&gt;git-fetch friend&lt;/b&gt;
&lt;/pre&gt;

&lt;p&gt;Here we&#39;re configuring all of the &lt;tt&gt;heads&lt;/tt&gt; (aka branches) of the repository which appears at &lt;tt&gt;/net/friend/git/project&lt;/tt&gt; to appear as &lt;tt&gt;remotes/friend/XXX&lt;/tt&gt; in our repository.
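
&lt;p&gt;Newer versions of git keep the same information in &lt;tt&gt;.git/config&lt;/tt&gt; rather than in a &lt;tt&gt;.git/remotes/&lt;/tt&gt; file.  As a sketch, assuming the same &lt;tt&gt;friend&lt;/tt&gt; remote and path as above, the equivalent section would look something like this:

&lt;pre&gt;[remote "friend"]
	url = file:///net/friend/git/project
	fetch = refs/heads/*:refs/remotes/friend/*
&lt;/pre&gt;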

&lt;h3&gt;Merging works better&lt;/h3&gt;

&lt;pre&gt;$ &lt;b&gt;git-merge remotes/trunk&lt;/b&gt;
...
$ &lt;/pre&gt;

&lt;p&gt;If git&#39;s history-sensitive merging doesn&#39;t automatically resolve things like patches applied in a different order, you end up with conflicts.  The local file gets conflict markers - which might sound appalling, but the &quot;ancestor&quot;, &quot;left&quot; and &quot;right&quot; versions of the file are nearby in the staging area.  I like to use &lt;tt&gt;ediff-merge-files-with-ancestor&lt;/tt&gt; to merge, so my &lt;a href=&quot;http://utsl.gen.nz/scripts/smartmerge&quot;&gt;merge script&lt;/a&gt; handles starting this for me to make merging easy.  And I don&#39;t have to worry that breaking out of a merge will abort the whole thing and throw away work.

&lt;p&gt;No doubt some will say that SVK&#39;s UI is better because it lets you make per-file decisions as you make the merge.  I see that as an easily possible addition to the git-commit interface.  It&#39;s just I&#39;ve been quite happy to resolve using the facilities of the staging area.

&lt;p&gt;When you look at the commits you make using &lt;tt&gt;git-merge&lt;/tt&gt; in gitk, you&#39;ll see something interesting - the new commit has two lines coming back from it.  It has &lt;em&gt;two parents&lt;/em&gt;.  It&#39;s something equivalent to SVK&#39;s &lt;em&gt;merge tickets&lt;/em&gt;, except that the information doesn&#39;t become worthless when pushed back to a server.

&lt;a name=&quot;feature-branches&quot;&gt;&lt;/a&gt;&lt;h3&gt;Feature branches - the &quot;stable&quot; development model&lt;/h3&gt;

&lt;p&gt;This is an interesting one.  Some repositories, for instance the Linux kernel, run a policy such as &lt;em&gt;no commit may break the build&lt;/em&gt;.

&lt;p&gt;Because you can easily separate your repositories into stable branches, temporary branches, etc, then you can easily set up programs that only let commits through if they meet criteria of your choosing.

&lt;p&gt;You might use a &lt;em&gt;continuous integration server&lt;/em&gt; to check that no commit happens that breaks your build.  You might say that no merge can happen unless the branch added tests, and that tests pass.  You might say that commits either have to add tests or make tests pass.

&lt;p&gt;Your &quot;trunk&quot; becomes merely the point into which branches considered stable are merged.  Each of your feature branches can merge &lt;em&gt;from&lt;/em&gt; the trunk easily, which means that an immediate merge back in the other direction will involve no actual changes (and, in fact, no extra commit will be made in such a case - the head pointer will just be moved).
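
&lt;p&gt;The &quot;head pointer just moves&quot; case is easy to try out.  This is a sketch in a scratch repository with invented branch names; &lt;tt&gt;--ff-only&lt;/tt&gt; is only there so that the last merge would fail loudly if it needed a new commit.

&lt;pre&gt;set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/trunk   # call the initial branch "trunk"
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m base

# work happens on both the feature branch and the trunk
git checkout -q -b feature
git commit -q --allow-empty -m feature-work
git checkout -q trunk
git commit -q --allow-empty -m trunk-work

# the feature branch merges *from* trunk: this makes a merge commit
git checkout -q feature
git merge -q --no-edit trunk

# merging straight back involves no new commit, trunk is just moved forward
git checkout -q trunk
git merge -q --ff-only feature

# both names now point at the same commit
git rev-parse trunk feature
&lt;/pre&gt;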

&lt;p&gt;Bazaar comes with some great utilities like the Patch Queue Manager which helps show you your feature branches.  With PQM, you just create a branch with a description of what you&#39;re trying to do, make it work against the version that you branched off, and then you&#39;re done.  The branch can be updated to reflect changes in trunk, and eventually merged and closed.

&lt;h3&gt;Mirroring, resilience and distribution&lt;/h3&gt;

&lt;p&gt;Your SVN server going down doesn&#39;t kill your team&#39;s group development if people use systems like &lt;a href=&quot;http://repo.or.cz&quot;&gt;repo&lt;/a&gt; to mirror and track each other&#39;s repositories.  They just stop pushing to the published branch and push to each other for a bit.

&lt;h2&gt;Git&#39;s limitations&lt;/h2&gt;

&lt;p&gt;Of course if I didn&#39;t mention these then I&#39;d have people ranting about how I was biased and partisan etc.  But there are many shortcomings in git.

&lt;p&gt;Not least is that it doesn&#39;t support two popular styles of developing:

&lt;h3&gt;Brain melt integration development model&lt;/h3&gt;

&lt;p&gt;This is where instead of merging in patches completely, you merge bits of them in on a file-by-file basis, and expect the VCS to tell you what you did.

&lt;h3&gt;Ghetto development model&lt;/h3&gt;

&lt;p&gt;This is where you send new features into the &lt;em&gt;ghetto&lt;/em&gt; so that they can &#39;battle it out&#39;.  The last features standing get re-integrated into another branch known as the &lt;em&gt;trailer park&lt;/em&gt; to try to find a new life for themselves.

&lt;p&gt;Note that &lt;em&gt;ghetto&lt;/em&gt; is frequently called &lt;tt&gt;trunk&lt;/tt&gt;, and the &lt;em&gt;trailer park&lt;/em&gt; something like &lt;tt&gt;releng&lt;/tt&gt;.

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;We have the tools we need to break away from centralisation!  Now, we just need to convert the 10,000 projects...

&lt;a name=&quot;grafting&quot;&gt;&lt;/a&gt;&lt;h2&gt;Epilogue on history rewriting&lt;/h2&gt;

&lt;p&gt;Earlier in this article I referred to history rewriting in passing.  I include this as a pointer for the keen, but bear in mind that this falls into the class of &quot;history munging&quot;, and for various reasons is best done in the privacy of an unpublished project.

&lt;p&gt;Let&#39;s say that we have a branch (the current one) that contains all the patches that we want to move to a rebased history.

&lt;p&gt;We manually find a common commit (possibly using &lt;tt&gt;gitk&lt;/tt&gt;).  Let&#39;s say it was commit &lt;tt&gt;7cbf53525bc6387495edd574ecdb248e1e4f872a&lt;/tt&gt;, which became &lt;tt&gt;aa3e7febb0477e15257c89126d037f6f81a7974c&lt;/tt&gt;.  You&#39;d re-write that using the cogito command:

&lt;pre&gt;$ &lt;b&gt;cg-admin-rewritehist -k 7cbf53 \
    --parent-filter &quot;sed -e &#39;s/7cbf53525bc6387495edd574ecdb248e1e4f872a/aa3e7febb0477e15257c89126d037f6f81a7974c/&#39;&quot; \
    new-branch&lt;/b&gt;&lt;/pre&gt;

&lt;p&gt;That&#39;s a one-line history graft.  You now need to go through all of your refs that point to the old commit IDs and point them at the new ones.

&lt;p&gt;Be careful with this kind of history munging: you might just end up with somebody wondering why their &quot;git-pull&quot; is taking so long to negotiate which commits it has and hasn&#39;t got.



</description>


	<comments>/comp/git/git_svn_intro.html#comments</comments>

</item>

</channel>
</rss>
