OT: git or hg? Why?

So, colleagues… which DVCS wins for you? Why? Why would choosing the other one be a major mistake?

Tell us your tales… No votes for P4 or CVS allowed. Just git and hg, please.

Peter
OSR
@OSRDrivers

Well, that is actually a ridiculous question. Mercurial is a dead end. It has been superseded by git. hg would be an imposed legacy code base requirement, like the awful p4, not a choice.

In other words: use the git, luke.

Oh and git is pretty fabulous, although it does take a while to git the hang of it. Plus there is a wealth of resources out there for hosting your repos in the cloud, for hosting them internally, and for integrating your SCCS with code review, bug tracking, agile development, automated building and testing, and automated deployment if the spirit moves you.

I’ve used both in professional environments. The two are remarkably similar. I worked with Hg before Git, so there are some aspects of Git that annoy me.

Mark’s statements are very much in line with my opinions.

I’m not a big fan of git submodules though.

My 2p: if you don’t know what you’re doing, you’ll be doing it wrong with Git. You’ll be hating it. But then you’ll read documentation, will stop worrying and learn to love The Bloody Git.

Also, Git in Visual Studio is very handy for most of us. It doesn’t support some exotic (and possibly useful) new features, though, such as the “git worktree” command. I don’t know if it supports submodules.

Well… overstatement much? But, point taken.

I just read a quote that called hg “the Betamax of DVCSs” which I liked…

P

I moved from P4 to Git over the very bumpy road of migrating the P4 databases with GitFusion (now surely called something else).

After about six months of bitching that “Git sucks” and “Git is stupid” my brain slowly rewired to actually understand the thing and now after a few years I cannot remember how we ever got along with CVS, SVN, P4, or (shudder) ClearCase. Don’t get me wrong, I loathed ClearCase, tolerated CVS, liked SVN, and really liked P4. But they were all essentially the same thing from a “how you work with them” standpoint.

Git (no experience with Hg) really sucks if you do not first understand that you need to change how you think about VCS (if you are a classic P4 aficionado).

Git is not just a repository system; it really only seems to work well if you change how you work to meet its model of project workflow. Otherwise you fight it and, well, it wins, or you lose.

Be sure to take some time to just go learn the ways of the New World Order before trying to manage a serious project (as in a paying customer’s project). It will likely save you some headache and do-over.

Good Luck,
Dave Cattley

>I just read a quote that called hg “the Betamax of DVCSs” which I liked…

P

From what I remember… Mercurial was started as a backup plan in case things went awry with git. People were once scared by the GPL. Managers believed that the GPL could jump out of git, infect all their dear proprietary software, and crazy things would break out.

The world has changed a lot since then, crazy things indeed broke out - but GPL was not one of them. Or the least of them. Many years later, we see Microsoft feasting on Github, git built into VS, and bash baked into Windows 10 /* how they closed the deal with the exclusive owner of both Linux and git names? A mystery! */

So what happened to Hg… Most of its goodies went into git, or were transformed into user-friendly procedures (such as “Git Flow”). Some features were lost (such as sequential version numbers) but that is compensated for by the UI. Git itself has evolved. So, there’s nothing to regret. Besides a few nights spent installing and learning Mercurial))

There still is something interesting and IMHO worth seeing: Fossil. It is a very small and cute SCM, packed into just one exe file. And the whole repository is packed into a single database file. You can carry everything on a flash drive. Otherwise, it looks quite similar to git - or Mercurial. It can even be its own web server, like Mercurial. I use it for small personal projects and for quick sprints at work.

Regards,
– pa

I use a git friend, SourceTree. It’s free, it’s awesome, and if you need to integrate with a real project management system you can purchase Jira.

An incredibly better architectural feature of git (and friends) is file identity is tracked based on content (a hash) not a filename. This means you can take a tree of files, move them around, rename them, and do a commit, and git perfectly keeps the history of each file. Lots of version control systems handle file renames/directory structure changes poorly. In git, the directory structure is more an attribute of a commit. This also means you get almost perfect deduplication of identical files.

Another feature of git is often you will clone the whole repository when making a local copy. This means you have the whole history in your local machine, so you are not stuck twiddling your thumbs if you need to look at history and the central repository server is inaccessible. You also can do commits, or make branches, or whatever, without a connection to a central server, and then later push the commits when the server is accessible again. This means you can make a branch, change code, make nice bite size commits to that code, all while flying over the ocean in a plane or vacationing far away from the world (or not).

Other pluses of git and friends: Visual Studio has built-in support with a GUI. The GUI is not as good as SourceTree, except in the one case where you want to see the deltas for a Unicode file: VS transparently displays them, while SourceTree treats the file as binary, not text. Git and friends also don’t need a remote repository server; you can have local repositories. Actually, git always uses a local repository, and you may or may not sync to a remote repository. For almost every code project I make, even “experiments”, I immediately create a local git repository to track changes, which takes like 30 seconds or less of work. This allows you to snap the current state, or make a branch, try some experiment, and then discard the changes if they were a bad idea.

Even better, SourceTree (and perhaps git) has a facility called cherry picking, which means you can instantly look through the changes to your code, and revert or commit individual change blocks, not just the whole file. You can have your debug prints in your code, do a cherry pick commit, and not commit your debugging code, even though it’s still in your working copy. Cherry picking is incredibly useful when you are going along focused on change A, and for whatever reason make a change that really should be in change B. When you go to commit, you can easily split things into multiple commits, so you can keep your commits small and focused on one logical change. This is a VAST improvement over systems where you can only easily commit a whole file.

Git and SourceTree are also FAST. Like a couple years ago I checked out the Linux kernel source tree (I believe without history) in 10 minutes, over the Internet. SourceTree pretty much shows changes instantly.

And still another great architectural feature of git and friends is that each commit uses a hash on previous commit hashes, which makes a chain that can’t be modified. This means you can’t go back in the history, make a change, and have the following commit hashes match. This means git is intrinsically protected against tampering of the history, and can also easily check the integrity of the history. I’m not sure this level of integrity verification would stand up in court, as I suppose you could fake timestamps and rebuild the whole history (with different hashes), but if you have a commit hash value, that commit hash can practically speaking only be created with a specific set of commits.
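To make the chaining idea concrete, here is a minimal sketch in Python. It is a toy model, not git's actual commit format: the function names (`commit_id`, `build_chain`), the payload layout, and the placeholder tree names are all made up for illustration. The only point it tries to show is that each identifier is a hash over the commit's own data plus its parent's identifier, so editing an old commit changes every identifier after it.

```python
import hashlib


def commit_id(parent_id: str, message: str, tree: str) -> str:
    """Hash a commit's own data together with its parent's identifier.

    Toy payload layout (an assumption for illustration, not git's format).
    """
    payload = f"parent {parent_id}\ntree {tree}\n\n{message}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def build_chain(commits):
    """Return the ids for a linear history given (message, tree) pairs."""
    ids = []
    parent = ""  # the root commit has no parent
    for message, tree in commits:
        parent = commit_id(parent, message, tree)
        ids.append(parent)
    return ids


history = [
    ("initial import", "tree-a"),
    ("fix bug", "tree-b"),
    ("add feature", "tree-c"),
]
original = build_chain(history)

# Edit only the first commit's message, then rebuild the chain.
tampered = build_chain([("initial import (edited)", "tree-a")] + history[1:])

# Compare ids position by position: every one differs, not just the first.
changed = [a != b for a, b in zip(original, tampered)]
print(changed)
```

So if you hold on to the identifier of the newest commit, any rewrite of anything earlier shows up as a mismatch. It does not stop someone from rebuilding the whole chain with new identifiers, which is the caveat mentioned above.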

Git and friends do have a few things I’m not a fan of. High on the list is that there is no incrementing version or commit number. A commit identifier is a hash, and has no relationship to the previous commit identifier. This means you can’t really use source code commit identifiers as a customer-facing version identifier. I suppose you can put a commit identifier in your binary, and it will refer to the commit that matches the source code the binary was built with. You can make your own branches and tags for customer-facing version numbers. It’s sometimes a little annoying that I can’t fix a typo in a previous commit comment, but that’s just part of the architecture that assures repository integrity, so I have accepted it.

What’s odd is I was not such a fan of git when I first started to use it a tiny bit. I’m not sure I would be such a fan if I didn’t have the SourceTree interface available. Now, I better understand some of the vastly better architectural features of git, so am a huge fan. Did I also mention the git core is open source, and there are multiple clients available from multiple vendors, using the same repository format, so I think chances are very good any repository will not become inaccessible because some vendor stopped supporting a product. I would be sad if SourceTree went away.

Jan


Love your write-up, Jan. Thanks.

About this, though:

[quote]
It’s sometimes a little annoying that I can’t fix a typo in a previous commit
comment, but that’s just part of the architecture that assures repository
integrity, so I have accepted it
[/quote]

One of the things I hate about git is that the architecture does nothing to ensure repo integrity. Wanna re-write history? Change the commits that were actually done to make the process more aesthetically pleasing? No prob… it’s git.

You can re-write any commit message you want:

You can use “commit --amend” for the last commit.

For older commits, you can take out the big “rebase -i” hammer.

(I would be happier if git HAD no “rebase” command… but that’s life)

Peter
OSR
@OSRDrivers

On Mar 18, 2017, at 11:03 AM, xxxxx@osr.com wrote:

So, colleagues… which DVCS wins for you? Why? Why would choosing the other one be a major mistake?

Tell us your tales… No votes for P4 or CVS allowed. Just git and hg, please.

I was a huge fan of Mercurial, and for single-user projects, I still think it is a better solution than git.

However, as soon as you add a second programmer, Mercurial becomes a nightmare of useless and confusing merges. You spend half your time merging checkins from other programmers that aren’t even in the same directory as yours. Git is smart enough to automatically merge when the changes in the remote repository don’t impact your code.

So, for a single programmer, Mercurial. For > 1 programmer, Git.

Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

@Jan Bottorff

“An incredibly better architectural feature of git (and friends) is file identity
is tracked based on content (a hash) not a filename. This means you can take a
tree of files, move them around, rename them, and do a commit, and git perfectly
keeps the history of each file.”

That statement is not quite correct. A file’s contents are stored based on their hash. A file’s identity is tracked by means of a “rename” operation in commit records. You can have identical files in different directories of your worktree at the same time, or at different times in history; they will all be stored as a single blob in your repository, but their identity will not be linked by any means.
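A rough Python sketch of the "single blob" point, for anyone who wants to see it in miniature. This is a toy content-addressed store, not git's implementation: `ToyStore`, the paths, and the file contents are invented for illustration, and it hashes the raw contents only, which is a simplification of what git actually hashes.

```python
import hashlib


class ToyStore:
    """Toy content-addressed store: contents are keyed by their SHA-1."""

    def __init__(self):
        self.blobs = {}  # hex digest -> file contents

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.blobs[key] = data  # storing the same contents twice is a no-op
        return key


store = ToyStore()

# A "tree" here is just a mapping from path to blob key (hypothetical paths).
tree = {
    "src/license.txt": store.put(b"same text\n"),
    "docs/license.txt": store.put(b"same text\n"),
    "README": store.put(b"different text\n"),
}

# Two paths share one key, so three files occupy only two blobs.
print(tree["src/license.txt"] == tree["docs/license.txt"])
print(len(store.blobs))
```

Note that nothing in the store says the two license files are "the same file". They merely point at the same contents, which is the distinction being made above.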

Commits can be signed with GPG (which is not timestamped by a server). You can make a separate time-stamped signature for a commit by your own means.

Note that Git is using SHA-1 for which there are ways to generate collisions. This means SHA-1 may not stand in a court as a proof of untampered contents.

@Jan Bottorff

Also, for cleaning your local development history to make a clean push, “git rebase -i” is your friend.

> Note that Git is using SHA-1 for which there are ways to generate collisions. This means SHA-1 may not stand in a court as a proof of untampered contents.

Git actually uses a tuple of (sha1,length) for identification, so the collision would have to occur while keeping the file length the same, IIRC. Also, git has just started work on refactoring the codebase to make it hash-function-independent.

Mahmoud Al-Qudsi
NeoSmart Technologies

@Mahmoud Al-Qudsi

> Git actually uses a tuple of (sha1,length) for identification

Not quite. Any object is uniquely described (and fetched) by its SHA-1 name. References to tree objects in commits, and to blobs in tree objects don’t include their lengths.

But an object’s length is a part of the object header, which is included in the object hash.
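For the curious, here is a small Python sketch of how a blob's name comes out of header plus contents. The helper name `git_blob_id` is made up; the header layout (the word "blob", a space, the decimal length, a NUL byte) is how I understand git's loose object format, so treat it as an illustration rather than a reference.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute a blob's SHA-1 name from its header plus its contents."""
    # The header carries the object type and the content length in bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# The well-known id of the empty blob.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Because of the header, this is not the plain SHA-1 of the contents.
print(git_blob_id(b"") == hashlib.sha1(b"").hexdigest())  # False
```

If you have git handy, `git hash-object` on a file should give the same value as this function does on the file's bytes.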

Thanks for the clarification, Alex.