Why Git Needs an Editable History

Once upon a time, I was working on a personal programming project which I kept in a public repository on GitHub when I unthinkingly committed some sensitive information into Git and pushed it to the repository.[1] It was a remarkably harebrained mistake, I admit, but these things happen to everybody sooner or later. Anyway, I didn’t notice my mistake right away, and I pushed several months’ worth of further changes to the public repository before noticing what I had done.

Naturally, I panicked. What could I do? Luckily the project was unremarkable and only useful to me, so it was unlikely anyone had noticed. However, I still needed to get rid of the data. I could take the project down of course, but the data would still be in the local repository; I wouldn’t be able to put it back up until I had somehow removed the info, not only from the current commit, but from the entire project history. What I really needed to do was go back in time and change history, but I didn’t want to lose any of the work I had completed since then.

Fortunately, I was using Git, and Git lets you do that. Git is like a movie time machine. You can go back in time, kill Hitler, and then return to the future confident that not only did you prevent the Shoah and WWII, but you did so in a way which didn’t interfere with any of the good things that have happened since then. So it’s kind of like magic. Git saved my ass is what I’m saying.[2]

Now, for some reason, there is a large contingent of people who seem to think that commands like “git commit --amend” and “git rebase” are a bad thing. These people take a fundamentalist view of project history. For them, history is inviolable. It is vitally important that every buggy, incomplete, or just plain broken commit ever made throughout the entire history of a project be preserved no matter what. Version history is a sacred pact and anything short of a perfect and complete representation of all commits ever made is little more than a lie.

This notion baffles me, yet there is a certain logic to it. If a public repository has its history changed suddenly, it will be impossible for other repositories to pull from it without changing their version history too. Squashing multiple commits into one can discard information about how a feature was developed. While it’s perfectly possible to avoid these pitfalls while retaining the ability to change repository history, these pitfalls do exist.
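To see why a rewritten public history forces everyone downstream to change theirs too, here is a toy model in Python. It is only a sketch under a simplifying assumption: each commit's ID is a SHA-1 over its parent's ID plus its content. Real Git hashes more than that (tree, author, timestamps, message), and the names `commit_id` and `build_history` are made up for illustration, but the chaining idea is the same.

```python
import hashlib


def commit_id(parent_id, content):
    """Toy commit ID: a SHA-1 over the parent's ID plus this commit's content.

    This is a simplified stand-in, not Git's actual object format.
    """
    data = f"parent {parent_id}\n{content}".encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def build_history(contents):
    """Return the commit IDs of a linear history, oldest first."""
    ids = []
    parent = ""  # the first commit has no parent
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


# The same project twice: once with the leaked password, once with it purged.
original = build_history(
    ["add app", "add config with password", "fix bug", "add feature"]
)
rewritten = build_history(
    ["add app", "add config", "fix bug", "add feature"]
)

# Compare the two histories commit by commit.
unchanged = [a == b for a, b in zip(original, rewritten)]
print(unchanged)  # [True, False, False, False]
```

The commit before the edit keeps its ID, but the edited commit and every commit after it get new IDs, even though "fix bug" and "add feature" did not change at all. Anyone who cloned the old history now holds commits that no longer exist upstream, which is exactly the pitfall described above.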

However, I can’t see why this is reason enough to abolish “git commit --amend” and “git rebase.” Even if 99% of the time “git rebase” causes more harm than good, that still leaves 1% of the time where it causes more good than harm, and it leaves open the possibility that occasionally it might be the only way to solve an important problem. That was certainly the case for me. Sure, I could have rolled back my changes and lost months’ worth of change history, or I could have started a new repository from scratch and lost all of my change history, but either of those solutions would have been worse, not better, from a history fundamentalist’s perspective than simply purging a file from the repository history.

What it really comes down to is developers assuming they know better how tools should be used than the people actually using the tools. Ultimately, the choice should always be that of the user. Don’t like “rebase”? Don’t use it. But don’t suggest its removal, because others’ needs are different from yours and you can’t anticipate every need or use case. “git commit --amend” and “git rebase” are killer features, even if they could conceivably be misused sometimes.

  1. It was a database login. I know that I could have just changed the login; I did so, and later deleted the whole database. That’s not the point. It’s quite possible that you used a similar password elsewhere, and even if you didn’t, it could still reveal information about your password creation process. It’s better to restrict that kind of information as much as possible and not get sloppy.
  2. The way to do this is documented on GitHub.

Last update: 29/08/2012
