
I'm in the camp that rebasing is unnecessary and useless for large projects. I work in a code base with 50 other devs, no one is reading the commit log line by line, there are 100 commits per day. We use JIRA and have to prefix our commits with TICKET-<ID>. If I want to see the commits for a ticket, I filter the git log on that. If I want to see the combined diff for the commits, I open up the PR in JIRA.

Squashing public commits I'm also against. People say it 'cleans' the history. I would say it erases it. Who, when, why - all gone. For what? To save a few commit lines? Like I said, our commits are already tagged with the ticket number. If you want a summary of what the changes were and all the commits that were part of it, you open the ticket, or filter the git log on the ticket. No need to move commits around in the history; they show in the same order when you apply a filter.



You've just described the problem that rebasing fixes. Because your git history is a mess you have to look up PRs in Jira rather than having your git history be something useful by itself.


What he says is that they can achieve what they want, by other means than looking at the git history by itself, and it sounds like it's less work and scales better with the number of committers.


Except for when you change from JIRA to some other project tracking tool. I think this is worth considering since project and people management trends tend to suffer from more churn than version control systems.

In my book, it's easier to auto-import sensible git logs into JIRA than to do the reverse. Then again, I'm of the opinion that git history matters, specifically for commit messages. Otherwise, you might as well create a pre-commit hook pointing to http://whatthecommit.com


...until you change your organisation tools.

If you keep your development rationale with the code, you can always change or rearrange the rest of your tooling without losing vital information. If you put your edit history in a loosely-coupled system, you've now made that system as essential to your codebase's survival as the code itself.


How does rebasing fix any of that? Rebasing changes your history, and can just as easily make a mess of your history as it can make it prettier. I use it sparingly, but I don't mind people avoiding it altogether. Rebasing fixes nothing.


It's not a mess at all, very clear actually. We enforce commit messages to be prefixed with the ticket number which makes it easy to query git log for all changes related to any PR with full history of who, when, and why each change was made.
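For anyone wondering what that kind of prefix check and ticket filter amounts to, here is a minimal Python sketch. The exact pattern (an uppercase project key, a dash, a number) and the function names `ticket_of` and `commits_for_ticket` are assumptions for illustration, not the poster's actual setup. In practice the filtering would more likely be done with something like `git log --grep`, and the check would live in a commit-msg hook.

```python
import re

# Assumed convention: the subject line starts with KEY-<number>, e.g. "PROJ-123".
TICKET_PREFIX = re.compile(r"^([A-Z][A-Z0-9]*-\d+)\b")


def ticket_of(subject):
    """Return the ticket id a commit subject is prefixed with, or None."""
    match = TICKET_PREFIX.match(subject)
    return match.group(1) if match else None


def commits_for_ticket(subjects, ticket):
    """Keep only the subjects tagged with the given ticket, in log order."""
    return [s for s in subjects if ticket_of(s) == ticket]
```

A check like `ticket_of(subject) is None` is what an enforcement hook would reject. Note that matching on the parsed id, rather than a plain substring, keeps PROJ-1 from also picking up PROJ-10.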


> no one is reading the commit log line by line, there are 100 commits per day.

I hardly ever refer to the commit log. However, I do often run `git blame` to figure out what changes were done to the code base together with a specific modification to a specific line of code. That works so much better if people have actually spent some effort to make sure related work is in a single commit.

(Note that this doesn't mean "squash every PR together" - it means rebasing before submitting a PR to make sure the commits also work as documentation. The result of that rebase might very well be several commits.)


What is history for? You already said yourself that nobody is reading it. So why not squash the whole master branch every time? History is an artifact of development and should be maintained or disposed of. It turns out history is useful for at least one thing: tracking regressions. And that's a hell of a lot easier to do if you can bisect a clean, linear history.
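To make the bisect point concrete, here is a rough Python model of a binary search over a linear history. It is a simplified sketch, not how `git bisect` is implemented (git also handles non-linear history). It assumes the oldest commit is good, the newest is bad, and that once the regression appears it stays. The name `first_bad_commit` and the `is_bad` callback are made up for the example.

```python
def first_bad_commit(commits, is_bad):
    """Find the first bad commit in a linear history, oldest to newest.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    Returns (commit, number_of_commits_tested).
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi], tests
```

With 100 commits between a known-good and a known-bad state, this needs at most 7 test runs, which is why the search is cheap as long as each commit along the way can actually be built and tested.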


People are definitely looking at history when running git blame on a file. We want to know exactly who, when and why a change was made.

Squashing down a PR to a single commit erases all of that history and we have no idea when the change was made, who made it, who to talk to, or why it was made other than it was part of a larger feature set.

It might work fine for smaller PRs, but where I am we have multiple teams each working on a task that can take months to complete with lots of testing before it's ready to merge into a release branch.


Git blame is very useful, yes. But what do you need to know? Do you need to know that Josh made a commit called "fixes" at 3pm on a Wednesday (while it was raining), or do you want to know what feature this commit was part of?

I don't argue for squashing down the whole branch into one commit. That's silly and could be throwing away valuable information. But cleaning up the history using rebasing is still possible. We use "fixup!" commits so that a simple autosquash does the right thing most of the time.
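For readers who haven't used "fixup!" commits: `git commit --fixup=<commit>` creates a commit whose subject is "fixup! " plus the target's subject, and `git rebase -i --autosquash` moves it next to that target and folds it in. Below is a simplified Python model of just the reordering step. It matches on the exact subject only, which is an assumption that glosses over git's real matching rules, and `autosquash_order` is a name made up for this sketch.

```python
FIXUP = "fixup! "


def autosquash_order(subjects):
    """Reorder commit subjects (oldest first) so each 'fixup! X' commit
    sits directly after the earlier commit whose subject is exactly X.
    Fixups with no matching target stay where they are."""
    groups = []  # each group: [target subject, fixup, fixup, ...]
    for subject in subjects:
        if subject.startswith(FIXUP):
            target = subject[len(FIXUP):]
            group = next((g for g in groups if g[0] == target), None)
            if group is not None:
                group.append(subject)
                continue
        groups.append([subject])
    return [s for g in groups for s in g]
```

So a branch like "Add parser", "Add CLI", "fixup! Add parser" ends up with the fixup sitting right after "Add parser", ready to be folded into it, while "Add CLI" is left alone.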

I've actually had developers undo over-eager squashing before and introduced them to both the reflog and interactive rebasing. They are shocked the first time they learn to "undo" a rebase, but it's a liberating experience to actually understand git.


When people talk about squashing branches, it means squashing all commits into a single commit (git merge --squash, or using the squash feature of github / gitlab / hosted-git-solution).

The reason you use blame is to find the exact commit that changed a specific line, and hope to find the rationale / decisions that led up to this change, assuming people are disciplined enough to document that in their commit messages. If not, maybe you could even ask Josh if they still remember why they changed things a certain way.

Many developers commit append-only until their feature works, which means there are a lot of noise commits in between, which they don't (know how to) clean up. Rather than having all this noise, projects opt to squash the commits instead.


My team also uses feature branches, with a squashed merge commit. They used the argument that it's easy to revert that commit.

But I didn't want to lose the detailed history either, so our squash commit message contains the complete list of squashed messages, and we keep the old feature branch alive forever on a separate remote.
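As a rough illustration of that kind of squash message, here is a small Python sketch. The layout (title, blank line, a "Squashed commits:" list) and the helper name `squash_message` are assumptions; the exact format the team uses isn't stated.

```python
def squash_message(title, messages):
    """Build a squash commit message that keeps the original subjects.

    title: subject line for the squashed commit.
    messages: subjects of the commits being squashed, oldest first.
    """
    lines = [title, "", "Squashed commits:"]
    lines += ["* " + m for m in messages]
    return "\n".join(lines)
```

This keeps the "what" of each original commit searchable in the log, though the per-commit author, timestamp, and diff are only recoverable from the preserved feature branch.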


It’s equally easy to revert a merge commit.


That's good to know, I didn't challenge the argument when it was presented.


I'm curious how your commit history looks. Doesn't it have tons of useless commits like "removing comments", "forgot semicolon" etc?

Here are Linus Torvalds' thoughts on rebasing: https://yarchive.net/comp/linux/git_rebase.html


I agree with Linus, "you really shouldn't rebase stuff that has been exposed anywhere outside of your own private tree"

I have no problem if you want to locally squash your commits. Just don't rewrite public history. It's not worth it. At our commit rate, no one is reading the combined history logs without filters first.


You never run git log on a single file or directory?


That would qualify as a filter...


Yes. And it should. It's your commit history, so it needs to be a history of your commits.

One day you'll find a commit labeled "removing comments" that breaks something because the dev palmed the trackpad while hitting the delete key and lopped off a line of code. You want that to be something you can find with a bisect.

It amazes me to see a bunch of discussion here from people arguing that one should throw away history so that some silly graph view is prettier.

History isn't meant to be pretty. It's meant to be history.


Those aren't his thoughts on rebasing, they're his thoughts on rebasing "somebody else's work".


I don’t rebase and I don’t have these sorts of commits. I do use commit --amend when I see a typo in code I just checked in, though, and git stash if I need to set my current task aside for a moment to fix or prepare something else.


Very rarely, I undo and redo a commit to fix a problem with it, in cases that are too complex for --amend.

But once history has been shared (pushed, generally), it really shouldn't be touched anymore.


Sometimes the cure is worse than the disease.



