How to merge a subfolder’s Git repo into its parent’s Git repo

For years I have kept a centralized repo of notes, code snippets, diagrams, lab code, and basically everything I have learned that I think will be valuable. I do that because I know I will want to come back to it someday and regain that knowledge quickly. I find this organization extremely helpful.

So, naturally, when I took CSCI-E95 last semester (which was awesome), I wanted whatever I learned to live inside that centralized repo as well. And of course, the code is of utmost importance.

However, the class has its own working repo to which we students submit our work. So, even if I clone that class repo into my primary repo, the class repo will always remain a distinct repo in its own right. That is, even if my folder looks like this:

root
├── labs
├── languages
├── ...
└── classes
    └── hes
        └── e95-spring-2022-adamnoto

The e95-spring-2022-adamnoto folder lives in its own repo, detached from the root folder’s repo.
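To see this separation for yourself, a minimal sketch in a throwaway sandbox can ask Git which repository each folder belongs to. The sandbox path and the e95-demo folder name are made up for illustration; the only real point is git rev-parse --show-toplevel.

```shell
# Throwaway sandbox; none of these paths come from the real repos.
sandbox=$(mktemp -d)
git init -q "$sandbox/root"
mkdir -p "$sandbox/root/classes/hes"
# A second repo nested inside the first one (hypothetical name).
git init -q "$sandbox/root/classes/hes/e95-demo"

# Ask Git for the top-level directory of the repo that owns each folder.
outer_top=$(git -C "$sandbox/root/classes/hes" rev-parse --show-toplevel)
inner_top=$(git -C "$sandbox/root/classes/hes/e95-demo" rev-parse --show-toplevel)

echo "classes/hes belongs to: $(basename "$outer_top")"
echo "e95-demo belongs to:    $(basename "$inner_top")"
```

The nested folder reports itself as its own top level, which is why the parent repo does not track its history.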

How should I merge it? …that’s what I wondered.

The easiest thing I can do is to simply remove the .git folder inside the E95 repo and commit its files into my main repo.

But then, I would lose my carefully crafted commit history:

[Image: commit history of my E95 repo]

I could also make the E95 repo a submodule of the parent repo. But that’s not desirable either, as the E95 repo is not in my “domain.” Although I believe I will always have access to the repo, or can ask to regain access if I lose it for some reason, the fact remains that I have no real control over the source repo. What if someone mistakenly force-pushed something and deleted everything, and I had no backup? My hard work might be gone forever. (Yeah, I do have a backup; but still! 😭)

So naturally, I want to:

  • Merge the E95 repo into a repo owned and controlled by me
  • Retain the commit history

I searched the internet for how to achieve this and came across this awesome post that did the trick.

Let’s do it step by step.

First, clone the source repo

Exactly as written in the post, I first cloned the source repo into /tmp, so I ended up with /tmp/e95-spring-2022-adamnoto. Then I changed into it: cd /tmp/e95-spring-2022-adamnoto.

Your repo will surely have a different name. The point is to clone it into a folder under, say, /tmp and then cd into that repo’s folder.

Also, you may want to remove the origin remote, so that you can’t push anything to the source repo by mistake.

git remote remove origin
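The clone-and-detach step can be sketched in a sandbox as follows. The upstream repo here is a hypothetical local stand-in for the class repo, and the Demo identity is only there so the commit succeeds on a machine where Git has no user configured.

```shell
sandbox=$(mktemp -d)

# Hypothetical stand-in for the class repo, with a single commit.
git init -q "$sandbox/upstream"
echo "hello" > "$sandbox/upstream/README"
git -C "$sandbox/upstream" add README
git -C "$sandbox/upstream" -c user.name=Demo -c user.email=demo@example.com \
  commit -q -m "Initial commit"

# Clone it into a working copy, as in the step above.
git clone -q "$sandbox/upstream" "$sandbox/work"
cd "$sandbox/work"

git remote                   # prints: origin
git remote remove origin     # detach the clone from where it came from
remotes_left=$(git remote)   # empty once origin is gone
echo "remaining remotes: '$remotes_left'"
```

With no remote left, an accidental git push in this clone has nowhere to go.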

Rewrite the commit history of the source repo

Next, we use git filter-branch to rewrite the commits of the source repo. I want the source repo, once merged into my primary, centralized repo, to live under classes/hes/, so that I can find e95-spring-2022-adamnoto in that subfolder of my primary repo.

To do that, I ran git filter-branch inside the E95 clone in /tmp, so that every file in every commit moves under that path:

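# The whitespace after the backslash in the sed pattern must be a literal tab
# (type Ctrl+V, then Tab). It matches the tab that precedes each path in the
# output of `git ls-files -s`, so the new prefix is inserted right before the path.
git filter-branch --index-filter \
'git ls-files -s | sed "s-\	-&classes\/hes\/e95\-spring\-2022\-adamnoto/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD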

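The command above differs slightly from the one in the original post. For example, instead of \t in the sed pattern, I inserted a real tab by pressing Ctrl+V and then the Tab key. The tab matters because git ls-files -s separates each file’s mode, hash, and stage number from its path with a tab, and the sed expression inserts the new folder prefix right after that tab.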
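A variant that avoids the literal tab, offered as a sketch rather than the command used above, lets printf produce the tab and uses | as the sed delimiter so the dashes in the path need no escaping. The sandbox repo, the main.c file, and the e95-demo folder name are all hypothetical. FILTER_BRANCH_SQUELCH_WARNING only skips the deprecation warning that newer Git versions print before running filter-branch.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause

# Hypothetical source repo with a single file at its root.
sandbox=$(mktemp -d)
git init -q "$sandbox/src"
cd "$sandbox/src"
echo 'int main(void) { return 0; }' > main.c
git add main.c
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add main.c"

# Rewrite every commit so that each path gains the new folder prefix.
git filter-branch --index-filter '
  TAB=$(printf "\t")
  git ls-files -s |
    sed "s|${TAB}|&classes/hes/e95-demo/|" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
  mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD

# List the files of the rewritten commit to confirm the move.
rewritten_paths=$(git ls-tree -r --name-only HEAD)
echo "$rewritten_paths"
```

After the rewrite, the only file in the history sits under classes/hes/e95-demo/ instead of the repo root.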

Clone the destination repo

Yes, you read that right! Although you may already have the destination repo somewhere, please make a fresh clone inside the same /tmp folder to keep things simple. By now, we should have both the source repo and the destination (target) repo inside /tmp.

Merging from the source to the target repo

It’s time to merge the source repo into the destination repo. First, cd into the folder where we cloned the destination repo earlier. Then add the source repo as a remote:

git remote add -f source /tmp/e95-spring-2022-adamnoto

Of course, adjust the path in the command above, as your source folder will likely be different. The -f flag makes Git fetch from the new remote right away, so its branches are available to merge.

Then, still within the same directory (that is, the destination repo), we run git merge:

git merge --allow-unrelated-histories source/master

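The --allow-unrelated-histories flag was needed on my version of Git. Since Git 2.9, git merge refuses to merge histories that share no common ancestor unless this flag is given; older versions merged them by default and do not recognize the flag. Likewise, if the source repo’s branch is not named master (for example, main), merge that branch instead.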

That’s it! Now we can remove the source remote with git remote remove source, and then push our repo.

In my experience, I did not even need to force-push, and I had no conflicts to resolve. Your mileage may vary, but I expect the same will hold for you.
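The merge and cleanup can be rehearsed end to end in a throwaway sandbox before touching real repos. In this sketch the two repos, their files, and the e95-demo folder name are hypothetical, the source repo is created with its files already under the target subfolder (standing in for the filter-branch rewrite), and the g helper only supplies a commit identity for machines where Git has none configured.

```shell
sandbox=$(mktemp -d)
# Helper: run git with a throwaway identity (hypothetical, demo only).
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# Source repo whose files already live under the target subfolder.
git init -q "$sandbox/src"
cd "$sandbox/src"
mkdir -p classes/hes/e95-demo
echo "lexer" > classes/hes/e95-demo/lexer.c
git add . && g commit -q -m "Add lexer"
src_branch=$(git symbolic-ref --short HEAD)   # master or main, depending on config

# Destination repo with its own, unrelated history.
git init -q "$sandbox/dest"
cd "$sandbox/dest"
echo "notes" > notes.txt
git add . && g commit -q -m "Add notes"

# Add the source as a remote, merge it, and drop the remote again.
git remote add -f source "$sandbox/src"
g merge -q --no-edit --allow-unrelated-histories "source/$src_branch"
git remote remove source

# The source commits are now part of the destination history.
merged_subjects=$(git log --format=%s -- classes/hes/e95-demo)
total_commits=$(git rev-list --count HEAD)
echo "history of the subfolder: $merged_subjects"
echo "total commits: $total_commits"
```

The destination ends up with its own commit, the source commit, and one merge commit joining them. Since the merge only adds commits on top of the destination branch, a regular push is enough.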