The Definitive Deep Dive into the .git Folder

Abstract

What’s in the .git folder? How are commits stored? How do branches work? We’ll dive deep into the objects folder, unpack commits, look at the types of DAG nodes, examine object content, and build a complete visualization of the stored content. We’ll also quickly look through Git hooks, Git config, and ref logs. Come experience the zen of git.

Summary

We'll look at every single file inside the .git folder. You'll leave here understanding how git works internally. AZ Givecamp brings volunteer developers together with charities to build free software. If you're in Phoenix, join us for the next AZ Givecamp.
We want to look at every file inside the .git folder. This folder may be hidden by default in your operating system. You may need to go to view options and choose to show hidden files and folders.
Git Explorer is a tool for visualizing the git history. Each commit adds a commit node, one or more tree nodes, and one or more blob nodes. If only there was a way to visualize these without looking at each one.
git log --oneline --graph gives us ASCII art of the commit history. Let's pivot from looking at objects to looking at branches and how branches work. Unlike git log, the visualization shows not only the red commits but also the tree nodes and blob nodes.
So far we've only worked with a local repository. What if we could push this up to a remote server? We create a bare repository, add it as a remote, and push to it. Then let's look at hooks.
One package for sharing hooks is in Node and is called git hooks. Once you install it, it automatically creates symlinks from all of the hooks inside the hooks directory to another folder. Hooks are a great way to create automation.
Next, let's look at configuration. The config file holds the settings specific to our repository. If I want to set a default branch in a different way for this repository, I could set it here. git instaweb works really well on Linux to just spin up a quick website to look at the repository.
Next, let's look at logs. Inside logs is this HEAD file. We were able to look through the objects to see all of the content in our repository. This has been a lot of fun getting to show you deep into the .git folder.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hi. Have you ever struggled with git commands? Have you ever wondered how it works behind the scenes? Have you ever wondered what all those files are in the .git folder? We'll start with an empty terminal. We'll build up some git history, and we'll look at every single file inside the .git folder. We'll visualize objects. We'll take a look at branches and tags. We'll look at logs and indexes. We'll look at temporary files and hooks. You'll leave here understanding how git works internally, and that'll help you to be much more productive with git.
Here's the part where I tell you I'm definitely going to post my slides. Well, there are no slides. This is it. You can get to my website at robrich.org and click on presentations. You can see The Definitive Deep Dive into the .git Folder here at the top. Click through to the code and you'll be able to see the code that we're going to look at today. So it is online right now. While you're here, click on about me and you'll get to this spot where we look at some of the things that I've done recently. I'm a Microsoft MVP, a Friend of Redgate, a developer advocate for Cyral, and let me tell you about AZ Givecamp. AZ Givecamp brings volunteer developers together with charities to build free software. We start Friday after work. Sunday afternoon, we deliver that completed software to the charities. Sleep is optional, caffeine provided. If you're in Phoenix, come join us for the next AZ Givecamp. Or if you'd like a Givecamp where you're connecting from, hit me up on email or Twitter and let's get a Givecamp in your neighborhood too. Some of the other things that I've done: I worked on the Gulp team in version two and version three. That was a lot of fun. And I replied to a .NET Rocks podcast episode. They read my comment on the air and they sent me a mug. Woohoo. So there's my claim to fame, my coveted .NET Rocks mug.
So let's dig into git. Our goal is to look at every file inside the .git folder.
Well, here we have an empty directory, all the way empty. So let's create a new repository here. Now when we create a new repository, we could choose to clone a repository. In this case, we're just going to create a new one: git init. When we say git init, it creates this new .git folder. This .git folder may be hidden by default in your operating system. You may need to go to view options and choose to show hidden files and folders. Now that you've got it, we've got a lot of stuff here in the .git folder already. I'm going to open up this folder inside VS Code. I'm not opening up the containing folder, I'm actually opening up the .git folder, because VS Code hides the .git folder by default and I really want to look at what's inside.
So let's create some content here in our folder. I'm going to say echo file one and redirect that to file1.txt. I now have a file1.txt. Okay, git add file1.txt, git commit -m, and I'm going to break convention: instead of calling this "initial commit", I'm going to call it "file one". Okay, git log. Now we can see this content. We have this commit. Here's the content in it.
Let's take a look at some of the things that are inside of this commit. I'm going to open up the .git folder here inside VS Code, and let's first look at objects. We have a bunch of folders inside objects, and inside each folder we have some files. What we saw here was that this commit was d32d. Let's look for that one. Well, we have a d3 folder here inside the objects folder, and yep, that's the file. Let's pop it open. It's a binary file, so it looks really weird. Well, what's inside this? This is a zlib-compressed file. So how do we uncompress it? Well, we could pipe it to gunzip, or we could pipe it to a bunch of stuff. Here's a C program that is able to do it. There's Python, or there's Perl. And so, knowing that it's just deflated, I wrote a program here called unzipper. Now, unzipper is pretty easy. It just unzips it.
Let's open the file, let's read the file, let's zlib inflate it, and print the contents of it. Okay, so I want to go after that d3 file. So let me run unzipper on .git/objects/d3, that file, and let's take a look at it. Okay, so here's that commit. It's got all of this content. There's my name and email, and here's the date, time, and time zone of this commit. It says file one. Oh, 0b9f29. I remember seeing that here: 0b9f29. Okay, so let's unzipper that one, 0b9f29. And here's the contents of that. It's a tree node, and we've got this content here that's kind of interesting.
Well, there are built-in git commands for us to do this as well. We can say git cat-file and give it a git hash. So like 0b. Oh, that wasn't descriptive enough. 0b9f. Oh yes, I need to say git cat-file and pass in an option for the type. Okay, so let's say -t 0b9f, and we can see that this is a tree node. If we go look at this one, d32d, git cat-file -t, we will see that this one is a commit node. Well, what type is that third one? What's the third one? e212. git cat-file -t e212: it is of type blob. So we see that we have commits, trees, and blobs.
Let's take a look at the contents of the commit. Okay, so d32d, but instead of saying -t, I'm going to say -p to get out the contents. Here's the contents of that file. We saw this before: the author and the committer name, email, date, time, and time zone. And it references this tree node. So let's go look at this tree node. git cat-file -p, there's that hash. Now why am I not putting in the entire hash? We only need to put in enough to make it unique, so three or four characters is enough here. That's why you'll often see git logs truncated to seven characters: we only need enough to make it unique. Okay, so when I looked at this tree node, it includes file1.txt, and here's the file permissions. It is a blob.
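The unzipper steps described above (open, read, zlib inflate, print) can be sketched in a few lines of Python. This is a minimal sketch, not the program from the talk: the function name read_loose_object is made up for illustration, and it assumes a loose (not yet packed) object and a full 40-character object id rather than an abbreviated one.

```python
import os
import zlib


def read_loose_object(git_dir, object_id):
    """Return (type, content) for a loose object in <git_dir>/objects.

    Assumption: object_id is the full hex id, and the object has not
    been moved into a pack file by garbage collection.
    """
    # Loose objects live at objects/<first 2 hex chars>/<remaining chars>.
    path = os.path.join(git_dir, "objects", object_id[:2], object_id[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    # The inflated bytes look like b"<type> <size>\0<content>".
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content
```

The returned type is what git cat-file -t reports. For blobs and commits the content is roughly what git cat-file -p prints, while tree content is binary and needs further decoding.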
And here is that blob, e2129. Hey, that's the last object that we had. git cat-file -t that one: it is of type blob. We knew that because it said blob here. But let's say -p, and we see that here is the contents of that blob: file one. Yeah, that's the contents of our file right here. Here in the folder, we put file one inside that file. Cool. So we get to see all of the objects. We have commit nodes, we have tree nodes, and we have blob nodes.
Okay, so let's commit some more things. echo file two, and we'll redirect that to file2.txt. git add file2.txt, git commit -m "file two". And now if we come in here and refresh the objects folder, we have six commits. Well, kind of. We have six nodes. We only have two commits. Each commit includes a commit node, one or more tree nodes, and one or more blob nodes. In this case, because we're only committing one file, and it's in the root directory, it only ended up being three.
If only there was a way to visualize these without looking at each one. Well, that's where I'll show you Git Explorer, and that's the code linked to this talk. Git Explorer is a tool for visualizing the git history. Once I set my git repo to the path to my repo, not the .git folder but rather the folder that includes the .git folder, I can start up this website. So here's my website. Let's push refresh, and we can see those six nodes. If I click on a node, I can see the content. Ooh, that's the blob node. Here's this node, and that's a tree node. Here's another blob node. Let me show it in alphabetical order. Let me show the blob history. And now I can kind of click through each one. Well, that's interesting. I can kind of see all of those things, but I'd really like to understand what type it is straight away. Let's show the type. Commits will be red, trees will be blue, blobs will be green. Okay, so here's a commit. It references this tree, b75. Okay, so let's click on b75. b75 has these two blobs as part of this commit.
Two files are in the repository at this commit. So we have e21, e21 is this one, and we also have 6c4, 6c4 is this one, and here's that other one. Well, we could look at it this way, and that's interesting. But what if we could do kind of a parent child type of relationship, not unlike what we see in a git log? So let's click on parent child. I'll show the tags again. Here's the commits, and let's show lines referencing them. The red ones are the commit nodes. Here's my initial commit where I commit file one, and it references this tree node that includes this one file. Here's that blob that holds that content. Then in my next commit I committed this one. Notice how it now has a parent commit where the other one didn't. The parent commit is this one, and it also has this tree node right here. This tree node says that there are two files in the folder right here. Here's one file, file one, and here's another file, file two. That's cool.
The objects folder stores all of the content in the git repository. So as I add new content into git, I get new nodes. Well, let's see when they're created. Let's echo file three to file3.txt. Now I have a new file in my working folder. Do I still have six nodes? Yep, I still have six nodes. Let's git add file3.txt. Now I've just added that to the staging area. Let's refresh, and we see that we now have seven nodes. Let's do parent child, show the type, show the lines, show the tags. And we have this new blob. This new blob isn't part of any trees, it isn't part of any commits, but it is staged in the repository. So we have this blob node. Okay, now let's git commit -m "file three". And now that we've committed it, let's refresh this again. Show the type, parent child, show the lines, show the tags. Now we have this new commit that references that blob. Well, it references this tree. This tree notes that there are three files in the repository at this point.
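The commit content described above (a tree reference, zero or more parent references, author and committer lines, a blank line, then the message) is plain text once inflated, so it can be split apart with ordinary string handling. The sketch below is illustrative only: parse_commit is a made-up helper name, and it ignores multi-line headers such as signatures.

```python
def parse_commit(text):
    """Split decoded commit content into (fields, message).

    Assumption: text is what `git cat-file -p <commit>` prints. A first
    commit has no parent line; a merge commit has more than one.
    """
    head, _, message = text.partition("\n\n")
    fields = {"parent": []}
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parent"].append(value)
        else:
            fields[key] = value
    return fields, message
```

An empty parent list corresponds to the first commit in the talk, the one that "didn't have a parent".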
And each of those files: here's one, here's two, here's three. So we don't need to duplicate the files when they haven't changed. That's cool. We saw how adding to the staging area created the blob, and how the commit created the tree node and the commit node that referenced it, all of those here in this objects folder.
So let's do some interesting things. What if we create a file inside of a folder? Let me create a new folder here, and I'm going to say this is folder. This folder is currently empty, and that's totally fine. Let's come back here and we will say echo file one into folder/file1.txt. git add folder/file1.txt, git commit -m "folder file one". And now I've created a commit with a file inside of the folder. Let's show type, parent child, show the lines, show the tags. Here's that fourth commit. It references this tree node. This tree node says I've got these three files in this folder, and I've got this other folder that is a tree node. So that other one, 0b9f, is right here, and 0b9f says, hey, I'm a folder that includes this one file. Yeah, here's file one, and file one is inside this folder. But file one is also at the root of the tree too. We still only have three blob nodes because we have three distinct files. That's pretty cool. So when we create a new commit inside of git, it's going to try to reuse the blobs as much as possible, and it will try to reuse the tree nodes.
Well, let's go change a file. Let me come in here, out in this folder, and I'm going to say file three: add some code, change some lines. Okay, back in our repository, git status, and we can tell that we have this file that's modified. Okay, git add file3.txt. Wait a minute, file three was already in the repository. Why am I adding it? Well, what I'm adding is the changes to the staging area. git status, and I now see that it is staged. git commit -m "modify file three". Okay, I've modified file three. Let's refresh our visualization. We will show type, parent child, show the lines, show the tags.
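The reason two files with identical content share one blob is that a blob's name is a hash of its content, not of its file name or path. As a minimal sketch, assuming the default SHA-1 object format (git_blob_id is a name invented here):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object id git assigns to a blob with this content.

    Assumption: the repository uses SHA-1 object names (the default).
    """
    # The hash covers a small header plus the raw bytes of the file.
    store = b"blob " + str(len(content)).encode("ascii") + b"\x00" + content
    return hashlib.sha1(store).hexdigest()


# The same bytes always hash to the same id, whatever the path is.
root_copy = git_blob_id(b"file one\n")
folder_copy = git_blob_id(b"file one\n")
print(root_copy == folder_copy)  # True
```

Because the path lives in the tree node rather than in the blob, the copy at the root and the copy inside the folder resolve to the same object.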
Here's that new commit. It modifies file three. So here we have a reference to folder. We have a reference to files one, two, and three. But file three is eca7. Here's eca7 with all of the lines in this file.
Now what if we weren't done modifying this? What if we noticed, hey, I've got some secrets that I accidentally checked into file three. So let me modify the file and remove the secrets. Maybe I need to add some extra semicolons. Now I could just say git add file3.txt. I could just create another commit. But we know that the other commit exists, and our goal is to remove the secrets. So here, let's not say just git commit -m "file three changed", but let's add --amend. By saying --amend, we are going to rewrite the previous commit. So we just created a new commit. If I say git log, let me shorten that, git log --oneline, I now have "file three changed", and it is right on top of folder.
Let's refresh our graph. Unlike git log, we now get to see not only the red commits, but also the tree nodes, and we get to see the blob nodes underneath. Here's that commit, 301f. We saw that right here, 301f, and its parent is 10f. Here the parent is 10f. There's that other commit. But notice how this commit is still there, and the secrets that I checked in are still in my repository. Here in this commit, this commit references this tree node where file three is f6. Here's this one, and that line with the secret is no longer there. But the secret is in my repository right here and right here. If I've pushed this repository into a public spot, or if I've shared this with anyone, even if it's just on my local machine, I should probably consider those secrets as exposed. The commits are still in the repository. Now, the commits will eventually get collected, but they aren't yet. As part of every command, git will start up and it'll say, hey, do I need to do some garbage collection? And if so, it'll run git gc behind the scenes.
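Why the amended-away commit lingers can be modeled as a reachability question: an object is "dangling" when no branch, tag, or HEAD can reach it by following references. The sketch below works on a toy in-memory graph, not a real repository; the node names and helper functions are invented for illustration.

```python
def reachable(graph, roots):
    """Return every node reachable from roots by following references."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def unreachable(graph, roots):
    """Nodes that exist in the object store but that no root reaches."""
    return set(graph) - reachable(graph, roots)


# Toy model of the situation after `git commit --amend`:
# both commits share a parent, but only the amended one has a branch.
toy_graph = {
    "commit_with_secret": ["commit_folder", "tree_with_secret"],
    "commit_amended": ["commit_folder", "tree_clean"],
    "commit_folder": [],
    "tree_with_secret": ["blob_secret"],
    "tree_clean": ["blob_clean"],
    "blob_secret": [],
    "blob_clean": [],
}
print(sorted(unreachable(toy_graph, roots=["commit_amended"])))
```

In a real repository the set of roots is larger than the branches alone (the reflog also keeps recent commits alive for a while), which is one reason the old commit is not collected right away.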
Now in this case, it didn't. It hasn't been long enough. But here's that detached head. Okay, so git log --oneline. Let's check out here: git checkout this one. Ooh, we are in a detached HEAD state. Well, that is an overly scary message. Did we just get into a zombie dimension? No, what it says is that git is not pointing at a branch. Git is pointing at something else. git log --oneline: HEAD doesn't have main or another branch label here.
Well, let's pivot from looking at objects to looking at branches and how branches work. So we saw the objects folder, how we have objects for commits, trees, and blobs. Here in the refs folder, we have all of the details for where we are. Here's the main branch. It's at 301f8. Okay, well, I didn't see that here. Instead of git log --oneline, let's say git log --oneline --graph. --graph will give us this ASCII art, which is kind of interesting. Let's also add --decorate, and --decorate will add branch labels. Well, in this case, we were getting the branch labels without this, because I actually have a setting turned on that shows them to me all the time. Now let's add --all, which shows us not only the commits for our current branch, but also the commits for every branch. So now we see 301, and we see that main is pointed right there. The cool part about this file is that it is just that commit hash.
Now, there is one here called HEAD. HEAD is at 18. If I wanted to move HEAD to, say, this commit, I could say HEAD is over there. Now let's do a git log --oneline --graph --decorate --all, and here's HEAD pointed at main. Well, we kind of messed up our working folder. We still only have the files as if it was right here. Let's undo that and put HEAD back where it was. Okay, HEAD is back where it was. git status, and we're still in a detached HEAD state. That means that HEAD is pointing at a commit instead of a branch.
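Since refs are plain text files, the lookup git performs for HEAD can be approximated by reading two files. HEAD holds either a commit hash (the detached case above) or a line of the form "ref: refs/heads/<name>" that points at a branch file. The sketch below is a simplification with a made-up function name, and it assumes the branch has not been moved into the packed-refs file.

```python
import os


def resolve_head(git_dir):
    """Return (ref_name, commit_hash) for the HEAD of a repository.

    ref_name is None when HEAD is detached. commit_hash is None when
    HEAD names a branch that has no loose ref file (for example a new
    repository with no commits). Assumption: packed refs are ignored.
    """
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return None, head  # detached: HEAD holds the hash directly
    ref = head[len("ref: "):]
    ref_path = os.path.join(git_dir, *ref.split("/"))
    if not os.path.exists(ref_path):
        return ref, None
    with open(ref_path) as f:
        return ref, f.read().strip()
```

Called with the path to a .git folder, this returns something like ("refs/heads/main", "<hash>") on a branch and (None, "<hash>") in a detached HEAD state.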
Let's check out a branch: git checkout -b branch. Now if we do a git log --oneline --graph --decorate --all, we see that HEAD is pointing at this branch. Let's pop open the HEAD file, and it now says the ref is refs/heads/branch. Okay, so refs, heads. Oh, we have this new file called branch, and branch has 18. Git branch names are just name tags. They point at commits, so we can move them around. Let's say git merge, file three. And now if we do git log --oneline, we see that branch moved from file two to file three. It now says 1e21. The branch file says 1e21. HEAD still points at refs/heads/branch though. That's cool. If I were to say git checkout main, now HEAD points at refs/heads/main. Main is still at 301, but HEAD points at main instead of pointing at branch.
git checkout branch, and now let's commit a new file. echo file seven, redirect that to file7.txt. git add file7.txt, git commit -m "file seven", and now let's go look at our graph and see how this works. Show types, show lines, parent child, show tags. Okay, so we've got this new commit that points at nothing. Oh no, that's main. Here's main. We've got refs/heads/main there. Here's branch, refs/heads/branch, and HEAD is there as well. We still have that dangling commit with our secrets in it. But that's how we got those things split out. We have commits going this way and commits going that way. Now if we say git log --oneline --graph --decorate --all, we can see, based on this decoration, that it goes in different directions. That's cool. git merge main, and now we can see that those go back together. In the log here we're only showing commit nodes, but we saw how we can see other nodes.
Now what if I want to pick up main and put it where branch is? Well, I can take a look at where main is. Main is right here. Where is branch? Branch is right here. So I could say I would like to just move you there. Now, we saw that bad things happened when we did that last time.
So let's do this ninja move: git update-ref refs/heads/main to point at branch. We'll look at that log, and we just picked up main and put it here. Now, that isn't a merge. It doesn't try to reconcile differences. It just literally picks this thing up and puts it over there. Do not pass go, do not collect $200. You may lose history. But with git update-ref refs/heads/main, we saw how that updated this ref, this file. That's perfect.
So now let's tag something. git tag v0.1, git log --oneline --graph --decorate --all, and now we see a tag. Well, tags are just stored in this tags folder. Here's v0.1, and it just happens to point at this commit. The marketplace can help with one files. So here's that tag, and like branches, it is just a name tag pointing to other things.
Now, so far we've only been working with our local directory. What if we could push this up to a server? Let's come over here into this folder, this server folder. This folder is empty, and let's create a new repository over here. git init was how we created a repository before, but let's add --bare in this case. Now --bare will create a repository, but it's only the contents of the .git folder. There is no .git folder in this case. That's perfect for a server, where we don't need a working directory, we don't need a checkout mechanism, we just have a server. Okay, so here in this server we have refs, we have objects, we have all of the things.
Let's add this as a remote: git remote add origin. Now why is it origin? By convention. We could call it server, we could call it foo. And what is the path to that? I could say https://github.com/ blah blah blah. Or I could point it to a UNC path, server name, share. In this case I'm just going to point it to this relative path here in my folder. So I'll go up a directory, server, and so I'm going to add that remote there. git remote -v, and I can see that there's my origin server. git push origin main, and I push all of the content from my machine to that remote machine.
And now if I do a git log --oneline --graph --decorate --all, I can see origin/main there as well. Let's look inside the .git folder at the refs, and I now have a remotes folder. Remotes has an origin folder. That's the only remote that I have. And here's main. Here's where I think the server's main is. Now, if other people contribute to this and I push and pull, then this remotes/origin/main may update. To see that, let's push the other branches that we have. git push origin branch, and we'll also push tags. Okay, branch points to branch, v0.1 points to v0.1. And so here in my folder now, in remotes/origin, I have branch and main, and all of the content is there. Now, I don't have a remote tracking branch for tags, but I do have a remote tracking branch for branch. That's kind of interesting. So we took a look at refs, and refs is how we store our name tags, our labels, our human-readable things, because, well, these git hashes are too long.
Let's next look at hooks. Here's hooks. In the hooks directory we have various files. Here's applypatch-msg.sample. We can see that it's just a shell script. It does some interesting things. Here's another one. It's a shell script. Here's another one. Ooh, this is a Perl file. That's pretty cool. Here's another shell script. Now, they're all named .sample because, well, they're just examples. If we were to remove the .sample part, then that hook would be active. Let me go reach into my stash and go grab a copy of all of these that don't have the .sample after them. In each of these files I just removed the .sample, and then I also echo the content that is coming in. What are the arguments that are passed to it? So for commit-msg, let's echo commit-msg and the arguments. These hooks are great for automating interesting tasks. For example, let's say echo file six to file6.txt. Did we get any hooks there? No, we're just echoing the file. git add file6.txt.
Let's add it to the staging area. No hooks there. git commit -m "file six". And now we get some hooks. This is perfect. We get a pre-commit hook. We get a prepare-commit-msg hook. We have a commit-msg hook. And at each step we can do certain actions. For example, at the pre-commit hook, we might want to run unit tests or linting on all of the files that are staged, or perhaps the entire project. At the prepare-commit-msg hook, we might want to validate that the commit message follows our naming conventions. Maybe we require that it references a particular issue. And then at the commit-msg hook, let's validate that all the things are there. git push origin branch: let's push this up to the remote repository, and we see that we get a pre-push hook. At the pre-push hook we could take a look at other content. We could also have hooks on the server that might trigger a remote build or something interesting there.
Now, these hooks have been really interesting in being able to automate a lot of workflows, so I want to check them into my git repository. Let me say git add .git/hooks/commit-msg. git status. We can't commit the contents of the .git folder. So how do we share the hooks? Right now the hooks only work on my machine. They don't work on your machine. There's a package here, and there are many packages to do this, but this package is in Node and is called git hooks. The cool part is, once you install git hooks, it automatically creates symlinks from all of those hooks inside the hooks directory to another folder. You can control where the folder is, but by default it is the hooks folder or githooks folder. And so now, if all of the files are here in this githooks folder, I can commit them to the repository and share them. And the cool part is, the moment that you npm install, you get those symlinks too. Now, the symlinks aren't automatic, but it's pretty cool. And there are similar packages in other libraries.
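A hook is just an executable file that git runs with some arguments, so it does not have to be a shell script. As a hedged illustration of the "require an issue reference" idea, here is what a commit-msg hook could look like in Python. Git passes the path of the file holding the message as the first argument, and a non-zero exit code aborts the commit. The PROJ-123 style pattern and the function names are assumptions made up for this sketch, not a convention from the talk.

```python
import re
import sys

# Hypothetical team convention: messages mention an issue like PROJ-123.
ISSUE_PATTERN = re.compile(r"\b[A-Z]+-\d+\b")


def message_is_valid(message):
    """True when a non-comment line of the message references an issue."""
    lines = [line for line in message.splitlines() if not line.startswith("#")]
    return any(ISSUE_PATTERN.search(line) for line in lines)


def main(argv):
    # argv[1] is the path to the file that holds the commit message.
    with open(argv[1], encoding="utf-8") as f:
        ok = message_is_valid(f.read())
    if not ok:
        print("commit message must reference an issue, e.g. PROJ-123",
              file=sys.stderr)
    return 0 if ok else 1


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

To try it, the file would be saved as commit-msg (no extension) in the hooks folder and marked executable, with a shebang line added for the local Python interpreter.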
If you're in Python, for example, you can use a Python library that does a similar thing: moving these hooks into a spot where you can commit them, and symlinking back the moment that you pip install. So we've got hooks, great ways to automate processes, whether it's unit tests or whether it's other content. It's a great way to create automation.
Next, let's look at configuration. We have this config file here. This config file holds the settings that are specific to our repository. This config file actually overrides the config file that we have in our home directory. Let's go find it. In my case, because I'm on Windows, I'm going to go to C:\Users\rob, and I'm going to open up the .gitconfig file. Here in this .gitconfig file are all of the details that apply to all repositories on my machine, and this one actually overrides another one. So if we come here to open file, in my case I'm going to go into Program Files, Git. I bet it's in gitconfig. There it is. These are the options that I chose when I installed git on my machine. If I reinstall git, it will replace this file with the options that I choose there. So this is the machine-wide one, this is the user-specific one, which overrides it, and then this is the one local to my repository, which overrides both.
So let's imagine that I'm working on a business project. When I started off, I created this user name and user email, and I put in my personal details, but in this project I want to commit with my work email. So let's override this: robrich@company.com. Now the settings for this repository are tied to my corporate account. I could override anything. I could override the things in my machine config or in my user config. Here in my user config I've identified my merge tool, my diff tool, long paths, and some shortcuts, some aliases. But here's that default branch setting, which is really cool.
If I want to set a default branch in a different way for this repository, I could set it here. Maybe I want to call it trunk. Now, that's just the default branch. That isn't the branch that I'm currently on, and it doesn't even need to exist. But that's the default branch that will be created when I create a new repository. So I'm able to override the configuration in a really elegant way. That's the config file.
There is some other configuration in this repository as well. Inside the info folder there is an exclude file. I could choose to exclude bin, obj, node_modules, venv, all of the directories that get built as part of my project. This exclude file is interesting, but this exclude file is, well, in my .git folder. So rather than messing with this configuration file, let's instead create a new file: echo bin to .gitignore. And now let's go grab this .gitignore file, and we can edit it. By editing this .gitignore file and putting all of those details there, I can actually commit this one: git add .gitignore. The .gitignore is great because I can share it, but if I want to override that specific to my machine, the info/exclude file can work out great there. Now, I'd recommend not doing it in the info/exclude file, because it's not shared. There was one gig I worked on where there was one machine that was possessed, and the reason was we had filled up the exclude file with various things, and it was only on that machine. And so yeah, as soon as we cleaned out the exclude file, everything worked out great.
Let's look at other configuration. Here we have the config, we have the exclude. We also have description. This description is the details for git instaweb. git instaweb works really well on Linux to just spin up a quick website to be able to look at the repository. It doesn't work on Windows, and really, our hosted websites have caught up and exceeded this.
So probably you're using GitHub or Bitbucket or GitLab, and so you probably don't need to use instaweb. But if you want to give your repository a description in instaweb, here's the description file.
Next, let's look at this index file. Now this index file is a good mechanism for looking at, well, wow, that's a lot of garbage. This shows the current directory. Well, how do we read this file? What is in the git index file? We can take a look at the content with git ls-files. There's also a document here that describes how that file is built. Here's a Python program called Gin that is able to parse that index file. But let's actually use that git command: git ls-files --stage. --stage gives us this extra column that tells us what stage of a merge it's in. Right now it's zero because none of them are merging. So this shows me all of the files in my current working directory and the hash of the blob for each one. We've got .gitignore, and .gitignore isn't committed yet, but it is df3c9. Let's come back to our graph and refresh it. We will show type, parent child, let's show the lines, let's show the tags. And here's df3c9. The index lists the blobs so that git can quickly check: is the hash of the file the same as the hash of the blob? If so, that file hasn't changed, and I can ignore it when I'm doing things like staging. So this index file is a great reference for all the things in my working directory. That is cool.
We also have temp files here. In this repository we have ORIG_HEAD. Here's COMMIT_EDITMSG. This is the message passed into the hook. Git writes it to disk so that it can pass it along. Here's the message that I was creating inside this commit, passed into all the hooks. Info refs: you might find a refs folder inside the info folder. In this case we don't have one. ORIG_HEAD: where was HEAD before we were about to commit this? That's kind of interesting.
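The index looks like garbage in an editor because it is a binary format, but its first twelve bytes are easy to decode: a four-byte signature "DIRC", a four-byte version number, and a four-byte count of entries, all big-endian. The sketch below reads only that header; the function name is made up, and decoding the per-file entries that follow is left out.

```python
import struct


def read_index_header(data: bytes):
    """Return (version, entry_count) from the start of a git index file.

    Assumption: data holds at least the first 12 bytes of .git/index.
    """
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a git index file")
    return version, entry_count
```

Reading the real file would look like read_index_header(open(".git/index", "rb").read()), and the entry count should match the number of lines that git ls-files --stage prints when no merge is in progress.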
Temp files just kind of kick around, and ultimately ignoring them is probably good.
Next, let's look at logs. Here's a logs folder. Inside logs is this HEAD file. We started out with commit zero, and then we created commit d32d. Then we're at d3, and we went to 185. And so we can see that this is a history of all of our commits. Let's do a git commit -m ".gitignore", then git log --oneline --graph --decorate --all. And now we've got all of this content.
Now, I have pushed branch here, but let's say, for example, I said git update-ref refs/heads/branch to over here. This is the ninja move, the do not pass go. So now if I say show me all the things, that commit is gone. Whether I deleted the branch, or whether I was in a detached HEAD state and just moved on, or whatever the reason for this, I've lost that content. Well, have I lost that content? We're looking for seven. Let's come back to our graph and refresh it. Show type, parent child, show lines, show tags. Seven is still there. There just aren't any branches pointing to it. Eventually git garbage collection will come through and delete this one, together with any of the other dangling nodes like these.
Well, how do we get back? Let's use that log file. Let's say git reflog. git reflog: show me where HEAD has been. Well, it's at 301f8ad. But it used to be at seven, and before that it was... ooh, so seven is where I want to go. Okay, git checkout seven. Oh, I didn't copy that. Let me copy it again. Now I am in a detached HEAD state, but if I look at it, I now have that commit back. git checkout -b undelete. And now HEAD is no longer detached. It's pointing at a branch, and I have that commit history back. The cool thing is that that was a great way to read this reflog. And this reflog was for HEAD. But we also have a reflog for other things. So for example, here's the reflog for branch, here's the reflog for main, here's the reflog for undelete.
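Each line in those log files records one move of a ref: the old hash, the new hash, who made the change, a timestamp with time zone, then a tab and a short message. A rough parsing sketch follows; the function name and the returned dictionary keys are choices made for this example, and the sample hashes in the usage are placeholders.

```python
def parse_reflog_line(line):
    """Parse one line of a file under .git/logs into a dictionary.

    Assumption: the line has the usual shape
    "<old> <new> <name> <email> <timestamp> <tz>\t<message>".
    """
    meta, _, message = line.rstrip("\n").partition("\t")
    old, new, rest = meta.split(" ", 2)
    # The identity may contain spaces, so peel fields off the right side.
    identity, timestamp, tz = rest.rsplit(" ", 2)
    return {
        "old": old,
        "new": new,
        "who": identity,
        "timestamp": int(timestamp),
        "tz": tz,
        "message": message,
    }


sample = ("0" * 40 + " " + "a" * 40 +
          " A U Thor <a@example.com> 1700000000 -0700\tcommit (initial): file one\n")
entry = parse_reflog_line(sample)
print(entry["new"][:7], entry["message"])
```

An old hash of all zeros marks the point where the ref was created, which matches the "commit zero" at the top of the HEAD log.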
The undelete reflog was only created moments ago. We also have reflogs for remotes, which can be really helpful. So you can parse through the reflogs to get back to commits, as long as git garbage collection hasn't come through and removed them. That's cool.
So we looked at the hooks folder to see automation. We saw the exclude file and the configuration details. We saw temporary files including COMMIT_EDITMSG and ORIG_HEAD, and the index file that allows us to quickly diff the blobs in our repository with the files in our working directory. We saw the refs folder, where we store named pointers (branches, tags, and remotes) so we can find commits in a more human-readable format. And we looked through the objects to see all of the content in our repository.
Now let's do one of those git gc runs. git gc will enumerate all the objects, count them, and pack them. What we end up with is, instead of the objects folder having lots of folders, one small pack file. Well, kind of. We have two files: one is a .pack file and one is an .idx file. Much like the index file, that .idx file is a reference to all of the content in the pack. We could unpack it, but git will do that automatically if it needs to. The visualization tool knows how to read the pack files, though. So if we refresh, show type, parent/child, show the lines, show the tags, we still see all of the commit history and, yep, that dangling one is still there. Our secrets are still baked into the repository.
One more thing that I would like to show you, which is pretty cool. Here's a reference to all of those files. We looked at each one, and if you learn more by reading than by watching, I would invite you to go to this post on Git Ready. It describes each of the files and what each one does. It's not that different from what we talked about here: we have refs, we have objects, we have hooks, we have configuration, we have temp files. It is really cool. All of this content lives inside this .git folder.
So how do I back up this repository? Well, I can just copy this .git folder to another location, or push those commits into another repository.
This has been a lot of fun, getting to show you the deep dive into the .git folder. Grab the code for Git Explorer here on GitHub; you can get to it easily by going to robrich.org/presentations and looking for The Definitive Deep Dive into the .git Folder. I'll be at the spot the conference has designated for Q&A. Or if you're watching this later, hit me up on Twitter at @robrich or by email by clicking on the email link on my site, robrich.org. Thanks for watching.

Conf42 Python 2021 - Online

May 27 2021

Rob Richardson

Developer Advocate @ Cyral
