Intro

What is Git?

Git is a distributed version control system designed for collaborative software development and used in open science. “Distributed” means that there is no central repository that stores the version of your documents/applications/code, as is the case with Google Drive, Dropbox, or other systems. In fact, there is no one master version of anything in Git: Each user and each device that is participating in a project has a complete version of all the files along with their entire history. This great video tutorial gives a very nice explanation of how the system works.

Jargon alert

Talking about Git necessarily involves quite a lot of jargon. We’ll do our best to introduce it gently, in an intelligible way.

Git vs GitHub

Let’s clear the potential confusion between Git and GitHub. While Git is the version control software, GitHub is primarily a hosting service that works on the principles of Git.

In a way, the distinction is not completely unlike that between R, the language, and RStudio, the IDE. Just as RStudio started as an environment in which you can work with R more conveniently but has since gained a ton of additional functionality, GitHub started as a hosting service for Git repositories but can now do other pretty nifty things.

GitHub (now owned by Microsoft), though very popular, is just one of several similar services. A few possible alternatives are:

GitLab

Bitbucket

SourceForge

You can read about these and more in this listicle.

Setting up

This tutorial shows you how to work with Git using the command line (console/terminal). There are graphical user interfaces (GUIs) you can use just as well but a single line can sometimes be quite a bit less overwhelming than a huge window full of incomprehensible jargon. Writing commands also helps understand the basic principles behind this system.

That said, there is nothing wrong with using GUIs and, if it works for you, you should definitely go for it!

Download and install Git

Task 1

First thing you need is a Git installation on your computer.

Now, it may well be the case that you already have Git installed. To check whether or not you do, you only need to type a simple command.

On MacOS, open the terminal (open Spotlight search with the ⌘ Command + Space shortcut, type in terminal and press ↵ Enter) and type in which git, confirming with ↵ Enter.

On Windows, open the command line (type cmd in the Windows search bar and press ↵ Enter), type in where git and press ↵ Enter again.

If you do have Git installed on your computer, the command will return a path to it. On Windows, it will look something like this:

C:\>where git
C:\Program Files\Git\cmd\git.exe

while on MacOS, the path will be different:

$ which git
/usr/bin/git    

If the command does not return a path, that means that you need to install Git.

If you are running MacOS, download and install the software from the Git website.

Windows users should install Git via Git for Windows. That way, you will also get GitBash - a command line tool that’s better to use with Git than the Windows-native command line (or PowerShell).

Unless you know what installation settings you want, you can go ahead and accept the defaults: They are pretty sensible. If you’d like a deeper dive into these settings, check out this Git installation guide.

Stay relevant!

Even if you do have it installed, it’s worth checking if you’re running a recent version1. To do this, simply run the git version command using your terminal/command line.

Create a GitHub account

Task 2

We want to be using GitHub for storing our files and so the next thing to do is sign up for a GitHub account, if you don’t already have one. Simply follow the standard sign-up procedure; it doesn’t take long.

Signing up with your personal email is better than using your institutional email as you’ll really want to hold on to all the things you’ve produced!

Configure Git

OK, now that everything is installed and you have a nice new GitHub account, you need to configure Git so that it can talk to GitHub. To do this, you need to set a few options.

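In general, there are three types of options: system (applying to every user on the computer), global (applying to all of your repos), and local (applying to a single repo).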
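When the same option is set at more than one level, the more specific level wins. The sketch below illustrates this under a few assumptions: it runs in a throwaway HOME folder so that your real settings are left alone, and the names sandbox, repo, and Annatar are made up for the example.

```shell
#!/bin/sh
set -eu

# Work in a throwaway HOME so your real Git settings stay untouched.
sandbox=$(mktemp -d)
export HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config"

# Global option: applies to all repos of this (pretend) user.
git config --global user.name "Tobias Sauron"

# Local option: applies to this one repo only.
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config --local user.name "Annatar"

global_name=$(git config --global --get user.name)
effective_name=$(git config --get user.name)
echo "global:    $global_name"
echo "effective: $effective_name"   # the local value overrides the global one
```

Running git config --list --show-origin in any repo should also tell you which file each setting comes from.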

Task 3

For the time being, setting a few global options will suffice.

Task 3.1

First of all, you need to register your GitHub user name.

To do that, open your MacOS terminal or GitBash on Windows and type in:

git config --global user.name sauron

Obviously, replace sauron with your actual GitHub user name.

If there are white spaces in your user name, use quotes:


git config --global user.name "Tobias Sauron"

Task 3.2

The second thing on the list is the email address you used for your GitHub account.

The command is pretty similar to the last one:

git config --global user.email sauron@mor.dor

The last thing we need to do is make sure that the way Git treats ends of lines in your files is configured the right way. This is important because Windows on the one hand and MacOS and Linux/Unix on the other use different special characters to denote ends of lines. While MacOS and Linux/Unix use just a new line (\n), Windows uses a carriage return followed by a new line (\r\n). Having Git set up the wrong way can lead to a lot of headaches.

Task 3.3

Now, chances are that your Git is indeed set up the right way but, just to be on the safe side, run the following.

On Windows:

git config --global core.autocrlf true

This will remove all carriage returns from your line ends before the file is sent to the repository and add them when you’re retrieving files back to your computer.

On MacOS / Linux:

git config --global core.autocrlf input

This setting will not add any carriage returns when bringing files in and will only remove those that were inputted when sending files out. This can happen if the software you are using on your Mac isn’t set up quite right.
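If you are curious what this looks like in practice, here is a small sketch. It assumes a throwaway repo (the names sandbox, repo, and crlf.txt are made up) and sets the option for that repo only, so your global configuration is not affected.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config core.autocrlf input        # set for this demo repo only

printf 'one\r\ntwo\r\n' > crlf.txt    # a file with Windows-style line ends
git add crlf.txt                      # Git may warn that CRLF will be replaced by LF

# Compare line ends in the index (i/) with those in the working directory (w/).
eol_info=$(git ls-files --eol crlf.txt)
echo "$eol_info"
```

The output should report i/lf for the staged copy and w/crlf for the file in your folder: the carriage returns were stripped on the way in, while the file on disk was left alone.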

 

And with that, we’re ready to roll!

Using Git + GitHub

Initiating repos

A repository, or repo, is a place that stores files tracked by Git as well as the history of all the changes you made to these files. It is basically just a folder with a few special files inside of a .git folder within. The details are beyond the scope of this quick guide but what’s important to understand is that there are two kinds of repositories: remote and local.

A remote repo contains a version of your files that is hosted either somewhere on a network or online. For the purposes of this tutorial, you will be hosting all your remote repos on GitHub.

A local repo sits on your computer. If you have collaborators or are working on several devices, each one will have its own independent clone of the remote repo. This is what we mean when we say that Git is a distributed version control system.

The repos, both local and remote, are equal and you don’t have to communicate with the remote repository to have access to your files: they are all contained within your local repo. Remote repos only exist because you don’t want to store your files only on your computer, want to have your files more or less synchronised across several devices, or are working on a collaborative project with others.

You can turn any folder on your computer into a local Git repo with the git init command but, given that we want to be working with GitHub, it’s easier to just go to the website and create the remote repo first and then clone it. Let’s do it then!

Task 4

Go to GitHub and click on the New button to create a new public GitHub repository. Give your repo a name, add a short description and select the “Add a README file” and “Add .gitignore” options under the “Initialize this repository with:” section.

The .gitignore file is a useful one to have in case you don’t want Git to track all the files in a folder. This will more often than not be the case. The .gitignore file contains the names of files and folders that should be ignored by the version control system. You can also use file naming patterns, such as *.Rproj, if you want Git to ignore all .Rproj files in the given repo folder on your computer.
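Here is a minimal sketch of a .gitignore pattern at work. The repo and the file names analysis.Rproj and notes.txt are made up for the example.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"

printf '*.Rproj\n' > .gitignore      # ignore every .Rproj file
touch analysis.Rproj notes.txt

git status --short                   # lists .gitignore and notes.txt only
git check-ignore -v analysis.Rproj   # shows which pattern is responsible
```

The git check-ignore command is handy when a file is being ignored and you cannot work out why.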

Finally, you can pick a license for the files/software in your repo if you care about these things.

All that’s left to do is click on the Create repository button and you will have created yourself your first repo!

Cloning

Now that you have a remote GitHub repository, you want to bring it to your computer. This is referred to as cloning the repo.

To clone a repo, use the git clone command:

git clone repo-you-are-cloning path/to/local/folder

By default, the git clone command clones the repo into a folder that’s named the same as the remote repository and puts it wherever your current position in the terminal/command line/GitBash is (let’s refer to all of these as simply the terminal).

Task 5

Let’s clone our GitHub repo!

Task 5.1

Navigate to wherever you want your new repo folder to live using the cd command in the terminal.

If you are using GitBash on Windows, you can simply go to the folder in Windows Explorer, right click inside and choose the “Git Bash here” option.

Task 5.2

Go on GitHub, click on your repo and from the Code menu, copy the URL to your repo.

Task 5.3

Clone the repo using the git clone command. If you are happy with the repo name being also your folder name, all you need is the URL. If you want to clone the repo into a folder with a different name, simply type the name in the command after the URL.

Let’s say my repo lives at https://github.com/sauron/ring.git but I want it to be cloned into a folder called one_ring in my Documents folder.

First, I navigate to Documents either with the cd command or using the GitBash for Windows right-click method. All I have to do then is run the following command:

git clone https://github.com/sauron/ring.git one_ring

You should now see that the repo folder got created in the selected location. The folder contains the README.md file, the .gitignore file, and a hidden .git folder.

Magic!

The .git folder is very important: it is, in fact, your local Git repo. It’s hidden for a reason: You don’t ever have to touch it so just leave it as it is! Certainly do not delete it unless you don’t want the repo in that folder any more.
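If you would like to try cloning without touching GitHub, the sketch below may help. It is an offline stand-in, not the real thing: a local folder called ring.git plays the part of the remote, and the names sandbox and seed are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

# Build a pretend remote with a single commit in it.
git init -q seed
cd seed
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "# ring" > README.md
git add README.md
git commit -q -m "initial commit"
cd "$sandbox"
git clone -q --bare seed ring.git    # stands in for the GitHub repo

# Clone it into a folder with a different name.
git clone -q ring.git one_ring
ls -a one_ring                       # README.md plus the hidden .git folder
```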

Basic workflow

OK, so you have successfully cloned the remote repo and it’s now sitting locally on your computer. Great stuff! Now, you can start adding, editing, and deleting files just like normal.

Before you do, however, it’s important for you to understand the basic workflow as implemented in Git. At first, it may feel a tad overwhelming but with a little practice, you get used to it and realise that it actually makes a lot of sense.

First of all, remember that there’s the remote repository and the local repository. The remote repo is hosted on GitHub, while local is on your device. Both of these contain a full history of all files in the repos and all changes that have ever been made to them.

The repositories can talk to each other via processes called git push and git fetch. You git fetch when you want to bring the latest changes made to remote over to your local. You git push when the flow of information is the other way around.

That’s not too difficult, eh?

The interaction between your work and the local repo is a little bit more involved. Have a look at the animation below:

[Animation: the Git workflow via git add, git commit, git push, and git pull]

It shows that there are actually three local units on your computer. First, you have the local repo which you already know about.

Next, there’s your working directory. That’s simply the folder with all the files that got created when you cloned the GitHub repo (remember that the local repo is just the hidden .git folder inside this folder). You treat the working directory the same way as any other folder on your computer: You add files and folders to it, edit existing files, and delete those files that are no longer needed.

Finally, there is the staging area (AKA index). This is an “invisible” virtual middleman between the working directory and the local repo and it’s used for transferring information between these two systems.

When you make a change to a file that you want to be recorded in the repo, you first add the change to the staging area. You can add as many changes as you like, either in one go or on several occasions.

Once your changes form a meaningful whole that you’d like to be treated as a single event in the history of your repo, you commit the added changes to the local repo. Commits are crucial because they form entries in the chronicle of changes.

You can go back to any individual commit that’s ever been made whenever you want.

So, the upstream flow (from your working directory to the remote repo) should look something like this: You make changes and add them to the staging area. When the changes make up a meaningful “thing” that you did, you commit them to the local repo. You can then keep working on something else and repeat this add ⇒ commit process. Once you would like to record your progress in the remote repo, you push all commits (whether there is one or ten) to the remote.

The downstream flow consists of two steps: fetching the contents of remote to local and then merging local with your working directory to actually see the updated files in your folder. However, in practice, these two steps are often completed in one go, referred to as a pull.
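The two flows can be sketched end to end without GitHub. In this example, a local folder called remote.git plays the part of the remote and two clones, laptop and desktop, play the part of two devices. All of these names are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

# A pretend remote holding one commit on the main branch.
git init -q seed
cd seed
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "# ring" > README.md
git add README.md
git commit -q -m "initial commit"
git branch -M main
cd "$sandbox"
git clone -q --bare seed remote.git

# Two independent clones of that remote.
git clone -q remote.git laptop
git clone -q remote.git desktop

# Upstream flow on the laptop: add, commit, push.
cd "$sandbox/laptop"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "One ring to rule them all" >> README.md
git add README.md                    # working directory -> staging area
git commit -q -m "add inscription"   # staging area -> local repo
git push -q origin main              # local repo -> remote

# Downstream flow on the desktop: pull (fetch + merge).
cd "$sandbox/desktop"
git pull -q
last_line=$(tail -n 1 README.md)
echo "$last_line"
```

The last command should print the line that was written on the laptop, showing that the change travelled up to the remote and back down to the other clone.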

Making changes

Task 6

Let’s get some hands-on experience with this workflow to make it feel less arcane. First of all, you need some changes.

Task 6.1

Open the README.md file and add a random line of text to it.

Task 6.2

Create a new text file, write a line or two inside, and save it as file_1.txt.

Now, let’s see what Git thinks about what just happened.

Task 6.3

Run the git status command.

You should see something like this:

On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   README.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        file_1.txt

no changes added to commit (use "git add" and/or "git commit -a")

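There are four pieces of information of interest in this output:

You are on the main branch.

Your local branch is up to date with the remote one (origin/main).

README.md has been modified but the change has not been staged.

file_1.txt is untracked, meaning that Git is not keeping an eye on it yet.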

That sounds about right. Let’s add the changes!

The command that adds (stages) changes in its basic general form is:

git add path/to/file/file_name

You can add several files at once with:

git add file1 file2 file3

You can also add an entire folder inside of your working directory and all new/modified files in it will get added:

git add folder_name

Finally, you can add everything inside your working tree (or a location within it) using the . (dot) wildcard:

git add .

Task 6.4

Add README.md to the staging area.

git add README.md

Task 6.5

Run the git status command again and see how it changed.

You should see that the second paragraph of the output now says:

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   README.md

This means that the changes to README.md have now been staged and are waiting to be committed. Notice also that file_1.txt is still untracked.
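Once the long output starts to feel familiar, git status --short gives the same information in a compact form. The sketch below recreates the situation above in a throwaway repo (the folder names are made up).

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "# ring" > README.md
git add README.md
git commit -q -m "initial commit"

echo "a random line" >> README.md    # modify a tracked file
echo "hello" > file_1.txt            # create a new, untracked file
git add README.md                    # stage only the README change

short_status=$(git status --short)
echo "$short_status"
# M  README.md     <- M in the first column: modification is staged
# ?? file_1.txt    <- ??: file is untracked
```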

 

OK, let’s commit the added change now.

To commit a staged change, in other words, to have it written into the local record, we need to run the git commit command. In its general form, it looks like this:

git commit -m "short informative description of what's in the commit" -m "Optional, longer account of the changes"

The -m option indicates that what comes after it is a message.
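To see how the two messages end up in the record, here is a small sketch in a throwaway repo. The folder names and the messages are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"

echo "# ring" > README.md
git add README.md
git commit -q -m "Add README" -m "Longer account: explains what the repo is for."

git log -1 --format=%s   # the first -m becomes the subject line
git log -1 --format=%b   # the second -m becomes the body
```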

Task 6.6

Commit the change with the totally informative message "my first commit".

git commit -m "my first commit"

The output should be something like:

[main 2765c56] my first commit
 1 file changed, 2 insertions(+), 0 deletions(-)

This first line tells you that you’ve committed a change to the main branch, what the ID of the commit is, and what message is appended to the commit. The second line tells us that you’ve modified one file, added some stuff and deleted nothing.

Task 6.7

Check git status again and see what it says now.

As you can see, the paragraph concerning README.md has disappeared and the first paragraph changed to:

On branch main
Your branch is ahead of 'origin/main' by 1 commit.
  (use "git push" to publish your local commits)

Here, Git is telling you that your local repo contains changes that the remote repo does not.

Task 6.8

Add and commit file_1.txt but do check git status after each step.

git add file_1.txt
git status
On branch main
Your branch is ahead of 'origin/main' by 1 commit.
  (use "git push" to publish your local commits)

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   file_1.txt
git commit -m "Added new file"
git status
On branch main
Your branch is ahead of 'origin/main' by 2 commits.
  (use "git push" to publish your local commits)

Good! Let’s say that you now want to update the remote repo to reflect changes you’ve made locally. To push the changes to remote (origin), use the, you guessed it, git push command. There are further arguments you should add the first time you are pushing a new branch to a repo but given that we initialised this one on GitHub and cloned it from there, doing this should not be necessary.

If it’s not working for you, use the full command:

git push -u origin main

The -u flag is short for --set-upstream. It tells Git to push your commits to the main branch (previously referred to as the master branch) of the remote (origin) repo and to remember that link, so that a plain git push or git pull knows where to go from then on. Again, you should only have to do this once per branch.
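The effect of setting the upstream can be checked offline. In this sketch, a local folder called remote.git stands in for GitHub and the repo is created with git init rather than cloned, which is exactly the situation where the full command is needed. The folder names are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare remote.git        # empty pretend remote

git init -q ring
cd ring
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
git remote add origin "$sandbox/remote.git"

echo "# ring" > README.md
git add README.md
git commit -q -m "initial commit"
git branch -M main                   # make sure the branch is called main

git push -q -u origin main           # push and remember the link

tracking=$(git status --short --branch)
echo "$tracking"
# ## main...origin/main
```

The main...origin/main part of the output shows that the local main branch is now linked to main on origin.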

Task 6.9

Push the commit, go on GitHub (refresh the page) and see that the changes have been reflected.

Pretty neat, don’t you think?

 

Task 7

While you’re on GitHub, let’s see how pulling from remote works.

Task 7.1

Use the GitHub interface to edit file_1.txt. Any change will do…

Just click on the file name inside of the repository and then on the pencil (edit) icon towards the right-hand side of the screen.

Once you’re done editing, you can add a little more informative commit message below the editor panel or just leave the default one and then click on Commit changes.

Task 7.2

Back in the terminal, run the git pull command (just like that).

What you get is a few progress messages and then something like:

From https://github.com/sauron/ring
   e81754f..e4e634e  main       -> origin/main
Merge made by the 'recursive' strategy.
 file_1.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Task 7.3

Check that the file_1.txt file on your computer now has the same changes.

Undoing things

There are many ways of backtracking your changes in Git. The various commands can get a little confusing and so, rather than give a full account of git checkout, git reset, git restore, git revert, or git put away the thesaurus2, we’ll talk about dealing with the scenarios you are most likely to encounter:

Undoing a change that has not been staged yet

Unstaging changes that have not been committed yet

Resetting the last local commit

Resetting the last remote commit

There is more than one way to achieve some of these tasks but we’ll only discuss one to keep things simple. Well, simpler.

Task 8

Let’s learn how to undo all these different kinds of changes!

Undo a change that has not been staged yet

So your cat jumped on your keyboard and somehow managed to hard-delete a file. First of all, well done Mittens, you evil genius you! Don’t worry though, there’s a simple command that allows you to discard all changes before you stage them.

What we will be doing is restoring the “worktree”. Worktree is just a synonym for your working directory. The reason why you’re being bombarded with this jargon is that it is reflected in the command and so understanding it will make the command easier to understand and use.

The command we’ll be using is:

git restore --worktree file_to_restore

You can also just use the -W flag instead of --worktree, if you wish.

Task 8.1

Delete file_1.txt, check git status, and then restore it using git restore.

After deleting, git status should give you:

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        deleted:    file_1.txt

Then, just do:

git restore --worktree file_1.txt

Alternatively, you can use the (upper case) -W flag like this:

git restore -W file_1.txt

You should now see the file back in your folder.

Unstaging not yet committed changes

The next time Mittens took a stroll across your keyboard and deleted file_1.txt, you didn’t notice and added all the changes you’d been working on with the git add . command. That means that the unwanted deletion has now been staged. You tried git restore -W file_1.txt but you got this output:

error: pathspec 'file_1.txt' did not match any file(s) known to git

There’s no need to panic, however; everything can be fixed. In fact, if you look at git status, it will tell you how to undo the change.

Task 8.2

Delete file_1.txt again and stage the deletion with git add . (dot). Then, read the output of git status and figure out how to unstage the deletion.

This is the crucial bit in the output from git status:

(use "git restore --staged <file>..." to unstage)

So let’s do that:

git restore --staged file_1.txt

Alternatively, you can use the (again, upper case) -S flag like this:

git restore -S file_1.txt

Good! Now, if you check git status again, you’ll see that, although the file is still deleted, the change is now unstaged and you can use git restore -W file_1.txt to undelete the file.

You can perform both of the steps with one command by combining the --staged and --worktree options:

git restore --staged --worktree file_1.txt

If you’re a particularly lazy4 typist, you can combine the -S and -W flag into -SW.
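Both restore scenarios can be rehearsed safely in a throwaway repo before Mittens strikes for real. The sketch below assumes Git 2.23 or newer (the version that introduced git restore); the folder names and file contents are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "precious" > file_1.txt
git add file_1.txt
git commit -q -m "add file_1.txt"

# Scenario 1: the deletion has not been staged.
rm file_1.txt
git restore --worktree file_1.txt

# Scenario 2: the deletion has been staged with git add .
rm file_1.txt
git add .
git restore --staged --worktree file_1.txt   # unstage and undelete in one go

ls                                           # file_1.txt is back
git status --short                           # prints nothing: no pending changes
```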

Resetting last local commit

OK but what if you already committed your changes before you realised the inopportune deletion of file_1.txt? What now?

Well, the answer to that question depends on where you want to get. However, no matter the state to which you wish to return, you can use the git reset command to go back to what things were like in a previous commit.

There are several ways to specify the commit to which you are resetting. Perhaps the easiest is using HEAD.

HEAD is your current commit. You can count back from HEAD using a tilde (~). So, if you want to go to the last but one commit, you can specify that as HEAD~1. Easy…

The rest of the command depends on what state you want to return to.

If you want to go back to just before the commit, with all changes already staged, you want to do a --soft reset:

git reset --soft HEAD~1

git status

...

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        deleted:    file_1.txt

If, however, you’d like to return to the moment after the changes have been made but before they were staged, you want a --mixed reset:

git reset --mixed HEAD~1
Unstaged changes after reset:
D       file_1.txt

The --mixed option is actually the default for the reset command and so you don’t have to include it if you don’t want to.

Finally, if you just want to straight up go back to before you made the change, do a… no points for guessing… --hard reset:

git reset --hard HEAD~1

And with that, the file is back in your folder!
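To compare the three kinds of reset side by side, here is a sketch that commits the same accidental deletion three times in a throwaway repo and undoes it differently each time. The folder names, file contents, and commit messages are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "# ring" > README.md
echo "precious" > file_1.txt
git add .
git commit -q -m "add files"

# Commit an accidental deletion, then do a soft reset.
rm file_1.txt
git add .
git commit -q -m "oops"
git reset -q --soft HEAD~1
soft_status=$(git status --short)    # the deletion is still staged
git commit -q -m "oops again"

# Mixed reset: the deletion is unstaged but the file is still missing.
git reset -q --mixed HEAD~1
mixed_status=$(git status --short)
git add .
git commit -q -m "oops once more"

# Hard reset: the file is back in the folder.
git reset -q --hard HEAD~1

echo "soft:  [$soft_status]"
echo "mixed: [$mixed_status]"
ls
```

In the short status format, a D in the first column marks a staged deletion and a D in the second column marks an unstaged one. Bear in mind that a hard reset also throws away any other uncommitted changes in your working directory.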

Task 8.3

Delete the file again and add and commit the change. Then, pick whichever reset you like best and try it out!

Resetting last remote commit

Knowing what you already know, resetting the remote repo to the previous commit is pretty straightforward. All you need to do is do a git reset --hard HEAD~1 on your local repo and then force a push with the command:

git push --force 

# alternatively
git push -f

HIC SVNT LEONES

Be aware that if you’re resetting the remote repo to a previous commit, you might be undoing other people’s changes! For a bit of a discussion on the topic, check out this StackOverflow post.

 

Understanding the Git workflow, git restore, git reset, and their options will go a long way towards making sure you don’t lose or mess up your files.

This may actually be a good place for a wee quiz, don’t you think?

QUESTION 1

What is the command that stages ALL changes you’ve done?

git add . | git add --all | git add -A

QUESTION 2

How do you do a soft reset to the previous commit?

git reset --soft HEAD~1

QUESTION 3

Which command records your changes into the remote repo?

git push -u origin main | git push -u origin master | git push --set-upstream origin main | git push --set-upstream origin master | git push origin main | git push origin master | git push

QUESTION 4

You accidentally deleted a file and before you noticed, you committed the change to your local repo.

What kind of reset do you need to do to get back to having the file in your folder?

--hard

QUESTION 5

How do you unstage the foo.Rmd file if you just added it, assuming you’re currently in the folder where it sits?

git reset foo.Rmd | git restore --staged foo.Rmd | git restore -S foo.Rmd

QUESTION 6

What is the Git synonym for the staging area?

index

QUESTION 7

What is the term for bringing things from the remote repo onto your computer?

git pull

QUESTION 8

Is Mittens the devil’s spawn?

Yes

Creating alternative realities

If you’re binging this guide, maybe this is the place where you want to have a little stroll and/or a cup of tea because things are about to get pretty loopy. Strap in!

A very cool feature of Git is branching. Any user, at any point, can create a branch of a repository. A branch is basically an alternative version of the repo that exists in parallel with the main or master branch and has its own history. It’s essentially an alternative reality to what’s happening on the main branch.

Imagine Dr Sauron is working on his one_ring repo from earlier. He decides that the ring should have a few features directed at the other rings of power. Instead of working on all the features in the main repo, he can create several branches:

Once the work on these features is ready, Sauron can easily bring them all to the main branch of the repo. This way, work on one sub-project will not interfere with anything else.

The thing is that, in a collaborative project, the main branch is not the branch people work on. main is supposed to be a stable thing that only gets updated with finished features, working code, or final versions of documents. The rest is done on other branches, conventionally the dev branch for development and various feature_... branches for smaller, more modular tasks.

Once the work on a branch is ready, the branch can be merged into a superordinate branch (e.g., feature into dev, dev into main). If a branch is no longer needed, it can be deleted.

Obviously, when different people create different branches and work on their things, it takes a lot of project management skills to avoid conflicts once it’s time to merge things. GUI tools, such as GitKraken, offer visual flowcharts to help you find your bearings in what can be a hot mess of alternative realities à la the cult movie Primer.5

As this is not supposed to be a comprehensive guide to Git, we’ll only introduce the branch basics. You can find a wealth of resources on more advanced branch management online, for example on the Git website.

Creating a branch with git checkout

Imagine you wrote an R package. It’s working well but you would like to add a feature to one of the functions inside. Obviously, you don’t want to start messing with the code in the main branch because it would create problems if someone tried to install the package from GitHub. That’s when you create a branch…

Creating a branch is very easy: all you need is one command.

git checkout -b branch_name

This command takes a snapshot of your local repo, creates a parallel universe, and puts you in it.

Task 9

Let’s learn how to create and manage branches with Git.

Task 9.1

Create a branch of your repo called my_branch.

git checkout -b my_branch

Notice how it now says (my_branch) above your terminal prompt. That means you’re currently in this branch.

Task 9.2

Create a file called branch.txt and write a line in it. Then, add and commit it to the branch using what you already know. Do not push to remote yet!

git add branch.txt
git commit -m "added text file into my_branch"

Switching between branches

Task 9.3

Let’s go back to the main branch, just because we can.

To do this, you use the same git checkout command but, since you are not creating a new branch, you need to omit the -b option.

git checkout main

Check your folder, branch.txt is not there. That must be sorcery, right!?

Task 9.4

Return to my_branch.

Remember, not the -bs! The branch already exists…

git checkout my_branch
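The disappearing act can be replayed in a throwaway repo. The sketch below uses the same branch and file names as the tasks above; the sandbox and repo folder names are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "# ring" > README.md
git add README.md
git commit -q -m "initial commit"
git branch -M main

git checkout -q -b my_branch         # create the branch and switch to it
echo "a line" > branch.txt
git add branch.txt
git commit -q -m "added text file into my_branch"

git checkout -q main                 # back on main: branch.txt is gone
if [ -e branch.txt ]; then seen_on_main=yes; else seen_on_main=no; fi
echo "branch.txt visible on main? $seen_on_main"

git checkout -q my_branch            # back on my_branch: it reappears
ls
```

Nothing is lost when the file vanishes: it is stored in the commit on my_branch and Git simply swaps the contents of your working directory when you switch.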

Pushing branch to GitHub

Say you want to store your branch on GitHub. What you need to do is push your commit to a remote branch but since the remote branch does not exist, you need to do the --set-upstream thingy we talked about above:

git push --set-upstream origin branch_name
# or, alternatively
git push -u origin branch_name

Remember origin is the remote repo so with this command, you are pushing your branch to the remote repo’s newly set up branch.

Task 9.5

Push my_branch to a remote branch of the same name.

git push -u origin my_branch

Task 9.6

Go to your repo on GitHub and switch to your new branch.

You should see that, next to main, it now says 2 branches. Click on main to switch to my_branch.

As you can see branch.txt is in there, sitting pretty.

Task 9.7

Switch back to main.

Bringing everything back to main

So you have been working away on your branch, completed the task you set out to do, and now is the time to update main to bring the new functionality into your R package. What you want to do is merge my_branch into main.

You can do this on your local repo and then push into remote main but we’ll talk about how to merge remote branches because it introduces an important concept of pull requests.
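For reference, the local route might look something like the sketch below. It runs in a throwaway repo with no remote attached, so the final git push is left out; the folder names are made up.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Sauron"
git config user.email "sauron@mor.dor"
echo "# ring" > README.md
git add README.md
git commit -q -m "initial commit"
git branch -M main

# Do some work on a branch.
git checkout -q -b my_branch
echo "a line" > branch.txt
git add branch.txt
git commit -q -m "added text file into my_branch"

# Merge the branch into main and tidy up.
git checkout -q main
git merge -q my_branch               # main has not moved, so this is a fast-forward
git branch -d my_branch              # -d refuses to delete unmerged branches

ls                                   # branch.txt is now on main
```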

A pull request is a request you send to the owner of a repository asking them to pull the changes you made and merge them into their branch. Remote branches on GitHub are merged via pull requests even if you’re the owner of both of the concerned branches, as is our case.

Notice how it says above the main button “my_branch had recent pushes” and it’s offering you an option to Compare & pull request.

Task 9.8

Go ahead and click on the button!

You’ll find yourself on a page where you can review the changes in the pull request and leave a message describing what the changes are.

Notice the little menu towards the top of the screen: base: main compare: my_branch

This menu lets you set up the request and pick what branch you are proposing to pull and merge into what. Here, the request is for my_branch to be merged into main, which is exactly what we want.

Task 9.9

Leave a pull request message (up to you) and then click on Create pull request.

OK, so the pull request has been created and you, as the owner of the repo that’s being merged into, can now review and either close or merge the pull request.

Task 9.10

Have a little look at the pull request review page that you just got redirected to in order to get a little familiar with it. Then, click on Merge pull request and Confirm merge.

Hopefully, you got a message saying that all went well and that my_branch can now safely be deleted. Most branches tend to have a fairly short life span. After all, they kinda lose their raison d’être once the features they contain have been merged into a higher ranking branch.

Task 9.11

Go ahead and delete my_branch.

Doing this only removed the remote branch but you still have your local one. Let’s deal with that.

Task 9.12

First of all, make sure you are currently in your main local branch and, if not, git checkout into it.

To delete a branch, use the git branch -d branch_name command.

Task 9.13

Use the command to delete my_branch from your local repo.

git branch -d my_branch

You might have got a warning that says something like:

warning: deleting branch 'my_branch' that has been merged to
         'refs/remotes/origin/my_branch', but not yet merged to HEAD.
Deleted branch my_branch (was b56c581).

This is not a problem, as you made sure everything is fine over on GitHub!

Task 9.14

Pull the changes you made on GitHub to your local repo and see that branch.txt is now there.

git pull

The output should contain info about branch.txt being added:

...

Fast-forward
 branch.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 branch.txt
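If you’d like to replay Tasks 9.12 to 9.14 without touching your real repo, here is a rough sketch. It assumes that a local bare repo (remote.git, a made-up name) can stand in for GitHub and that pushing my_branch straight onto the remote main is a fair substitute for merging the pull request. A real GitHub merge usually records a merge commit, and in this sketch the remote branch is not deleted, so treat it as an approximation.

```shell
# Sketch only: a local bare repo ("remote.git") plays the part of GitHub.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q --bare -b main remote.git   # -b needs Git 2.28 or newer
git init -q -b main work
cd work
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"

echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q -u origin main

git checkout -q -b my_branch
echo "made on my_branch" > branch.txt
git add branch.txt
git commit -q -m "Add branch.txt"
git push -q -u origin my_branch

# Stand-in for merging the pull request on GitHub: move the remote's main
# forward to my_branch without touching the local main.
git push -q origin my_branch:main

# Tasks 9.12 and 9.13: switch to main, then delete the local branch
git checkout -q main
git branch -d my_branch        # may warn "not yet merged to HEAD"

# Task 9.14: bring the merged changes into the local main
git pull
ls branch.txt
```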

Pretty cool! You can now create parallel universes. Mittens is starting to look a little Schrödinger‑nervous…

If you’d prefer a little more animated explainer, check out this demonstration of a simple GitHub pull request.

My dev branch is better than yours, damn right it’s better than yours

…it brings all the gits to the hub

As you can imagine, with several people working on multiple branches and merging them at different times, problems can sometimes arise. These conflicts happen when the system doesn’t know how to reconcile the differences between your branch and main when you merge or push to it.

Handling and resolving conflicts is an advanced topic and is outwith⁶ the scope of this document. If you want to know more, there are a ton of resources out there. You can start by reading this tutorial.

As a rule of thumb, it’s far easier to avoid conflicts by having a good workflow than to resolve them.
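Just so you can recognise a conflict when you meet one, the sketch below provokes one in a throwaway repo and then backs out of the merge without resolving anything. The file name notes.txt and its contents are made up for illustration.

```shell
# Sketch only: provoke a conflict in a disposable repo, then back out of it.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q -b main .                 # -b needs Git 2.28 or newer
git config user.name "Demo User"      # placeholder identity
git config user.email "demo@example.com"

echo "Mittens is a cat" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

# The same line is changed in two different ways on two branches
git checkout -q -b my_branch
echo "Mittens is a very fluffy cat" > notes.txt
git commit -q -am "Describe Mittens on my_branch"

git checkout -q main
echo "Mittens is a grumpy cat" > notes.txt
git commit -q -am "Describe Mittens on main"

# Git cannot pick a winner, so the merge stops and reports a conflict
if git merge my_branch; then
  merge_result="merged cleanly"
else
  merge_result="conflict"
fi
echo "$merge_result"
git status --short          # notes.txt is listed as unmerged (UU)

# Back out and return main to the state it was in before the merge
git merge --abort
```

Knowing that git merge --abort exists takes a lot of the fear out of conflicts: you can always step back and think before resolving anything.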

Taking other people’s stuff

Yet another extremely useful feature of the open source nature of GitHub is that, at a click of a button, you can copy – or fork – any public repo you come across. You can then treat the copy as any of your own repositories: you can make edits, commit them, and push them. You can even create branches of that repo and merge them back into the main branch. Finally, if you feel like you made the repo better in some way (the improvement can be as minor as fixing a typo), you can open a pull request so that the original author can merge your edits into their repo. Isn’t that cool?

As an example, imagine you are using someone’s R package (installed from GitHub) and you get an idea for a new feature. You simply fork the package’s repo, clone it onto your machine, and are ready to start working on the edit. Once you’re done, you push it back to your remote fork and open a pull request. It’s really as simple as that.
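The sketch below walks through that sequence offline. It assumes two local bare repos can play the parts of the original GitHub repo (original.git) and your fork (fork.git). All repo, branch, and file names are made up, and the Fork button and the pull request themselves only exist on GitHub.

```shell
# Sketch only: two local bare repos play the parts of the original GitHub
# repo ("original.git") and your fork of it ("fork.git").
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"

# Set the stage: an "original" repo with a single commit in it
git init -q --bare -b main original.git   # -b needs Git 2.28 or newer
git init -q -b main author
cd author
git config user.name "Package Author"     # placeholder identity
git config user.email "author@example.com"
echo "# some_package" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q "$sandbox/original.git" main
cd "$sandbox"

# 1. Fork: on GitHub this is the Fork button; here it is a bare copy
git clone -q --bare original.git fork.git

# 2. Clone your fork onto your machine
git clone -q fork.git my_copy
cd my_copy
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# 3. Work on the edit in a branch of its own
git checkout -q -b new_feature
echo "A shiny new feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"

# 4. Push the branch to your fork; the original repo is left untouched
git push -q -u origin new_feature

# 5. On GitHub, you would now open a pull request from new_feature
```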

Task 10

Try forking your favourite R package on GitHub, for instance papaja.

All you have to do is look up the repo using your favourite web search engine, go to it, and click on the Fork button in the top right-hand corner of the window.

Once you’ve done that, you’ll see that you now have a copy of the repo on your own GitHub.

Task 10.1

To practice the commands from earlier, you can now clone the repo onto your computer.

Git + GitHub with RStudio

Some time ago, RStudio introduced in-built integration of RStudio projects with Git and GitHub. It even provides its own Git GUI that you can use to perform all the operations we’ve been talking about without the need for a command line.

Task 11

Let’s use your GitHub repo to create an RStudio project.

All you need to do to turn your repo into a version-controlled RStudio project is open RStudio and create a new project. For a quick intro to RStudio projects, check out the RStudio website.

If you hadn’t yet cloned your GitHub repo onto your computer, you’d pick New Project > Version Control > Git, copy the repo URL into the box, and select where you want the repo to be cloned. Since you’ve already cloned the repo, just pick New Project > Existing Directory and select the repo folder.

Task 11.1

Once the project has been created, you should see a Git tab in your environment pane. Have a look at it.

Hopefully, knowing what you know now about the Git workflow, it shouldn’t take you long to figure out what it all means. If you’d like a little help, check out this video guide but, in short, you can stage a file by checking the box in the “Staged” column and see the status of the files in, well, the “Status” column. Hover over the icons to see what they mean.

Use the menu bar at the top of the Git tab to Commit, Pull, or Push any changes. It’s also worth checking out the visual Diff tool that shows you differences between a file and its previous version and the History window that gives you, among other things, a flow diagram of your branches.
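If it helps to connect the Git tab to what you already know, here is a rough mapping of its parts onto terminal commands, run in a throwaway repo. The file analysis.R is made up, and the pairing of buttons with commands is our approximation rather than a description of what RStudio runs behind the scenes.

```shell
# Sketch only: rough command-line counterparts of the RStudio Git tab,
# run in a disposable repo.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q -b main .                 # -b needs Git 2.28 or newer
git config user.name "Demo User"      # placeholder identity
git config user.email "demo@example.com"

echo "x <- 1" > analysis.R
git add analysis.R                    # ticking the "Staged" box
git commit -q -m "Add analysis.R"     # the Commit button

echo "y <- 2" >> analysis.R           # edit the file
git status --short                    # the "Status" column: M = modified
git --no-pager diff                   # the Diff tool
git commit -q -am "Add y"

git --no-pager log --graph --oneline --all   # the History window

# The Pull and Push buttons correspond to git pull and git push,
# which need a remote and are therefore left out of this sandbox.
```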

Task 11.2

Before you start working on your version-controlled RStudio project, it’s important to put something in your .gitignore file. This file should contain all the file types that should not be tracked by Git, such as .Rproj, .RData, and others. Fortunately, there’s already a ready-made template that you can use. Just copy the content of the file from the link into your .gitignore file.
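To see what a .gitignore file does in practice, you can try the sketch below in a throwaway repo. The four entries are a small illustrative selection of files RStudio commonly creates, not a copy of the linked template, and analysis.R is a made-up file name.

```shell
# Sketch only: a few typical RStudio entries, tried out in a disposable repo.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q -b main .                 # -b needs Git 2.28 or newer

# A short, illustrative .gitignore (the linked template is more complete)
cat > .gitignore <<'EOF'
.Rproj.user/
.Rhistory
.RData
.Ruserdata
EOF

# Create files that match the patterns, plus one that should be tracked
touch .RData .Rhistory analysis.R

git status --short                     # lists .gitignore and analysis.R only
git check-ignore -v .RData .Rhistory   # shows which pattern matched each file
```

Note that .gitignore only affects files Git isn’t tracking yet, so it’s best to set it up before your first commit in the project.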

Task 11.3

Create a simple RMarkdown file, knit it into an HTML document and use the Git tab in RStudio to push it into your remote repo.


And with that, you now know all you need to start using version control in your own projects!

Other things you can do

Hopefully, you’re now sold on the idea of using Git and GitHub (or one of the alternatives) but if you need a little more convincing, here are some additional cool things you can do.

If you haven’t heard about it yet, the Open Science Framework is an online hosting service created to facilitate and encourage open science. You can use it to pre-register your study, manage your individual or collaborative research projects, and make any or all parts of your study (data, code, documentation) publicly available.

OSF is well-integrated with GitHub so you can easily link a GitHub repo with your project and all documents will automatically sync.

For more info and a how-to guide, check out this presentation.

Make things visual with GUI

As stated at the beginning of this tutorial, the reason why we encouraged you to use the terminal while you’re finding out about the basics of Git is that it focuses your attention on that which is relevant at the moment and encourages a better understanding of the workflow.

That said, there are some fantastic pieces of software out there that offer convenient visual tools for managing your repositories. The RStudio Git tab is pretty good in and of itself but in case you’d like something a little flashier, check out GitKraken, one of the most popular Git GUIs there is. Alternatively, you can download the GitHub Desktop app and see if you like it.

Choosing to work with a GUI or the terminal is a matter of both personal preference and the task at hand so pick and mix to your heart’s content. Command line snobbery and purism are silly!

Build a website from a repo

These days, it’s really very easy to build a website. If this is something you’d like to explore, do check out this guide to websites with RStudio or this more condensed tutorial.

This very website was largely created in RStudio using the distill package.

GitHub Pages

GitHub offers free and easy web hosting and website build tools with GitHub Pages. All you have to do to use them is to create a special repo, pick a few settings, put your website documents in the repo, and GitHub Pages will take care of all the rest.

Netlify

Alternatively, you can keep your website documents in an ordinary GitHub repo and then link it with a third-party service, such as Netlify, that will automatically update the website within seconds of every git push! This is exactly how the Analysing Data website you’re currently on is managed.

GitHub Organisations

GitHub offers you the option of creating an organisation account that is not owned by a single person but whose ownership is shared by members of an institution. For example, all files on this website are hosted in a private GitHub repo owned by the SussexPsychMethods organisation account that all the authors of and contributors to the website are members of. This makes collaborating on shared projects even easier!

For more details, see the GitHub Docs section on organisational accounts.

SSH authentication

Finally, it’s worth drawing your attention to more advanced security topics. There’s nothing wrong with using your username and password for authenticating your git pushes and git pulls (provided they are securely stored!) but you should at least be aware that there’s a more secure alternative.

If this sounds interesting, have a browse in the GitHub Docs section on SSH authentication.
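To give you a taste of what’s involved, here is a minimal sketch of the two local steps: creating a key pair and pointing a repo at GitHub’s SSH address. The user name, repo name, email address, and key file name are all placeholders, and the key is written into a throwaway folder rather than ~/.ssh. Adding the public key to your GitHub account happens on the website and is covered in the docs linked above.

```shell
# Sketch only: switch a repo's remote from HTTPS to SSH. The user and repo
# names in the URLs are placeholders.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q -b main .                 # -b needs Git 2.28 or newer
git remote add origin https://github.com/your-username/your-repo.git

# Generate a key pair. In real use you would accept the default location
# in ~/.ssh and set a passphrase; here the key goes into the sandbox.
if command -v ssh-keygen >/dev/null 2>&1; then
  ssh-keygen -q -t ed25519 -C "you@example.com" -N "" -f "$sandbox/demo_key"
  cat "$sandbox/demo_key.pub"   # the public half is what you give to GitHub
fi

# Point the remote at the SSH address instead of the HTTPS one
git remote set-url origin git@github.com:your-username/your-repo.git
git remote -v
```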

Resources

There are simply too many resources freely available online to list here so here are just a few to get you started:


  1. At the time of writing, the current version of Git was 2.31.1.↩︎

  2. OK, this one isn’t real…↩︎

  3. This is starting to look pretty suspect!↩︎

  4. Efficient!↩︎

  5. This mind-boggling 2004 time-travel film was made on a shoestring budget of around $7,000!↩︎

  6. Outwith
    /aʊtwɪð/, /aʊtwɪθ/

    prep
    1. (now chiefly Scotland, Northern England) Outside; beyond; outside of.↩︎