To introduce the basics of working with version control systems in R, using git, GitHub, and GitHub Desktop.
For any of us who work with data files and associated analyses and documents over a long period of time, it can become very complicated to keep track of the “latest version” of what we’re working on. This is especially true if we are collaborating with others on a project and need to share and modify these things back and forth in real time. This is a problem that software developers have been dealing with for a long time, however, and there is a robust ecosystem of “version control systems” (VCSs) out there for dealing with this problem. The basic idea behind these systems is that all of the work on a particular project is stored in a repository (or repo), and as you work on and modify files and data for the project, you “commit” your changes periodically. The VCS keeps track of what changes between commits and allows you to roll back to previous versions if need be.
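To make the commit-and-roll-back cycle concrete, here is a minimal sketch using git on the command line. The directory, the file name `notes.txt`, and the user name and email are all placeholders invented for the illustration, and the demo runs in a temporary folder so it does not touch any real project.

```shell
# Work in a throwaway directory (a hypothetical demo repo).
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Identify yourself for this repo only; the name and email are placeholders.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Record the first version of a file as a commit.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft of notes"

# Change the file and commit again (-a stages files git already tracks).
echo "second draft" > notes.txt
git commit -q -am "Revise notes"

# Show the history, newest commit first.
git log --oneline

# Roll the file back to how it looked one commit ago.
git checkout HEAD~1 -- notes.txt
cat notes.txt   # prints: first draft
```

Each commit is a snapshot you can return to, so the "latest version" question is answered by the history rather than by file names.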
You can also branch or fork a repo. A branch is a parallel line of development inside the same repo, while a fork is your own duplicate copy of the entire repo. Either way, you can work on your copy and then, later, merge your changes back into the master branch. Multiple people can each work on different branches simultaneously, and the software will take care of finding and highlighting where changes made on different branches differ or conflict so that they can be merged back in appropriately. This module will introduce you to one such system.
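As a rough illustration of working on a separate branch and merging it back, the sketch below uses git on the command line. The branch name `edit-readme`, the file contents, and the user details are hypothetical, and the starting branch may be called `master` or `main` depending on how git is configured, so the script looks the name up rather than assuming it.

```shell
# Throwaway repo with one commit (all names here are placeholders).
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "# My project" > README.md
git add README.md
git commit -q -m "Initial commit"

# Remember the name of the starting branch (master or main).
base="$(git rev-parse --abbrev-ref HEAD)"

# Create a new branch and switch to it.
git checkout -q -b edit-readme

# Make and commit a change on the new branch only.
echo "More detail about the project." >> README.md
git commit -q -am "Expand README"

# Switch back to the starting branch and merge the work in.
git checkout -q "$base"
git merge -q edit-readme
cat README.md
```

Until the merge, the starting branch is unaffected by the edit, which is what lets several people work in parallel.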
One of the most popular and frequently used VCSs is git.
The git software we just installed is strictly a command-line tool and can be difficult to pick up from scratch. However, the repository hosting service GitHub provides an easy-to-use, web-based graphical interface for repository management and version control. GitHub offers the distributed version control and source code management functionality of git, plus some additional features. We will be introduced to git by using GitHub.
Go to GitHub, click Sign Up, and create your own account.
Create a new repository on GitHub
Create a README file for your repo using the browser-based Markdown editor. Markdown is a set of rules for styling plain text files so that they can be easily converted to and rendered as HTML, the structural language of the web. This module, for example, is written using Markdown! GitHub has a nice, short tutorial that you can follow, called Mastering Markdown.
Edit the README file, and then merge your edits back into the master branch of your repo.
git is a command-line tool, but both local (on your computer) and hosted (e.g., on GitHub) repositories can be managed using a GUI, such as GitHub Desktop, GitUp, or a host of others. I find GitHub Desktop the easiest one to use.
With GitHub Desktop, we can download, work on, and update remote repos or we can create, work on, and push local repos to remote sites.
Clone a remote repository from GitHub to your computer
This should create a local copy of your repo. On macOS, if you then right-click on it in the list of your local repos, you will see the option to open your repo in GitHub (which will take you back to the web interface), in the Terminal, in the Finder, or with the Pulsar text editor. The process should be similar on Windows, though the options listed above may differ.
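For comparison, here is roughly what cloning looks like on the command line. This is a sketch under stated assumptions: so that it runs without a GitHub account, a local folder named `hosted-repo` (a made-up name) stands in for the repo on GitHub. With a real GitHub repo you would pass its URL to `git clone` in place of that folder.

```shell
work="$(mktemp -d)"
cd "$work"

# Stand-in for the repo hosted on GitHub: an ordinary repo with one commit.
git init -q hosted-repo
cd hosted-repo
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# My project" > README.md
git add README.md
git commit -q -m "Initial commit"
cd ..

# Clone: copy the repo, with its full history, into a new folder.
git clone -q hosted-repo my-local-copy
cd my-local-copy

# The clone remembers where it came from under the name "origin".
git remote get-url origin
git log --oneline
```

The local copy is a complete repository in its own right, which is why you can keep committing to it while offline.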
Create a local repository and push it to GitHub
Commit the README.md file to the master branch, providing a short bit of text describing your commit (e.g., “Initial commit”).
NOTE: This file is still located only in your local git repository, and not on GitHub’s servers.
Once you have pushed the repo, you should be able to find it by going to GitHub.com and logging in.
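The difference between committing locally and pushing to a remote can be sketched on the command line as follows. As an assumption made so the example runs offline, an empty bare repository named `github-standin.git` plays the role of GitHub's servers. The project name and user details are likewise placeholders.

```shell
work="$(mktemp -d)"
cd "$work"

# Stand-in for an empty repo on GitHub's servers (bare: history only, no working files).
git init -q --bare github-standin.git

# Create the local repo and make the initial commit.
git init -q my-project
cd my-project
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# My project" > README.md
git add README.md
git commit -q -m "Initial commit"

# So far the commit exists only locally. Tell git where the remote lives.
git remote add origin "$work/github-standin.git"

# Push the current branch and record origin as its upstream.
git push -q -u origin HEAD

# List the branches that now exist on the remote.
git ls-remote --heads origin
```

Before the push, the remote has no branches at all. Afterwards it holds the same commit as the local repo.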