The basics of Version Control

Ondřej Mottl

Data Stewards Community Meeting (Setkání komunity Data Stewardů)

10.09.2024

Evolution

Open Science

A better view

The Journey

Version control

Ring a bell?

What is Version Control? 🤔

It is all about keeping track of changes 📓✍️

Practical Exercise

03:00

Do you recognize some of these questions?

  • It broke … hopefully I have a working version somewhere?
  • Can you please send me the latest version?
  • Which version are you using?
  • I am sure it used to work. When did it change?
  • My laptop is gone. Is my data now gone?

How do you keep track of changes?

Git

  • local software
  • keeps track of changes to files


GitHub

  • host server
  • store (git) the data
  • project management, collaboration, publishing

Git/GitHub setup AKA “git hell”

Follow instructions in Version Control - git hell (a separate presentation).

Getting all the necessary software installed, configured, and playing nicely together is honestly half the battle … Brace yourself for some pain

Basic vocabulary

  • Every such project is called a repository (i.e. a repo)
  • Your local repository is called local
  • Your online repository is called remote




Note on practical exercises

Git init (project first)

Activate git for a repo

Create new project with git tracking

For existing project

git init

Create new project with git tracking

git init <DIRECTORY>

For existing project

usethis::use_git()

Create new project with git tracking

usethis::create_project("<DIRECTORY>")
# switch to the new project
usethis::use_git()

Git integration is automatic in Source control panel
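
A minimal terminal sketch of both routes, run in a throwaway folder so no real project is touched. The folder names `my-project` and `existing-project` are placeholders.

```shell
# Work in a scratch folder so nothing real is affected.
workdir=$(mktemp -d)
cd "$workdir"

# New project with Git tracking: creates the folder and its .git/ subfolder.
git init my-project          # "my-project" is a placeholder name

# Existing project: run git init inside the folder instead.
mkdir existing-project       # stands in for a project you already have
cd existing-project
git init

# Check: prints "true" when the current folder is tracked by Git.
git rev-parse --is-inside-work-tree
```

Deleting the hidden `.git/` folder removes the tracking again and leaves your files alone.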

Practical Exercise

Make a new project with Git tracking

03:00

Git clone (repo first)

Copy (download) a remote repo to your local machine

Example of online repo: OndrejMottl/VersionControl_DataStewards_Sep2024

git clone https://github.com/<OWNER>/<REPO>.git <DIRECTORY>
usethis::create_from_github(
  repo_spec = "https://github.com/<OWNER>/<REPO>.git",
  destdir = "<DIRECTORY>",
  fork = FALSE
  )

Open Command Palette (Ctrl+Shift+p)

Paste in URL: "https://github.com/<OWNER>/<REPO>.git"
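
To try cloning without a network connection, the sketch below uses a local bare repository as a stand-in for GitHub. The names `seed`, `remote.git`, `my-copy`, and the demo identity are assumptions made for the example; with a real repo you would pass the GitHub URL instead.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the GitHub repo: a repository with one commit ...
git init -q seed
cd seed
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"
cd ..
# ... published as a bare repository (no working files, like a server).
git clone -q --bare seed remote.git

# The actual step: clone the "remote" into a new local folder.
git clone -q remote.git my-copy

# The clone remembers where it came from under the name "origin".
cd my-copy
git remote -v
git log --oneline
```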

Practical Exercise

clone a repo (e.g. this one)

05:00

a commit

A commit is a record of a change

If you create or edit a file in your repository and save the changes, you need to record your change via a commit

Chess analogy?

Chess move diary:

  • Bc4 (Bishop to c4)
  • Nf3 (Knight to f3)
  • Qc7 (Queen to c7)

a commit

Pawn to d4

Edit line 32 of file A

a commit



3 states of a file


Staging changes

Make a change to a file and save it. Now stage the change:

  • The red icon indicates removed files.
  • The yellow icon indicates modified files.
  • The green icon indicates added files.
git add <FILE>

  • two yellow ?? indicate a new file that is not yet tracked
  • a blue M indicates an edit to a file that has already been committed
  • a red D indicates a deleted file
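
The same letters show up in the terminal. A small sketch in a scratch repo, with made-up file names and a placeholder identity, that edits, deletes, and creates a file, then stages everything:

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# Start with two committed files.
echo "first draft" > notes.txt
echo "to be removed" > old.txt
git add notes.txt old.txt
git commit -q -m "Add notes and old file"

# Three kinds of change: edit, delete, create.
echo "second draft" >> notes.txt
rm old.txt
echo "brand new" > new.txt

# Before staging: M = modified, D = deleted, ?? = untracked.
unstaged=$(git status --short)
echo "$unstaged"

# Stage all changes, including the deletion.
git add --all

# After staging, the letters move to the first column and the new file shows as A.
staged=$(git status --short)
echo "$staged"
```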

Practical Exercise

  1. Make changes to a file
  2. Make a new file
  3. Stage the changes

05:00

a first commit

Commit (record) staged changes:

git commit -m "commit message"

Review history

Dissecting a commit

SHA - unique identifier

Author - who has done this?

Date - when was this done?

Message - description of what has been done

Stats - what has changed?
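
Each of these parts can be read in the terminal with `git log`. A sketch in a scratch repo (file name, commit message, and identity are placeholders):

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "line one" > analysis.txt
git add analysis.txt
git commit -q -m "Add analysis notes"

# Compact history: short SHA and message, one commit per line.
git log --oneline

# Full record of the last commit, including the stats (files and lines changed).
git log -1 --stat

# Individual parts via format placeholders.
sha=$(git log -1 --format=%H)       # SHA
author=$(git log -1 --format=%an)   # Author
when=$(git log -1 --format=%ad)     # Date
message=$(git log -1 --format=%s)   # Message (first line)
echo "$sha | $author | $when | $message"
```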

Practical Exercise

  1. commit some changes
  2. review history

05:00

Commit message

Commits are quick and cheap. Therefore:

  1. commit often (!)
  2. provide useful commit messages.

Commit history

remote

remote

Now we need to sync changes with the remote using PUSH

Add a remote to existing local repo (only once):

git remote add origin https://github.com/<OWNER>/<REPO>

Push local to remote (GitHub):

git push
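
A sketch of the whole round trip, assuming a local bare repository (`remote.git`, a made-up name) as a stand-in for an empty GitHub repo. One detail the sketch relies on: the very first push names the remote and branch with `-u`, so that a plain `git push` knows where to go afterwards.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for an empty GitHub repo.
git init -q --bare remote.git

# A local repo with one commit.
git init -q local
cd local
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Add README"

# Connect local to the remote (only once).
git remote add origin "$workdir/remote.git"

# First push: -u links the local branch to its remote counterpart.
branch=$(git symbolic-ref --short HEAD)   # main or master, depending on your Git
git push -q -u origin "$branch"

# Later commits only need a plain push.
echo "more" >> README.md
git commit -q -am "Extend README"
git push -q
```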

Add a remote to existing local repo (only once):

usethis::use_github(protocol  = "https")

Practical Exercise

  1. Publish repo to GitHub
  2. make new commit(s)
  3. Push changes to remote

05:00

GitHub intermezzo

A GitHub repo

GitHub creating a repo

README - description of the project

.gitignore - list of files ignored by Git (more about it later)

LICENSE - tells others what they can do with your code

A note on {usethis}

The {usethis} package provides many useful helpers

  • README - description of the project
usethis::use_readme_md()
  • LICENSE - sets the terms under which others may use your code
usethis::use_mit_license(name = "Ondřej Mottl")
  • CONTRIBUTING.md - guidelines for contributors
  • CODE_OF_CONDUCT.md - set the tone for discourse between contributors

GitHub creating a repo

Practical Exercise

Create a new repo on GitHub

05:00

.gitignore file

Tells Git which files to ignore when detecting changes

Here is an example of a .gitignore file:

# History files
.Rhistory
.Rapp.history

# Session Data files
.RData

# RStudio files
.Rproj.user/

# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
.Rproj.user

#data (excludes everything in the folder data)
data/*

# you can make exceptions for specific files
!data/dragon_taxonomy.csv

#figures & output (excludes all figure files)
*.png
*.pdf
*.jpeg
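
To check which rule catches which file, `git check-ignore -v` is handy. The sketch below uses a shortened version of the example above in a scratch repo; the files `plot.png`, `script.R`, and `data/raw.csv` are invented for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q

# A shortened version of the .gitignore example.
cat > .gitignore <<'EOF'
.Rhistory
data/*
!data/dragon_taxonomy.csv
*.png
EOF

# Some files to test the rules against.
mkdir data
touch .Rhistory plot.png script.R data/raw.csv data/dragon_taxonomy.csv

# For each ignored path, print the .gitignore line that matched it.
git check-ignore -v .Rhistory plot.png data/raw.csv

# What Git still sees: the exception and everything not matched by a rule.
visible=$(git status --short --untracked-files=all)
echo "$visible"
```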

update local- PULL

Now we need to sync changes from the remote to the local repo using PULL

Pull from remote (GitHub) to local:

git pull
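
A sketch of what a pull does, assuming a local bare repository as a stand-in for GitHub and two clones (`mine` and `colleague`, both made-up names). The colleague pushes a change; my copy only sees it after pulling.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for GitHub: a bare repo holding one commit.
git init -q seed
cd seed
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
cd ..
git clone -q --bare seed remote.git

# Two people clone the same remote.
git clone -q remote.git mine
git clone -q remote.git colleague

# The colleague commits and pushes a change.
cd colleague
git config user.name "Colleague"          # placeholder identity
git config user.email "colleague@example.com"
echo "v2" >> notes.txt
git commit -q -am "Update notes"
git push -q

# My copy is one commit behind until I pull.
cd ../mine
git pull -q
cat notes.txt
```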

Merge conflict 💩💩💩

A merge conflict can occur when the same line of a file has been changed differently in two places (for example, in your local repo and on the remote).

Merge conflict 💩💩💩


To https://github.com/picardis/myrepo.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'https://github.com/picardis/myrepo.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

A good strategy to avoid such conflicts:

  • Commit often
  • Work in small steps
  • Push and pull regularly
  • Organize your code in small modules (scripts)


Merge conflicts cannot always be avoided (but can be mitigated by branches; later).

Merge conflict 💩💩💩

If you have questions, please
<<<<<<< HEAD
open an issue
=======
ask your question in IRC.
>>>>>>> branch-a

Delete the unwanted text (including the decorations)

If you have questions, please
ask your question in IRC.

Then save the file, stage, and commit again
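
The conflict above can be reproduced safely in a scratch repo. This sketch assumes two local branches instead of a collaborator; the starting text of the second line ("get in touch.") and the commit messages are invented.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # main or master

printf 'If you have questions, please\nget in touch.\n' > README.md
git add README.md
git commit -q -m "Add README"

# branch-a rewrites the second line one way ...
git checkout -q -b branch-a
printf 'If you have questions, please\nask your question in IRC.\n' > README.md
git commit -q -am "Point to IRC"

# ... and the default branch rewrites the same line another way.
git checkout -q "$main_branch"
printf 'If you have questions, please\nopen an issue\n' > README.md
git commit -q -am "Point to issues"

# The merge stops with a conflict ("|| true" lets the script continue).
git merge branch-a || true
conflicted=$(cat README.md)
echo "$conflicted"                        # shows the conflict decorations

# Resolve: write the text you want, then stage and commit.
printf 'If you have questions, please\nask your question in IRC.\n' > README.md
git add README.md
git commit -q -m "Resolve conflict: keep IRC wording"
```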

Practical Exercise

In pairs:

  1. Clone someone else’s repo
  2. Add them as a collaborator
  3. Create a merge conflict
  4. fix it

03:00

Oops! I have made a mistake 😮

How to undo last commit?

Variant A: I committed but have NOT pushed yet.

git reset --soft HEAD@{1}

RStudio offers a range of ways to work with Git and GitHub, as shown in this tutorial. The Terminal (NOT the console) has more commands and options and is handy for troubleshooting.

git reset --soft HEAD@{1}

Open Command Palette (Ctrl+Shift+p)

Write Git: Undo Last Commit
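
A sketch of Variant A in a scratch repo. It writes the target as `HEAD~1` (the parent of the current commit); directly after a commit this is the same commit that `HEAD@{1}` refers to. File name and messages are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "good line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "hasty line" >> notes.txt
git commit -q -am "Commit made too early"

# Undo the last commit; --soft keeps its changes staged.
git reset --soft HEAD~1

# The commit is gone from the history, the edit is still there and staged.
git log --oneline
after_reset=$(git status --short)
echo "$after_reset"
```

From here you can keep editing and commit again when the change is ready.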

Oops! I have made a mistake 😮

How to undo last commit?

Variant B: I committed AND pushed already.

Right-click on the commit you would like to undo and select Revert a commit.

Copy the SHA of the commit you want to return to

git reset --hard <SHA>

We need the Terminal (NOT console) again.

Copy the SHA of the commit you want to return to

git reset --hard <SHA>

In the Source control panel -> COMMITS section -> Right-click on the commit you want to revert to -> Select the Reset Current Branch to Previous Commit
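
A note on Variant B: after `git reset --hard`, the local history is shorter than the remote one, so a plain `git push` is rejected unless it is forced. The sketch below shows the "Revert a commit" route in the terminal instead, using `git revert`, which adds a new commit that undoes the bad one and therefore pushes normally. This is offered as an alternative under the assumption that the repo is shared; the bare repo `remote.git` stands in for GitHub and all names are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q --bare remote.git             # stand-in for GitHub

git init -q local
cd local
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$workdir/remote.git"
branch=$(git symbolic-ref --short HEAD)

echo "good line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "mistake" >> notes.txt
git commit -q -am "Commit with a mistake"
git push -q -u origin "$branch"           # the mistake is now on the remote

# Undo it with a new commit instead of rewriting history.
bad_sha=$(git rev-parse HEAD)
git revert --no-edit "$bad_sha"
git push -q

git log --oneline                         # three commits: add, mistake, revert
```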

Practical Exercise

Undo/Revert commit

03:00

Branches

Branches

Branches

Make a branch

Create a new branch from the current commit:

git branch <BRANCH-NAME>

Switching between branches (checkout)

The default branch is called main or master

‼️ Make sure that you have all changes committed before switching ‼️

Switch to an existing branch:

git checkout <BRANCH-NAME>
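
A sketch of making a branch, committing on it, and switching back, in a scratch repo. The branch name `feature-x` and the file are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # main or master

echo "stable text" > report.txt
git add report.txt
git commit -q -m "Add report"

# Make a branch and switch to it.
git branch feature-x
git checkout -q feature-x

# Commits made here stay on feature-x.
echo "experimental text" >> report.txt
git commit -q -am "Try a new paragraph"

# Switch back: the file looks as it did before the experiment.
git checkout -q "$main_branch"
cat report.txt

# List branches; the asterisk marks the current one.
git branch
```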

Practical Exercise

  1. Make a branch and switch
  2. commit changes
  3. push to remote

05:00

Merging branches

Merging branches

Pull Request (PR)

Request to merge a branch

Pull Request - create

After you push a new branch, GitHub should show a green Compare & pull request button

Pull Request - components

Pull Request - Overview

Now you can add more commits, add a comment to start a discussion, or merge

Practical Exercise

  1. Create a PR

05:00

Note on Markdown

You can use Markdown in the description and comments

More details on GitHub Docs

Pull Request - review

A tool to review suggested changes

Collaboration

Pull Request - review

On someone else’s PR, you can comment on individual lines or whole files

Pull Request - review

Practical Exercise

  1. Make a comment on your PR
  2. Make a comment on a file in your PR

05:00

Merging branches

Merge conflict with branch 💩

Resolving a merge conflict with branches is much more pleasant 😎

Merge conflict with branch 💩

Edit the file as needed

Merge conflict with branch 💩

Commit the changes

Merging strategies

Merge commit

Squash & Merge

Rebase & Merge
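
GitHub performs these three strategies on the server when you press the merge button. The sketch below uses local commands that roughly correspond to them, as an approximation rather than an exact copy of what GitHub does. Three made-up branches (`feature-a`, `feature-b`, `feature-c`) with two commits each are merged one way each, and the commit count shows the difference.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
main_branch=$(git symbolic-ref --short HEAD)
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# Three feature branches, each with two commits, all starting from the base.
for name in a b c; do
  git checkout -q -b "feature-$name" "$main_branch"
  echo "step 1" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "feature-$name: step 1"
  echo "step 2" >> "$name.txt"
  git commit -q -am "feature-$name: step 2"
done
git checkout -q "$main_branch"

# Merge commit: both commits are kept, plus one extra commit joining the histories.
git merge -q --no-ff --no-edit feature-a
after_merge=$(git rev-list --count HEAD)

# Squash & merge: the two commits are combined into a single new commit.
git merge -q --squash feature-b
git commit -q -m "Add feature b (squashed)"
after_squash=$(git rev-list --count HEAD)

# Rebase & merge: the commits are replayed on top, with no merge commit.
git checkout -q feature-c
git rebase -q "$main_branch"
git checkout -q "$main_branch"
git merge -q --ff-only feature-c
after_rebase=$(git rev-list --count HEAD)

echo "commits after each step: $after_merge, $after_squash, $after_rebase"
git log --oneline --graph
```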

Practical Exercise

Merge a branch

03:00

Delete branch

We can delete the branch directly on GitHub after merging

Delete branch

We can also delete a branch before merging

To delete a local branch

git branch -d <BRANCH-NAME>

To delete a remote branch

git push origin --delete <BRANCH-NAME>

We need the Terminal (NOT console) again.

To delete a local branch

git branch -d <BRANCH-NAME>

To delete a remote branch

git push origin --delete <BRANCH-NAME>

Open Command Palette (Ctrl+Shift+p)

Select Git: Delete branch …
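
A sketch of both deletions in the terminal, assuming a local bare repository (`remote.git`) as a stand-in for GitHub and a made-up branch called `old-idea`. The branch is merged first, because `git branch -d` refuses to delete work that has not been merged.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q --bare remote.git             # stand-in for GitHub

git init -q local
cd local
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$workdir/remote.git"
main_branch=$(git symbolic-ref --short HEAD)

echo "base" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin "$main_branch"

# A branch with one commit, pushed to the remote.
git checkout -q -b old-idea
echo "idea" >> notes.txt
git commit -q -am "Sketch an idea"
git push -q -u origin old-idea

# Merge it, then clean up.
git checkout -q "$main_branch"
git merge -q old-idea

git branch -d old-idea                    # delete the local branch
git push -q origin --delete old-idea      # delete the remote branch

git branch --all                          # old-idea no longer appears
```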

Practical Exercise

Delete a branch

03:00

Bonus for R users

This is just a teaser

GitHub has a lot of features and tools to make your life easier:

  • Project management
  • Task management
  • Collaboration
  • Dissemination
  • Automation

Maybe next time😊?

Outro

This presentation

License: MIT

About me

Ondřej Mottl, Assistant Professor at Charles University