Intro to Git and GitHub

Frank Aragona & Juan Salazar, DIQA

the problem

  • multiple files for the same script
  • easy to lose track of production version
  • terrible for collaboration
    • who has the main script?
    • who deleted my changes?
    • conflicts?
    • I overwrote my code and lost everything

20211221_AllTests_datatable_conversion.R

20211221_AllTests_datatable_conversion_local.R

20211221_AllTests_updated.R

20211223_AllTests_dplyr_FA.R

20211221_AllTests_dplyr_conversion_JS.R

what is Git?

  • Git is version control software
  • projects have branches
  • advantages:
    • create new branches to test code
    • isolate experimental changes while keeping the main branch clean
    • version tracking: records every committed change
    • prevents overwriting/losing your changes
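
A minimal command-line sketch of how a branch isolates changes, assuming Git is installed. The file name `analysis.R`, the branch name `experiment`, and the user identity are hypothetical placeholders, and the repository is a throwaway folder created only for this illustration.

```shell
set -eu

# Throwaway repository in a temporary folder (illustration only).
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Analyst"        # placeholder identity
git config user.email "analyst@example.org"

# First commit on the main branch.
echo 'x <- 1' > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
git branch -M main                            # make sure the branch is named main

# Create a new branch to test an idea, and commit a change there.
git checkout --quiet -b experiment
echo 'x <- 999' > analysis.R
git commit --quiet -am "Try a different value"

# Back on main, the script is untouched by the experiment.
git checkout --quiet main
cat analysis.R
```

The experimental edit lives only on `experiment`; it reaches `main` only if someone merges it.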

what is GitHub?

  • GitHub is a web platform with a graphical user interface that hosts Git repositories
    • repos are used to store project files
  • advantages:
    • code management: isolate your changes from main codebase
    • project management
    • collaboration
    • transparency: the whole team can see the code, its history, and the discussion around it

tl;dr

  • Git repos -> movies 🎞️
  • GitHub -> Netflix

benefits of GitHub - code management

Features:

  • version history (commits): easily view the entire version history of your codebase
    • change attribution: know who did what, and why
    • time travel: roll back to previous versions
  • structured repositories: organizes scripts and documentation clearly
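
The three history features above have direct counterparts on the Git command line. This is a sketch under assumptions: the file `analysis.R` and the user identity are made up, and the repository is a temporary folder rather than a real project. On GitHub the same information appears in the commit history and blame views.

```shell
set -eu

# Throwaway repository in a temporary folder (illustration only).
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Analyst"        # placeholder identity
git config user.email "analyst@example.org"

# Two commits to the same script.
echo 'x <- 1' > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
echo 'x <- 2' > analysis.R
git commit --quiet -am "Change x to 2"

# Version history: one line per commit, newest first.
git log --oneline

# Change attribution: who last changed each line, and in which commit.
git blame analysis.R

# Time travel: undo the latest commit by adding a new commit that reverses it.
git revert --no-edit HEAD
cat analysis.R
```

`git revert` is used here because it keeps the full history: the undone change stays visible, followed by the commit that reversed it.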

benefits of GitHub - project management

Features:

  • issues: track tasks, decisions, bugs, and enhancements
    • labels and assignees: categorize work and assign ownership
    • organization: easy to tag and search tasks
  • milestones: group issues to track progress towards goals

benefits of GitHub - collaboration

Features:

  • branches: team members work on different tasks without conflict
  • pull requests: enable code review and discussion
    • comment & review: clear contextual communication history
  • sharing code: easily share and control access to your code
    • organizations, private/public repos

what do we use it for

  • all code development
    • surveillance pipelines
    • single ‘one-off’ scripts
  • sharing code across large collaborative organizations
    • public repos: share code broadly or collaborate with external partners
    • private repos: restricted access for sensitive internal work
    • organizations: WA-EIP, NW-PaGe

how to use Git/GitHub in practice

  • find a repo in GitHub
  • clone the repo: copy the whole repo to a folder on your laptop

  • make your own branch for your isolated edits
    • this keeps the main branch clean

  • write code like normal
  • save your changes, then commit them to your branch
    • a commit is a saved snapshot of your changes, recorded on the branch
  • make small, frequent commits if possible

  • after you save and commit, push your branch to GitHub
    • if this is the first push of a new branch, you must publish the branch first
  • pushing sends your local commits to GitHub, where your team can see them
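
The clone, branch, commit, and push steps above can be sketched on the command line. So that it runs without a GitHub account, this sketch uses a bare repository in a temporary folder as a stand-in for GitHub; that stand-in, the file `analysis.R`, the branch name `my-feature`, and the user identity are all assumptions made for illustration. With a real repo, the clone step would use the URL shown on the repo's GitHub page.

```shell
set -eu
work="$(mktemp -d)"

# --- Setup only: build a local stand-in for a GitHub repo ---
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "Example Analyst"        # placeholder identity
git config user.email "analyst@example.org"
echo 'x <- 1' > analysis.R
git add analysis.R
git commit --quiet -m "Initial commit"
git branch -M main
git clone --quiet --bare "$work/seed" "$work/remote.git"

# --- The workflow ---
# 1. Clone: copy the whole repo to a local folder.
git clone --quiet "$work/remote.git" "$work/clone"
cd "$work/clone"
git config user.name "Example Analyst"
git config user.email "analyst@example.org"

# 2. Branch: make your own branch so main stays clean.
git checkout --quiet -b my-feature

# 3. Write code, save, and commit to the branch.
echo 'y <- 2' >> analysis.R
git add analysis.R
git commit --quiet -m "Add y"

# 4. Push: -u sets up tracking the first time the branch is pushed.
git push --quiet -u origin my-feature

# The branch now exists on the remote next to main.
git ls-remote --heads origin
```

The `-u` flag on the first push is roughly the command-line counterpart of publishing the branch; later pushes from the same branch can use plain `git push`.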

after you push the branch, create a pull request

in GitHub, merging the pull request merges your feature branch into the main branch of the codebase
  • back in the GitHub Desktop app, pull the latest version of the branch
  • this brings the latest changes your teammates made into your local clone
  • pull regularly throughout the process
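
A command-line sketch of pulling a teammate's changes, under the same kind of assumptions as before: a bare repository in a temporary folder stands in for GitHub, a direct push to `main` stands in for a merged pull request, and the folder names `mine` and `teammate`, the file `analysis.R`, and the user identity are hypothetical.

```shell
set -eu
work="$(mktemp -d)"

# --- Setup only: a local stand-in for a GitHub repo with one commit on main ---
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "Example Analyst"        # placeholder identity
git config user.email "analyst@example.org"
echo 'x <- 1' > analysis.R
git add analysis.R
git commit --quiet -m "Initial commit"
git branch -M main
git clone --quiet --bare "$work/seed" "$work/remote.git"

# Two local clones: yours and a teammate's.
git clone --quiet "$work/remote.git" "$work/mine"
git clone --quiet "$work/remote.git" "$work/teammate"

# The teammate's change lands on main (stand-in for a merged pull request).
cd "$work/teammate"
git config user.name "Example Teammate"
git config user.email "teammate@example.org"
echo 'y <- 2' >> analysis.R
git commit --quiet -am "Add y"
git push --quiet origin main

# Pull: bring the latest main into your local clone.
cd "$work/mine"
git pull --quiet --ff-only origin main
cat analysis.R
```

`--ff-only` makes the pull stop instead of creating a merge commit if your local branch has diverged, which is a conservative choice for a shared main branch.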

DOH-EPI-Coders

  • Our private DOH repos
  • You can request a license
  • Teams channels for Q&A and discussions

Examples

GitHub Pages

  • GitHub can host static webpages for free
  • Works well with Quarto and R Markdown
  • Many of our public repos use GH Pages

Package Documentation

ELR Documentation

Reports

Publications

Resources

Questions?

Advanced Settings

  • Release cycles
  • GitHub Actions