---
title: "Teaching Survival Git"
subtitle: "Public GitHub Repo"
author: "Rob Donald"
date: today
date-format: "dddd DD MMMM YYYY"
format:
  pdf:
    toc: true
    toc-depth: 2
    number-sections: false
    lot: false
    lof: false
    colorlinks: true
    code-block-bg: 'ffffcc' # fafad2 lightgoldenrodyellow, ffffcc web safe
    papersize: a4
    include-in-header:
      text: \addtokomafont{disposition}{\rmfamily}
        \usepackage{amsmath}
        \numberwithin{equation}{subsection}
        \usepackage{xurl}
bibliography: ../../PostDocMaster.bib
cite-method: biblatex
biblatexoptions:
  - citestyle = authoryear
  - style = authoryear
  - urldate = iso
  - datezeros = true
  - natbib = true
  - sorting = nty
---
```{=tex}
\def\shrug{\texttt{\raisebox{0.75em}{\char`\_}\char`\\\char`\_\kern-0.5ex(\kern-0.25ex\raisebox{0.25ex}
{\rotatebox{45}{\raisebox{-.75ex}"\kern-1.5ex\rotatebox{-90})}}\kern-0.5ex)\kern-0.5ex\char`\_/\raisebox{0.75em}
{\char`\_}}}
```
```{r}
#| label: "library_setup"
#| include: false
suppressMessages({suppressWarnings({
library(dplyr)
library(tidyr)
library(readr)
library(here)
library(tibble)
})})
```
# Introduction
My notes on using git as a data scientist. In other words, you *know and accept* that you
should be using git, but man, is it confusing and almost painful. \small\shrug\normalsize
I've been a software developer and data scientist for more than 30 years and I've used
all sorts of version control systems (I started with Apollo DSEE, look that one up). I still find
git confusing.
# Moving from SVN to Git
* \url{https://smartbear.com/blog/migrating-from-subversion-to-git-lessons-learned/}
* \url{https://dev.to/fpuffer/what-makes-companies-switch-from-svn-to-git-3687}
* \url{https://ohshitgit.com/}
# Git setup in general
* \url{https://happygitwithr.com/}
* \url{https://happygitwithr.com/git-intro.html}
## GitHub Access Tokens
GitHub now requires you to use personal access tokens (PATs) for git operations over HTTPS.
These are a bit of a pain, but Jenny Bryan describes them here:
* \url{https://happygitwithr.com/https-pat}
## GitHub Push Issues
Today (2024-08-31) I was setting up a new repo with, admittedly, some large `.graffle` files.
I got this error:
```{bash eval=FALSE}
rob@Taylor Platform_Terrace_Shed % git push -u origin main
Enumerating objects: 69, done.
Counting objects: 100% (69/69), done.
Delta compression using up to 10 threads
Compressing objects: 100% (69/69), done.
error: RPC failed; HTTP 400 curl 22 The requested URL returned error: 400
send-pack: unexpected disconnect while reading sideband packet
Writing objects: 100% (69/69), 11.63 MiB | 19.02 MiB/s, done.
Total 69 (delta 25), reused 0 (delta 0), pack-reused 0
fatal: the remote end hung up unexpectedly
Everything up-to-date
```
This happened in RStudio and in a Mac Terminal window. Over to Stack Overflow and this link:
* \url{https://stackoverflow.com/questions/78590144/while-cloning-the-git-repository-getting-unexpected-disconnect-while-reading-sid}
This says to increase a git config setting, `http.postBuffer`. So I did that and everyone is happy.
```{bash eval=FALSE}
rob@Taylor Platform_Terrace_Shed % git config http.postBuffer 10g
rob@Taylor Platform_Terrace_Shed % git push -u origin main
Enumerating objects: 69, done.
Counting objects: 100% (69/69), done.
Delta compression using up to 10 threads
Compressing objects: 100% (69/69), done.
Writing objects: 100% (69/69), 11.63 MiB | 46.15 MiB/s, done.
Total 69 (delta 25), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (25/25), done.
To https://github.com/StatsResearch/Platform_Terrace_Shed.git
* [new branch] main -> main
branch 'main' set up to track 'origin/main'.
rob@Taylor Platform_Terrace_Shed %
```
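A small sketch of how that setting could be checked and then put back afterwards, run in a throwaway repo. The value `524288000` (500 MiB) is an arbitrary example and not a figure from the Stack Overflow answer; the temporary directory is made up for the demo.

```bash
set -eu
# Throwaway repo so the real one is not touched (the path is made up).
demo=$(mktemp -d)
cd "$demo"
git init -q

# Raise the buffer for this repo only; 524288000 bytes is an example value.
git config http.postBuffer 524288000
current=$(git config --get http.postBuffer)
echo "http.postBuffer is $current"

# Put things back once the big push has gone through.
git config --unset http.postBuffer
leftover=$(git config --local --get http.postBuffer || echo "unset")
echo "after cleanup: $leftover"
```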
# Lightbulb Moments
## Switching back and forth between main and branches
This is from the world famous Jenny Bryan \footnote{
If you don't know who she is, look at this link
\url{https://www.rstudio.com/speakers/jenny-bryan/}
and follow her on twitter: @JennyBryan.}
* \url{https://happygitwithr.com/git-branches.html} Section 22.2
So you *can* do it, and yes, git __stash__ is one way, but she recommends
the same idea I had read somewhere else: *always* do a commit before you switch,
even if it is just with the comment 'WIP xxx'. She then
goes on to explain how you can get rid of that temporary commit when you come back, using
terminal (bash) commands like:
```{bash eval=FALSE}
$ git checkout issue-5
$ git reset HEAD^
```
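A minimal sketch of that WIP round trip, run in a throwaway repository so nothing real is touched. The file name `analysis.R`, the dummy identity and the temporary directory are assumptions for illustration; only `issue-5` and `git reset HEAD^` come from the text above. A plain `git reset HEAD^` removes the commit but leaves the edit sitting in the working tree.

```bash
set -eu
# Dummy identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch 'main'

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "First commit"

# Start a branch, make a half-finished change, park it as a WIP commit.
git checkout -q -b issue-5
echo "y <- 2" >> analysis.R
git commit -q -a -m "WIP issue-5"

# Now it is safe to visit main and come back.
git checkout -q main
git checkout -q issue-5

# Undo the temporary commit; the edit stays in the working tree, unstaged.
git reset -q HEAD^
commits_on_branch=$(git rev-list --count HEAD)
wip_status=$(git status --porcelain)
echo "commits: $commits_on_branch, status: '$wip_status'"
```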
## Merge conflicts
* \url{https://happygitwithr.com/git-branches.html} Section 22.4
# Daily routine
```{bash, attr.source='.numberLines', eval=FALSE}
# fetch the changes from the remote
$ git fetch origin
# show commit logs of changes
$ git log master..origin/master
# show diffs of changes
$ git diff master..origin/master
# apply the changes by merge..
$ git merge origin/master
# .. or just pull the changes
$ git pull
```
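A rough sketch of the same routine with one extra step: counting how many commits you are behind before you merge. It builds two throwaway repos, where `upstream` plays the part of the remote and `mine` is a clone of it; those names, the file `notes.txt` and the dummy identity are all made up. The sketch uses `main` as the branch name, so swap in `master` if that is what your repo uses.

```bash
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Two throwaway repos: 'upstream' plays the remote, 'mine' is my clone.
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/main
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git clone -q "$work/upstream" "$work/mine"

# Someone else adds a commit upstream after I cloned.
echo "v2" >> notes.txt
git commit -q -a -m "Second commit"

# My side: fetch, then count what I am missing before merging.
cd "$work/mine"
git fetch -q origin
behind=$(git rev-list --count main..origin/main)
echo "main is $behind commit(s) behind origin/main"

# Fast-forward only, so a surprise merge commit cannot happen.
git merge -q --ff-only origin/main
after=$(git rev-list --count main..origin/main)
echo "after merge: $after commit(s) behind"
```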
# I Just Want To Rename A File (git mv)
Hold on to your hat.
1. Make sure everything is committed, i.e. you don't have any work in progress.
2. Use the ```git mv``` command
    * `% git mv -v old-name.R new-name.R`
    * Using ```git mv``` rather than just ```mv``` means the rename is
      automatically 'staged' for you. All you need to do now is ```commit``` it with
      a suitable message.
3. Do a ```git status``` to double check
    * `% git status`
4. Do the commit
    * `% git commit -m "Renamed file"`
    * (*you could of course use the RStudio Git menus for this bit*)
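
The four steps above can be sketched end to end in a throwaway repo. The file names are the ones from the list; the temporary directory and the dummy identity are assumptions for the demo.

```bash
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init -q

# Step 1: everything committed, no work in progress.
echo "x <- 1" > old-name.R
git add old-name.R
git commit -q -m "Add script"

# Step 2: rename with git mv, which stages the rename for you.
git mv old-name.R new-name.R

# Step 3: double check what is staged.
git status --short

# Step 4: commit the rename.
git commit -q -m "Renamed file"
tracked=$(git ls-files)
echo "tracked files: $tracked"
```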
# Marking Something as Significant (git tag)
If you want to be able to refer to a commit without using the odd looking SHA1 you
need to use git tag. You do this __*after*__ the commit which you want to mark.
```{bash eval=FALSE}
$ git tag -a v1.4 -m "my version 1.4"
```
Don't put quotes round the tag name that follows ```-a```.
* \url{https://www.atlassian.com/git/tutorials/inspecting-a-repository/git-tag}
Then once you have done that you need to ```push``` the tag up to the remote.
```{bash eval=FALSE}
$ git push --tags
```
If you just want to see what tags you have
```{bash eval=FALSE}
$ git tag -l
```
Which only gives you the tag 'ref name'.
What you probably wanted was this:
```{bash eval=FALSE}
$ git tag --format "%(refname:short) | %(taggerdate:iso) [%(contents:subject)]" \
    --sort=-taggerdate:iso
```
The trailing backslash continues the command onto a second line; without it, type the
whole thing on one line. How painful is that \small\shrug\normalsize.
## Tagging A Specific Commit (with the correct date)
Sometimes you may want to clean things up by putting a tag on an *older* commit
so that it fits in with some naming scheme you have come up with. The link
and bash code snippet below will do that, *importantly making the tag use the correct
date of the older commit*.
See this link:
* \url{https://stackoverflow.com/questions/4404172/how-to-tag-an-older-commit-in-git/21759466#21759466}
```{bash eval=FALSE}
# Set the HEAD to the old commit that we want to tag
git checkout 9fceb02
# temporarily set the date to the date of the HEAD commit, and add the tag
GIT_COMMITTER_DATE="$(git show --format=%aD | head -1)" git tag -a v1.2 -m"v1.2"
# Make sure the above is a SINGLE line at the command prompt.
# TWO commands will not work.
# push to origin
git push origin --tags
# set HEAD back to whatever you want it to be
git checkout master
```
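A hedged variation on the snippet above, not the exact recipe from the link: `git tag` accepts a commit as its last argument, so the old commit can be tagged by SHA1 without moving HEAD, and `git log -1 --format=%aD <sha>` supplies its date. The sketch runs in a throwaway repo; the fixed date, the file name and the dummy identity are made up so the result can be checked.

```bash
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init -q

# An 'old' commit with a fixed date, then a newer one on top of it.
echo "one" > notes.txt
git add notes.txt
GIT_AUTHOR_DATE="2020-01-02 03:04:05 +0000" \
GIT_COMMITTER_DATE="2020-01-02 03:04:05 +0000" \
  git commit -q -m "Old work"
old_sha=$(git rev-parse --short HEAD)
echo "two" >> notes.txt
git commit -q -a -m "Newer work"

# Tag the old commit by naming its SHA1 and borrow its author date for the tag.
GIT_COMMITTER_DATE="$(git log -1 --format=%aD "$old_sha")" \
  git tag -a v1.2 -m "v1.2" "$old_sha"

# Compare the two dates as Unix timestamps.
commit_time=$(git log -1 --format=%at "$old_sha")
tag_time=$(git for-each-ref --format='%(taggerdate:unix)' refs/tags/v1.2)
echo "commit: $commit_time  tag: $tag_time"
```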
# How Do I Compare Two PDF Output Files
Well, you would think this is easy, eh? But the problem is that it is a .pdf file,
i.e. __*not*__ a text file. So what you need to do is get the older version's commit SHA1
and then extract that particular commit to somewhere different. Note that the command below
extracts the *whole folder structure* as it was at the point in time of your SHA1.
From this link:
* \url{https://stackoverflow.com/questions/11018411/how-do-i-export-a-specific-commit-with-git-archive}
```{bash eval=FALSE}
$ git archive --format zip --output ./CompareResults/Cmp-01.zip 4081d45d
```
Using the above command you have the whole repo as at SHA1 ```4081d45d``` in the zip file ```Cmp-01.zip```.
You can of course just pull out a specific file by naming it at the end of the command:
```{bash eval=FALSE}
$ git archive --format zip --output ./CompareResults/SingleFile.zip 4081d45d \
    Prospec_Stg_1_N69_ELvsDP.pdf
```
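A possible alternative to the zip route, offered as a sketch rather than as the method from the link: `git show <sha>:<path>` writes a single file from an old commit to standard output, so it can be redirected straight to disk. The demo uses a throwaway repo and a fake `report.pdf` (just a line of text standing in for a real PDF); all names here are made up. `cmp -s` only tells you *whether* the files differ, so you still open both to see *how*.

```bash
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init -q

# Stand-in for a PDF: any bytes will do for the demo.
printf 'fake pdf, version 1\n' > report.pdf
git add report.pdf
git commit -q -m "First render"
old_sha=$(git rev-parse --short HEAD)
printf 'fake pdf, version 2\n' > report.pdf
git commit -q -a -m "Second render"

# Write the old version next to the current one, without unpacking a zip.
mkdir CompareResults
git show "$old_sha:report.pdf" > CompareResults/report-old.pdf

# cmp -s is silent; its exit status says whether the two files differ.
if cmp -s report.pdf CompareResults/report-old.pdf; then
  verdict="identical"
else
  verdict="different"
fi
echo "old vs new: $verdict"
```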
# Getting Info
Sometimes you just wish you could ask a simple question \small\shrug\normalsize.
## Order the branches by time and give me some context
```{bash eval=FALSE}
$ git branch -vv --sort=-committerdate
```
## Find out what branch a commit came from
This command lists every branch that contains the commit:
```{bash eval=FALSE}
$ git branch --contains ec0b406e -vv --sort=-committerdate
```
Where ```ec0b406e``` is the SHA1 hash you are interested in. The output is sorted by each
branch's latest commit, newest first, so the bottom one is a reasonable guess at where the
commit started out in life.
## Is my file being tracked by git?
The simple answer is:
```{bash eval=FALSE}
$ git ls-files
```
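If the question is about one particular file rather than the whole list, a sketch along these lines may help. It relies on `git ls-files --error-unmatch`, which exits with an error when the path is not tracked. The helper name `is_tracked`, the file names and the throwaway repo are all made up for illustration.

```bash
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init -q
echo "x <- 1" > tracked.R
echo "scratch" > untracked.txt
git add tracked.R
git commit -q -m "Add script"

# Hypothetical helper: report whether one path is tracked,
# using the exit status of ls-files.
is_tracked() {
  if git ls-files --error-unmatch -- "$1" > /dev/null 2>&1; then
    echo "tracked"
  else
    echo "not tracked"
  fi
}

first=$(is_tracked tracked.R)
second=$(is_tracked untracked.txt)
echo "tracked.R: $first"
echo "untracked.txt: $second"
```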
As you'll probably expect by now you can get an uber nerd answer as well, see this:
* \url{https://stackoverflow.com/questions/15606955/how-can-i-make-git-show-a-list-of-the-files-that-are-being-tracked/15606998}
## Tag Info
* \url{https://stackoverflow.com/questions/13208734/get-the-time-and-date-of-git-tags}
## Am I connected to a remote?
This is most likely to be GitHub.
```{bash eval=FALSE}
$ git remote -v
```
This should get you something like:
```{bash eval=FALSE}
Taylor:Teaching_Survival_Git rob$ git remote -v
origin https://github.com/StatsResearch/Teaching_Survival_Git.git (fetch)
origin https://github.com/StatsResearch/Teaching_Survival_Git.git (push)
Taylor:Teaching_Survival_Git rob$
```
Note that if you are *not* connected to a remote you get no output from the command,
which might be hard to spot.
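
Because silence is easy to miss, one option is to turn the empty output into an explicit message. This is only a sketch: the helper name `remote_state` and the throwaway repo are made up, and the URL is the one from these notes (`git remote add` just records it and does not contact GitHub).

```bash
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q

# Hypothetical helper: 'git remote' prints one name per line, or nothing at all.
remote_state() {
  if [ -z "$(git remote)" ]; then
    echo "no remote configured"
  else
    echo "remote(s) configured"
  fi
}

before=$(remote_state)
git remote add origin https://github.com/StatsResearch/Teaching_Survival_Git.git
after=$(remote_state)
echo "before: $before"
echo "after:  $after"
```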
Another way to do it is to use the command
```{bash eval=FALSE}
$ git remote show origin
```
This will give you something like this if you are connected to a remote
```{bash eval=FALSE}
Taylor:Teaching_Survival_Git rob$ git remote show origin
* remote origin
Fetch URL: https://github.com/StatsResearch/Teaching_Survival_Git.git
Push URL: https://github.com/StatsResearch/Teaching_Survival_Git.git
HEAD branch: main
Remote branch:
main tracked
Local branch configured for 'git pull':
main merges with remote main
Local ref configured for 'git push':
main pushes to main (up to date)
Taylor:Teaching_Survival_Git rob$
```
If you are *not* connected to a remote you'll get this slightly scary output
but it at least tells you something.
```{bash eval=FALSE}
Taylor:GaggiaClassic rob$ git remote show origin
fatal: 'origin' does not appear to be a git repository
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
```