Git Cherry-pick
How Cherry-picking is handled in KEDEHub.
Cherry-pick in general
For each of one or more existing commits, the git cherry-pick <oid> command creates a new commit with an identical diff to <oid> whose parent is the current commit[2].
Git follows these steps:
- Compute the diff between the commit <oid> and its parent.
- Apply that diff to the current HEAD.
- Create a new commit whose root tree matches the new working directory and whose parent is the commit at HEAD.
- Move the ref at HEAD to that new commit.
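The steps above can be approximated by hand. The following is a minimal sketch, not what Git does internally (the real command uses a three-way merge rather than a plain patch). The throwaway repository, the placeholder identity, and the variable names repo, base, oid, manual_tree and picked_tree are assumptions made for illustration.

```shell
set -e
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo 1 > 1
git add 1
git commit -q -m c1
base=$(git symbolic-ref --short HEAD)   # default branch name (master or main)

git checkout -q -b dev
echo 2 > 2
git add 2
git commit -q -m c2
oid=$(git rev-parse HEAD)

git checkout -q "$base"

# Steps 1 and 2: compute the diff between <oid> and its parent, apply it on top of HEAD.
git diff "$oid^" "$oid" | git apply --index

# Steps 3 and 4: commit the result. -C reuses the message and authorship of <oid>,
# and git commit moves the branch that HEAD points to.
git commit -q -C "$oid"
manual_tree=$(git rev-parse 'HEAD^{tree}')

# Undo the manual commit and let git cherry-pick do the same work.
git reset -q --hard HEAD^
git cherry-pick "$oid" >/dev/null
picked_tree=$(git rev-parse 'HEAD^{tree}')

echo "manual: $manual_tree"
echo "picked: $picked_tree"

cd ..
rm -rf "$repo"
```

Both approaches should end with the same root tree, which is what the comparison of the two tree ids is meant to show.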
It is important to recognize that cherry-pick did not “move” the commit on top of our current HEAD. Instead, cherry-pick created a new commit whose diff matches the existing commit. As a result, there are two copies of the same diff, that is, the same code contributed by the same author on the same date.
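One way to see that the two commits carry the same diff is to compare their patch ids. This is a sketch under assumptions: the throwaway repository, the fixed dates, and the variable names are illustrative, and nothing here claims that KEDEHub itself uses git patch-id.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Fixed dates in the past, so the cherry-picked copy gets a different commit date.
export GIT_AUTHOR_DATE="2022-05-19T15:38:37+03:00"
export GIT_COMMITTER_DATE="2022-05-19T15:38:37+03:00"
echo 1 > 1
git add 1
git commit -q -m c1
base=$(git symbolic-ref --short HEAD)

git checkout -q -b dev
echo 2 > 2
git add 2
git commit -q -m c2
oid=$(git rev-parse HEAD)
unset GIT_AUTHOR_DATE GIT_COMMITTER_DATE

git checkout -q "$base"
git cherry-pick "$oid" >/dev/null
picked=$(git rev-parse HEAD)

# git patch-id hashes the diff only, ignoring commit metadata.
original_id=$(git show "$oid" | git patch-id --stable | cut -d' ' -f1)
picked_id=$(git show "$picked" | git patch-id --stable | cut -d' ' -f1)

echo "commit ids: $oid $picked"
echo "patch ids:  $original_id $picked_id"

cd ..
rm -rf "$repo"
```

The commit ids differ because the commit date differs, while the patch ids are expected to match because the diff is the same.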
How is Cherry-picking handled in KEDEHub?
Cherry-picking keeps the author and author date of the original commit in the newly created commit. The cherry-picked commit additionally records who committed it and when the cherry-pick took place. If we ever needed to answer the question "who committed this code the first time?", we would be able to retrieve that by tracking the source of the cherry-pick and reading this unchanged data.
Here is an example of how to assure ourselves that the author name and date are preserved in the cherry-picked commit:
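Tracking the source of a cherry-pick is easier when the new commit names it. The sketch below uses the -x option of git cherry-pick, which appends a "(cherry picked from commit ...)" line to the message. Whether a given team uses -x is an assumption here, as are the throwaway repository and the variable names.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo 1 > 1
git add 1
git commit -q -m c1
base=$(git symbolic-ref --short HEAD)

git checkout -q -b dev
echo 2 > 2
git add 2
git commit -q -m c2
oid=$(git rev-parse HEAD)

git checkout -q "$base"
# -x records the id of the source commit in the new commit message.
git cherry-pick -x "$oid" >/dev/null

git log -1 --format=%B
source_line=$(git log -1 --format=%B | grep "cherry picked from commit")

cd ..
rm -rf "$repo"
```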
rm -rf 1
mkdir 1
cd 1
git init
echo 1 > 1
git add 1
git commit -m c1
git checkout -b dev
echo 2 > 2
git add 2
git commit -m c2
git log --all --reverse --date-order --format=fuller
----------
commit fada57f3fbb8ff6addb9f4c1222d82a2427053ad (master)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 15:38:37 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 15:38:37 2022 +0300
c1
commit a4ec7c5ae39f5254418c14966dd1450f13fc1ea6 (HEAD -> dev)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 15:38:59 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 15:38:59 2022 +0300
c2
----------
c2=`git rev-parse HEAD`
sleep 1
git checkout master
git cherry-pick "$c2"
git log --all --reverse --date-order --format=fuller
----------
commit fada57f3fbb8ff6addb9f4c1222d82a2427053ad
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 15:38:37 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 15:38:37 2022 +0300
c1
commit a4ec7c5ae39f5254418c14966dd1450f13fc1ea6 (dev)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 15:38:59 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 15:38:59 2022 +0300
c2
commit 34c3e82f8ad935da538a4e3bb98790cb35a3a4ef (HEAD -> master)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 15:38:59 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 15:40:51 2022 +0300
c2
----------
cd ..
rm -rf 1
We can see that c2 is duplicated in two commits. However, for both duplicates AuthorDate is the same.
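The same check can be scripted instead of read off the log. This is a minimal sketch that assumes a throwaway repository with fixed dates taken from the example above; the variable names are illustrative. It reads the author date (%aI) and commit date (%cI) of both copies.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo 1 > 1
git add 1
git commit -q -m c1
base=$(git symbolic-ref --short HEAD)

git checkout -q -b dev
echo 2 > 2
git add 2
# Pin both dates of the original commit to a known moment in the past.
GIT_AUTHOR_DATE="2022-05-19T15:38:59+03:00" \
GIT_COMMITTER_DATE="2022-05-19T15:38:59+03:00" \
git commit -q -m c2
oid=$(git rev-parse HEAD)

git checkout -q "$base"
git cherry-pick "$oid" >/dev/null

orig_author_date=$(git show -s --format=%aI "$oid")
orig_commit_date=$(git show -s --format=%cI "$oid")
pick_author_date=$(git show -s --format=%aI HEAD)
pick_commit_date=$(git show -s --format=%cI HEAD)

echo "original: author=$orig_author_date commit=$orig_commit_date"
echo "picked:   author=$pick_author_date commit=$pick_commit_date"

cd ..
rm -rf "$repo"
```

The author dates of the two copies should be equal, while the commit date of the cherry-picked copy reflects the moment of cherry-picking.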
How to lie with Cherry-picking?
There might be people who would like to game KEDEHub in order to get a higher KEDE. There are several ways to create fraudulent commits using cherry-pick.
If you supply --reset-author as a command-line flag, git commit will reset the author to what is configured, or to whomever you name. This also renews the author timestamp[3]. You can also specify an author date at this point in the same way.
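The effect of --reset-author can be observed on a single commit. The sketch below assumes a throwaway repository, a made-up original author, and illustrative variable names. It combines --reset-author with --amend, one of the combinations git commit accepts.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Committer"
git config user.email "committer@example.com"

echo 1 > 1
git add 1
# Commit on behalf of a different, made-up author at a fixed past date.
GIT_AUTHOR_DATE="2022-05-19T15:38:37+03:00" \
git commit -q -m c1 --author="Someone Else <someone@example.com>"

before_name=$(git show -s --format=%an HEAD)
before_date=$(git show -s --format=%aI HEAD)

# Rewrite the commit: the author becomes the configured identity
# and the author timestamp is renewed.
git commit -q --amend --no-edit --reset-author

after_name=$(git show -s --format=%an HEAD)
after_date=$(git show -s --format=%aI HEAD)

echo "before: $before_name $before_date"
echo "after:  $after_name $after_date"

cd ..
rm -rf "$repo"
```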
The git cherry-pick command won't pass --reset-author when it runs git commit, but here is what you can do:
- If you run git cherry-pick -n instead of git cherry-pick, then the cherry-pick command won't run git commit. You'll have to run it yourself. You can run it with --reset-author, and hence adjust the author and date.
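A minimal sketch of the -n path follows. It assumes a throwaway repository, a fixed past date for the original commit, and illustrative variable names. For simplicity it runs a plain git commit -m after git cherry-pick -n rather than passing --reset-author, which is already enough for the new commit to get a fresh author timestamp.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo 1 > 1
git add 1
git commit -q -m c1
base=$(git symbolic-ref --short HEAD)

git checkout -q -b dev
echo 2 > 2
git add 2
GIT_AUTHOR_DATE="2022-05-19T16:08:57+03:00" git commit -q -m c2
oid=$(git rev-parse HEAD)
original_date=$(git show -s --format=%aI "$oid")

git checkout -q "$base"
# -n stages the change from c2 but does not create a commit.
git cherry-pick -n "$oid"
# Committing by hand records the current time as the author date.
git commit -q -m c2
new_date=$(git show -s --format=%aI HEAD)

echo "original author date: $original_date"
echo "new author date:      $new_date"

cd ..
rm -rf "$repo"
```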
Here is an example of how to change the author date with cherry-picking:
rm -rf 1
mkdir 1
cd 1
git init
echo 1 > 1
git add 1
git commit -m c1
git checkout -b dev
echo 2 > 2
git add 2
git commit -m c2
git log --all --reverse --date-order --format=fuller
----------
commit 677233c000425b3bd0348f19b056ce014de6e2ea (master)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 16:08:42 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 16:08:42 2022 +0300
c1
commit 7c26cb9be9bef362de0805c8620e002be269085f (HEAD -> dev)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 16:08:57 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 16:08:57 2022 +0300
c2
----------
c2=`git rev-parse HEAD`
sleep 1
git checkout master
git cherry-pick "$c2" -n
git commit --amend --no-edit --date 'now'
git log --all --reverse --date-order --format=fuller
----------
commit 677233c000425b3bd0348f19b056ce014de6e2ea
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 16:08:42 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 16:08:42 2022 +0300
c1
commit 7c26cb9be9bef362de0805c8620e002be269085f (dev)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 16:08:57 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 16:08:57 2022 +0300
c2
commit c1c67487ff10dbd4350f3c5c4ac1f2145ad02bdc (HEAD -> master)
Author: Dimitar Bakardzhiev
AuthorDate: Thu May 19 16:09:56 2022 +0300
Commit: Dimitar Bakardzhiev
CommitDate: Thu May 19 16:09:56 2022 +0300
c1
----------
cd ..
rm -rf 1
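We see that a new commit has been created on master with a different author date. Because git commit --amend rewrote the commit at HEAD, the new commit carries the message c1, but it also contains the change cherry-picked from c2. In this way code changes are duplicated and, if counted in KEDE calculations, would inflate individual performance.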
References
1. Git Basics - Undoing Things
2. git-cherry-pick - Apply the changes introduced by some existing commits