KEDEHub Starter
How to install KEDEHub on your laptop in 10 minutes
Overview
Welcome to an exclusive exploration of KEDEHub!
-
KEDEHub Starter is free to use for:
- Small teams with fewer than 10 active monthly contributors
- Open Source projects
If your organization has several small teams, each of them may use KEDEHub Starter for free.
-
Duration
Your journey with KEDEHub is unlimited in duration - hopefully forever!
-
Feedback & Support
Your feedback drives our innovation. Share your experience, queries, or challenges anytime during the trial. Our dedicated support team is here to help. Contact us at support@kedehub.com.
Your insights and satisfaction are paramount to us. Dive in, explore, and let's improve your organization together with KEDEHub.
How to install KEDEHub locally
Pull KEDEHub Image
The KEDEHub image is hosted on Amazon Elastic Container Registry (ECR) and is public, so you can pull and use it without authenticating.
-
Image location:
public.ecr.aws/kedehub/kedehub-localhost-image
-
Pull the Image:
You can pull the image using the Docker command:
docker pull public.ecr.aws/kedehub/kedehub-localhost-image && docker tag public.ecr.aws/kedehub/kedehub-localhost-image kedehub-localhost-image
You can pull the image using the Podman command:
podman pull public.ecr.aws/kedehub/kedehub-localhost-image && podman tag public.ecr.aws/kedehub/kedehub-localhost-image kedehub-localhost-image
-
Ensure the kedehub-localhost-image is there.
You can list the current images using Docker
docker images
You can list the current images using Podman
podman images
Run the image
After pulling the image, you can run a container based on it by following the steps below:
-
Before running the container, create a volume.
This volume will store your data, ensuring it persists even if the container is stopped or deleted.
docker volume create kedehubdata
podman volume create kedehubdata
-
Ensure the volume is created:
You can list the current Volumes and check if kedehubdata is there.
docker volume ls
podman volume ls
-
Run KEDEHub Container.
docker run --name kedehub-localhost-container -p 8080:80 -p 5400:5400 -v kedehubdata:/var/lib/postgresql/data -d kedehub-localhost-image
podman run --name kedehub-localhost-container -p 8080:80 -p 5400:5400 -v kedehubdata:/var/lib/postgresql/data -d kedehub-localhost-image
When you create a new container and attach the same volume, the data will be available to the new container. With this setup, all the data will be stored in the kedehubdata volume. Even if you stop or remove the container, the data will remain in the volume.
Check if the container is running:
docker container ls -a
podman container ls -a
You should see something like:
CONTAINER ID   IMAGE                     COMMAND                  CREATED          STATUS          PORTS                                                                                        NAMES
a7d5e3bf41a0   kedehub-localhost-image   "/usr/local/bin/star…"   32 seconds ago   Up 30 seconds   0.0.0.0:5400->5400/tcp, :::5400->5400/tcp, 5432/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp   kedehub-localhost-container
-
Test using the below command if the KEDEHub instance is up and running:
curl -X 'GET' 'http://localhost:8080/api/public/projects/count' -H 'accept: application/json'
If everything is OK it should return:
{"count": 0}
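The same check can be scripted. The following is a minimal sketch, assuming the endpoint returns the JSON shape shown above; the function names parse_project_count and fetch_project_count are hypothetical and not part of KEDEHub.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
COUNT_URL = "http://localhost:8080/api/public/projects/count"


def parse_project_count(body: str) -> int:
    """Extract the project count from a JSON body such as '{"count": 0}'."""
    payload = json.loads(body)
    count = payload["count"]
    if not isinstance(count, int) or count < 0:
        raise ValueError(f"unexpected count value: {count!r}")
    return count


def fetch_project_count(url: str = COUNT_URL, timeout: float = 5.0) -> int:
    """Call the endpoint and return the count; raises if the instance is unreachable."""
    request = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_project_count(response.read().decode("utf-8"))
```

Calling fetch_project_count() right after starting the container may fail for a few seconds while the services come up, so a retry loop around it is a reasonable addition.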
See the KEDEHub website
http://localhost:8080/
-
Create a new company
http://localhost:8080/signup_company/#/
The designated admin user will receive an invitation email at the address provided during company creation. Click the link in that email to confirm that the company is created.
Analyze Git repositories
KEDEGit is a Python application responsible for:
- Analyzing local Git repositories
- Sending commands to the KEDEHub
Analysis is performed on local clones of Git repositories, ensuring:
- Your source code and commit messages remain secure on your premises, with no transfer to KEDEHub
- No capture of your intellectual property through source code analysis
- No analysis of commit messages
The client application KEDEGit is an open-source project, which can be found here. KEDEGit must be installed on the same computer as KEDEHub and needs network access to the directory where the target Git repositories are cloned. Your user must also be the owner of the target Git repository directory.
Clone KEDEGit
In this guide the KEDEGit directory is:
-
For Mac/Linux:
~/git/kedegit/
-
For Windows:
C:\Users\dimit\git\kedegit
where ~/ equals C:\Users\dimit for the user 'dimit'.
Navigate to the ~/git/ directory and clone the KEDEGit repo:
git clone https://github.com/kedehub/kedegit.git
Then clone the KEDEMatcher repo:
git clone https://github.com/kedehub/kedematcher.git
Configure KEDEGit
Configuration directory
KEDEGit uses Confuse for managing its configuration. In our case the application name is KedeGit. The configuration paths for different platforms are listed here. Users can also add an override configuration directory with an environment variable. The environment variable name for KEDEGit is KEDEGITDIR.
In this guide the configuration directory is:
-
For Mac/Linux containers:
~/git/kedegit/docs
-
For Windows containers:
%HOME%\git\kedegit\docs
-
For Mac executables:
~/.config/KedeGit
-
For Linux executables:
~/.config/KedeGit
-
For Windows executables:
%HOME%\AppData\Roaming\KedeGit
If you run KEDEGit and KEDEMatcher as containers you can change it later to any other directory.
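To make the lookup order concrete, here is a minimal sketch that mirrors only the executable paths listed in this guide and the KEDEGITDIR override. It is an illustration, not Confuse's actual search logic, which checks more locations; the function name config_dir is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def config_dir(platform: str, env: dict) -> str:
    """Return the configuration directory for executables, as listed in this guide.

    platform is "windows", "mac" or "linux"; env is a mapping such as os.environ.
    """
    override = env.get("KEDEGITDIR")
    if override:
        # The environment variable takes precedence over the platform default.
        return override
    home = env.get("HOME", "~")
    if platform == "windows":
        return str(PureWindowsPath(home) / "AppData" / "Roaming" / "KedeGit")
    # Mac and Linux executables share the same location in this guide.
    return str(PurePosixPath(home) / ".config" / "KedeGit")
```

Passing the environment as a plain mapping keeps the function easy to check without touching the real environment; in a script you would pass os.environ.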
Set Allowed and excluded file types
Allowed and excluded file types are in ~/git/kedegit/docs/kede-config.json. Edit the file if needed to match your architecture, technology and preferences.
Copy kede-config.json to the same configuration directory.
Set configuration file
For Mac/Linux, execute the command below to create a new empty config.yaml:
cp ~/git/kedegit/docs/docker_empty_config.yaml ~/git/kedegit/docs/config.yaml
For Windows, copy docker_empty_config.yaml as config.yaml.
Open config.yaml and add values for company name, user and token from your invitation email.
server:
  protocol: http
  host: host.docker.internal
  port: 5400
company:
  name:
  user:
  token:
If you are using Podman, replace host: host.docker.internal with host: host.containers.internal as the hostname to access services running on the host machine. This hostname resolves to an IP address assigned to the host on a bridge network managed by Podman.
If you are running KEDEGit as an executable, replace host: host.docker.internal with host: localhost as the hostname to access services running on the host machine.
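The three host choices can be captured in a small generator. This is a sketch under assumptions: it assumes the nested layout of config.yaml (protocol, host and port under server; name, user and token under company), and the names HOSTS and render_config are hypothetical. Values are written unquoted, so anything containing YAML special characters would need quoting by hand.

```python
# Hostnames per way of running KEDEGit, as described above.
HOSTS = {
    "docker": "host.docker.internal",
    "podman": "host.containers.internal",
    "executable": "localhost",
}


def render_config(runtime: str, company: str, user: str, token: str) -> str:
    """Render config.yaml text for "docker", "podman" or "executable"."""
    host = HOSTS[runtime]  # raises KeyError for an unknown runtime
    lines = [
        "server:",
        "  protocol: http",
        f"  host: {host}",
        "  port: 5400",
        "company:",
        f"  name: {company}",
        f"  user: {user}",
        f"  token: {token}",
    ]
    # Join with LF only, so the file has consistent line endings.
    return "\n".join(lines) + "\n"
```

Writing the result with open(path, "w", encoding="utf-8", newline="\n") keeps the LF endings on Windows as well.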
Check for Windows-Specific Formatting Issues
Ensure that:
- Line endings (CRLF vs. LF) are consistent. You can convert line endings using an editor like Notepad++ or within Visual Studio Code (VS Code).
- No special characters, such as non-ASCII characters or a BOM (Byte Order Mark, \ufeff), are accidentally inserted, especially if the file is edited on different systems.
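These checks can be automated. The sketch below inspects the raw bytes of a file for a UTF-8 BOM, mixed line endings and non-ASCII characters; the function name find_format_issues is hypothetical. Non-ASCII characters may be legitimate (for example in a company name), so treat that finding as a hint rather than an error.

```python
import codecs


def find_format_issues(data: bytes) -> list:
    """Return a list of human-readable problems found in the raw file bytes."""
    issues = []
    has_bom = data.startswith(codecs.BOM_UTF8)
    if has_bom:
        issues.append("file starts with a UTF-8 BOM")
    # Skip the BOM itself so it is not also reported as non-ASCII content.
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    crlf = body.count(b"\r\n")
    lf_only = body.count(b"\n") - crlf
    if crlf and lf_only:
        issues.append("mixed CRLF and LF line endings")
    try:
        body.decode("ascii")
    except UnicodeDecodeError:
        issues.append("non-ASCII characters present")
    return issues
```

A typical use is find_format_issues(open("config.yaml", "rb").read()), run once after editing the file on Windows.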
There are two options to run KEDEGit:
- Running in a docker container
- Running as a system executable file
Running KEDEGit inside a docker container
Pull the KEDEGit image
Run the following command:
docker pull public.ecr.aws/kedehub/kedegit-image && docker tag public.ecr.aws/kedehub/kedegit-image kedegit-image
podman pull public.ecr.aws/kedehub/kedegit-image && podman tag public.ecr.aws/kedehub/kedegit-image kedegit-image
Test the KEDEGit Container
docker run --rm --add-host=host.docker.internal:host-gateway --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit kedegit-image:latest list-projects
podman run --rm --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit kedegit-image:latest list-projects
For Windows run:
podman run --rm --name kedegit-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit -v c:\Users\dimit\git:/usr/data kedegit-image:latest list-projects
Initializing a New Project
For testing KEDEGit we will use its source code repository located at https://github.com/kedehub/kedegit. We will clone the repository at `~/git/kedegit`.
Now, using the below command, we will initialize a new project called NEW_PROJECT, with the source code of the local Git repository located at `~/git/kedegit`.
docker run --rm --add-host=host.docker.internal:host-gateway --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest init-project NEW_PROJECT /usr/data/kedegit
podman run --rm --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest init-project NEW_PROJECT /usr/data/kedegit
In case you are running Podman on Windows, you need to pay attention to how you mount the volume of the local git folder from Windows paths into a Linux container. Podman supports several notation schemes, as presented here:
podman run --rm --name kedegit-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit -v c:\Users\dimit\git:/usr/data kedegit-image:latest init-project NEW_PROJECT /usr/data/kedegit
You may also need to adjust permissions on your local repository Windows folder or use the 'ro' option as explained in the Appendix.
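Because the Docker and Podman commands differ only in a few places, they can be assembled programmatically. This is a minimal sketch that reproduces the flags used in this guide; the function name kedegit_run_command and its parameters are hypothetical.

```python
def kedegit_run_command(runtime, config_dir, repos_dir, args, read_only=False):
    """Build the argument list for running a KEDEGit command in a container.

    runtime is "docker" or "podman"; args is the KEDEGit command line,
    for example ["init-project", "NEW_PROJECT", "/usr/data/kedegit"].
    """
    if runtime not in ("docker", "podman"):
        raise ValueError(f"unsupported runtime: {runtime}")
    command = [runtime, "run", "--rm"]
    if runtime == "docker":
        # Mirrors the Docker commands in this guide, which map host.docker.internal.
        command.append("--add-host=host.docker.internal:host-gateway")
    repos_mount = f"{repos_dir}:/usr/data" + (":ro" if read_only else "")
    command += [
        "--name", "kedegit-container",
        "-v", f"{config_dir}:/root/.config/KedeGit",
        "-v", repos_mount,
        "kedegit-image:latest",
    ]
    return command + list(args)
```

The list can be passed to subprocess.run(command, check=True). Since no shell is involved, "~" is not expanded, so pass absolute paths (for example via os.path.expanduser).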
The following is the output from the init-project command:
Adding repo: /usr/data/kedegit to project NEW_PROJECT
Assigned project ID: new_project to project: NEW_PROJECT
Processing Repository: https://github.com/kedehub/kedegit.git
Processing commits: 100%|██████████| 37/37 [00:03<00:00, 10.99it/s]
Updating templates for persons: 100%|██████████| 3/3 [00:00<00:00, 25.92it/s]
Calculating Daily KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 24.87it/s]
Calculating Weekly KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 31.39it/s]
Successfully initialized project with ID = new_project
NEW_PROJECT will be the name you see in the KEDEHub web client as explained here.
After successfully initializing the new project, the system will display the new project ID as follows:
Assigned project ID: PROJECT_ID to project: PROJECT
Make sure to note down the PROJECT_ID, as it will be essential for all subsequent work related to the newly created PROJECT.
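If you script the initialization, the project ID can be read from the command output instead of being copied by hand. A minimal sketch, assuming the output line keeps the format shown above; the names ASSIGNED_RE and extract_project_id are hypothetical.

```python
import re

# Matches the line format shown above: "Assigned project ID: <id> to project: <name>"
ASSIGNED_RE = re.compile(r"^Assigned project ID: (\S+) to project: (.+)$", re.MULTILINE)


def extract_project_id(output: str):
    """Return (project_id, project_name) from init-project output, or None if absent."""
    match = ASSIGNED_RE.search(output)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()
```

Returning None rather than raising lets the caller decide whether a missing line is an error, for example when the command failed before assigning an ID.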
Adding a New Repository to an Existing Project
Now, we will add another repository to `NEW_PROJECT`. That will be the KEDEMatcher located at https://github.com/kedehub/kedematcher. We will clone it in a new local Git repository located at `~/git/kedematcher`. Then we can execute the below commands.
For Mac/Linux run:
docker run --rm --add-host=host.docker.internal:host-gateway --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest add-repository new_project /usr/data/kedematcher
podman run --rm --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest add-repository new_project /usr/data/kedematcher
For Windows run:
podman run --rm --name kedegit-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit -v c:\Users\dimit\git:/usr/data kedegit-image:latest add-repository new_project /usr/data/kedematcher
The following is the output from the add-repository command:
Adding repo: /usr/data/kedematcher to project new_project
Assigned project ID: new_project to project: new_project
Processing Repository: https://github.com/kedehub/kedematcher.git
Processing commits: 100%|██████████| 15/15 [00:01<00:00, 13.18it/s]
Processed Repository: https://github.com/kedehub/kedematcher.git
Updating templates for persons: 100%|██████████| 3/3 [00:00<00:00, 23.84it/s]
Calculating Daily KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 22.29it/s]
Calculating Weekly KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 29.82it/s]
Successfully initialized project with ID = new_project
Updating Project Statistics
The update-projects command performs the following actions:
- Analyzes all new commits
- Calculates KEDE and other statistics for the new commits
To update the statistics for all projects within a company with new code contributions, execute:
docker run --rm --add-host=host.docker.internal:host-gateway --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest update-projects
podman run --rm --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest update-projects
For Windows run:
podman run --rm --name kedegit-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit -v c:\Users\dimit\git:/usr/data kedegit-image:latest update-projects
The following is the output from the update-projects command:
Updating Kedehub for project: new_project, #1 of 1
Processing Repository: https://github.com/kedehub/kedegit.git
Processing commits: 100%|██████████| 39/39 [00:00<00:00, 269.49it/s]
Processed Repository: https://github.com/kedehub/kedegit.git
Processing Repository: https://github.com/kedehub/kedematcher.git
Processing commits: 100%|██████████| 15/15 [00:00<00:00, 409.60it/s]
Processed Repository: https://github.com/kedehub/kedematcher.git
Successfully updated 0 out of 1 projects
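When update-projects runs from a scheduler, the summary line is the easiest thing to check. The sketch below only reads the two numbers from that line; how KEDEGit decides that a project counts as updated is not described here, so the interpretation is left to the caller. The names SUMMARY_RE and parse_update_summary are hypothetical.

```python
import re

# Matches the summary line shown above.
SUMMARY_RE = re.compile(r"Successfully updated (\d+) out of (\d+) projects")


def parse_update_summary(output: str):
    """Return (updated, total) from update-projects output, or None if no summary line."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```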
Updating Project Statistics for a single existing project with PROJECT_ID_1
For Mac/Linux run:
docker run --rm --add-host=host.docker.internal:host-gateway --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest update-projects -p PROJECT_ID_1
podman run --rm --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest update-projects -p PROJECT_ID_1
For Windows run:
podman run --rm --name kedegit-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit -v c:\Users\dimit\git:/usr/data kedegit-image:latest update-projects -p PROJECT_ID_1
Updating Project Statistics for multiple existing projects e.g. PROJECT_ID_1 PROJECT_ID_2
For Mac/Linux run:
docker run --rm --add-host=host.docker.internal:host-gateway --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest update-projects -p PROJECT_ID_1 PROJECT_ID_2
podman run --rm --name kedegit-container -v ~/git/kedegit/docs:/root/.config/KedeGit -v ~/git:/usr/data kedegit-image:latest update-projects -p PROJECT_ID_1 PROJECT_ID_2
For Windows run:
podman run --rm --name kedegit-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit -v c:\Users\dimit\git:/usr/data kedegit-image:latest update-projects -p PROJECT_ID_1 PROJECT_ID_2
More information here.
Running KEDEGit as system executable
Download KEDEGit executable for your environment
From here, download the latest KEDEGit into C:\Users\dimit\git\kedegit\dist\win_dist\kedegit for Windows or ~/git/kedegit/dist/mac_dist/kedegit for Mac/Linux.
Note: For Windows use the latest PowerShell!
Test KEDEGit
For Mac/Linux run:
~/git/kedegit/dist/mac_dist/kedegit list-projects
For Windows run:
C:\Users\dimit\git>.\kedegit\dist\win_dist\kedegit list-projects
That command should return nothing.
Initializing a New Project
For testing KEDEGit we will use its source code repository located at https://github.com/kedehub/kedegit. We will clone the repository at `~/git/kedegit`.
Now, using the below command, we will initialize a new project called NEW_PROJECT, with the source code of the local Git repository located at `~/git/kedegit`.
For Mac/Linux run:
~/git/kedegit/dist/mac_dist/kedegit init-project NEW_PROJECT ~/git/kedegit
For Windows run:
C:\Users\dimit\git> .\kedegit\dist\win_dist\kedegit init-project NEW_PROJECT C:\Users\dimit\git\kedegit
The following is the output from the init-project command:
Adding repo: C:\Users\dimit\git\kedegit to project NEW_PROJECT
Assigned project ID: new_project to project: NEW_PROJECT
Processing Repository: https://github.com/kedehub/kedegit.git
Processing commits: 100%|██████████| 37/37 [00:03<00:00, 10.99it/s]
Updating templates for persons: 100%|██████████| 3/3 [00:00<00:00, 25.92it/s]
Calculating Daily KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 24.87it/s]
Calculating Weekly KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 31.39it/s]
Successfully initialized project with ID = new_project
NEW_PROJECT will be the name you see in the KEDEHub web client as explained here.
After successfully initializing the new project, the system will display the new project ID as follows:
Assigned project ID: PROJECT_ID to project: PROJECT
Make sure to note down the PROJECT_ID, as it will be essential for all subsequent work related to the newly created PROJECT.
Adding a New Repository to an Existing Project
Now, we will add another repository to `NEW_PROJECT`. That will be the KEDEMatcher located at https://github.com/kedehub/kedematcher. We will clone it in a new local Git repository located at `~/git/kedematcher`. Then we can execute the below command:
For Mac/Linux run:
~/git/kedegit/dist/mac_dist/kedegit add-repository new_project ~/git/kedematcher
For Windows run:
C:\Users\dimit\git> .\kedegit\dist\win_dist\kedegit add-repository new_project C:\Users\dimit\git\kedematcher\
The following is the output from the add-repository command:
Adding repo: C:\Users\dimit\git\kedematcher to project new_project
Assigned project ID: new_project to project: new_project
Processing Repository: https://github.com/kedehub/kedematcher.git
Processing commits: 100%|██████████| 15/15 [00:01<00:00, 13.18it/s]
Processed Repository: https://github.com/kedehub/kedematcher.git
Updating templates for persons: 100%|██████████| 3/3 [00:00<00:00, 23.84it/s]
Calculating Daily KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 22.29it/s]
Calculating Weekly KEDE for persons: 100%|██████████| 3/3 [00:00<00:00, 29.82it/s]
Successfully initialized project with ID = new_project
Updating Project Statistics
The update-projects command performs the following actions:
- Analyzes all new commits
- Calculates KEDE and other statistics for the new commits
For Mac/Linux run:
~/git/kedegit/dist/mac_dist/kedegit update-projects
For Windows run:
C:\Users\dimit\git> .\kedegit\dist\win_dist\kedegit update-projects
The following is the output from the update-projects command:
Updating Kedehub for project: new_project, #1 of 1
Processing Repository: https://github.com/kedehub/kedegit.git
Processing commits: 100%|██████████| 39/39 [00:00<00:00, 269.49it/s]
Processed Repository: https://github.com/kedehub/kedegit.git
Processing Repository: https://github.com/kedehub/kedematcher.git
Processing commits: 100%|██████████| 15/15 [00:00<00:00, 409.60it/s]
Processed Repository: https://github.com/kedehub/kedematcher.git
Successfully updated 0 out of 1 projects
Updating Project Statistics for a single existing project with PROJECT_ID_1
For Mac/Linux run:
~/git/kedegit/dist/mac_dist/kedegit update-projects -p PROJECT_ID_1
For Windows run:
C:\Users\dimit\git> .\kedegit\dist\win_dist\kedegit update-projects -p PROJECT_ID_1
Updating Project Statistics for multiple existing projects e.g. PROJECT_ID_1 PROJECT_ID_2
For Mac/Linux run:
~/git/kedegit/dist/mac_dist/kedegit update-projects -p PROJECT_ID_1 PROJECT_ID_2
For Windows run:
C:\Users\dimit\git> .\kedegit\dist\win_dist\kedegit update-projects -p PROJECT_ID_1 PROJECT_ID_2
More information here.
Merge Identities
Identity matching in the context of Git involves the process of accurately identifying and distinguishing developers based on the various email addresses and names they use when committing work. Developers may use a range of email types, such as corporate, personal, or even anonymous addresses like "users.noreply.github.com." Similarly, the names they commit under can vary significantly, including full names with or without surnames, names with typographical errors, pseudonyms, or sometimes even missing names. The challenge of identity matching lies in aggregating these diverse identities for each individual developer and differentiating them from the identities of other developers. This process is crucial for obtaining precise information about a developer's contributions and activities in Git repositories.
KEDEMatcher addresses this problem by performing semi-automatic identity recognition. The client application KEDEMatcher is an open-source project, which can be found here. KEDEMatcher must be installed on the same computer as KEDEHub and needs network access to the directory where the target Git repositories are cloned.
Note: The Mac OS M1 architecture is not supported. This is caused by an issue with the apjv package.
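To illustrate what aggregating identities means, the sketch below groups (name, email) pairs that share a normalized email address or a normalized name, using a small union-find structure. It is an illustration of the problem only and is not KEDEMatcher's algorithm; the function name group_identities is hypothetical. Matching on names alone can merge two different people with the same name, which is one reason a semi-automatic approach is useful.

```python
from collections import defaultdict


def group_identities(identities):
    """Group (name, email) pairs that share a normalized email or a normalized name.

    Illustrative only: this is NOT how KEDEMatcher matches identities.
    """
    parent = list(range(len(identities)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}
    for index, (name, email) in enumerate(identities):
        keys = []
        if email.strip():
            keys.append(("email", email.strip().lower()))
        if name.strip():
            keys.append(("name", " ".join(name.lower().split())))
        for key in keys:
            if key in first_seen:
                # Same email or name seen before: merge the two groups.
                parent[find(index)] = find(first_seen[key])
            else:
                first_seen[key] = index

    groups = defaultdict(list)
    for index, identity in enumerate(identities):
        groups[find(index)].append(identity)
    return list(groups.values())
```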
Clone KEDEMatcher
Navigate to the ~/git/ directory and clone the KEDEMatcher repo:
git clone https://github.com/kedehub/kedematcher.git
Configure KEDEMatcher
KEDEMatcher uses the same configuration as KEDEGit. Thus, if not already set up, go and set up KEDEGit as explained here.
There are two options to run KEDEMatcher:
- Running in a docker container
- Running as a system executable file
Running KEDEMatcher inside a docker container
Pull the KEDEMatcher docker image
Run the following command:
docker pull public.ecr.aws/kedehub/kedematcher-image && docker tag public.ecr.aws/kedehub/kedematcher-image kedematcher-image
podman pull public.ecr.aws/kedehub/kedematcher-image && podman tag public.ecr.aws/kedehub/kedematcher-image kedematcher-image
Merge identities for a single project
The identity-merge command will:
- Determine all authors who belong to the same individual
- Create a new KEDEHub user for that individual
To merge identities for a single project, run the command below. Make sure to use the PROJECT_ID (in this case 'new_project'), not the project name:
docker run --rm --name kedematcher-container -v ~/git/kedegit/docs:/root/.config/KedeGit kedematcher-image:latest identity-merge -p new_project
podman run --rm --name kedematcher-container -v ~/git/kedegit/docs:/root/.config/KedeGit kedematcher-image:latest identity-merge -p new_project
In case you are running Podman on Windows, you need to pay attention to how you mount the volume of the local git folder from Windows paths into a Linux container. Podman supports several notation schemes, as presented here:
podman run --rm --name kedematcher-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit kedematcher-image:latest identity-merge -p new_project
You may also need to adjust permissions on your local repository Windows folder or use the 'ro' option as explained in the Appendix.
The following is the output from the identity-merge command:
First pass...
Matching by: EmailMatcher: 67%|██████▋ | 2/3 [00:00<00:00, 5155.87it/s]
Matching by: EmailNameMatcher: 67%|██████▋ | 2/3 [00:00<00:00, 496.72it/s]
Saving users: 100%|██████████| 2/2 [00:00<00:00, 23.99it/s]
Successfully merged 2 into 2 users with 0 authors for 1 projects. Created 2 new users.
Second pass...
Matching by: EmailNameMatcher: 67%|██████▋ | 2/3 [00:00<00:00, 3128.91it/s]
Saving users: 100%|██████████| 2/2 [00:00<00:00, 40329.85it/s]
Successfully merged 2 into 2 users with 0 authors for 1 projects. Created 0 new users.
Merge identities for a company
To merge identities on all projects for a company, use:
docker run --rm --name kedematcher-container -v ~/git/kedegit/docs:/root/.config/KedeGit kedematcher-image:latest identity-merge
podman run --rm --name kedematcher-container -v ~/git/kedegit/docs:/root/.config/KedeGit kedematcher-image:latest identity-merge
In case you are running Podman on Windows, you need to pay attention to how you mount the volume of the local git folder from Windows paths into a Linux container. Podman supports several notation schemes, as presented here:
podman run --rm --name kedematcher-container -v c:\Users\dimit\git\kedegit\docs:/root/.config/KedeGit kedematcher-image:latest identity-merge
You may also need to adjust permissions on your local repository Windows folder or use the 'ro' option as explained in the Appendix.
More information here.
Running KEDEMatcher as system executable
Download KEDEMatcher executable for your environment
From here, download the latest KEDEMatcher into C:\Users\dimit\git\kedematcher\dist\win_dist\kedematcher for Windows or ~/git/kedematcher/dist/mac_dist/kedematcher for Mac/Linux.
Note: For Windows use the latest PowerShell!
Merge identities for a single project
The identity-merge command will:
- Determine all authors who belong to the same individual
- Create a new KEDEHub user for that individual
To merge identities for a single project, run the command below. Make sure to use the PROJECT_ID (in this case 'new_project'), not the project name.
For Mac/Linux run:
~/git/kedematcher/dist/mac_dist/kedematcher identity-merge -p new_project
For Windows run:
C:\Users\dimit\git> .\kedematcher\dist\win_dist\kedematcher identity-merge -p new_project
The following is the output from the identity-merge command:
First pass...
Matching by: EmailMatcher: 67%|██████▋ | 2/3 [00:00<00:00, 5155.87it/s]
Matching by: EmailNameMatcher: 67%|██████▋ | 2/3 [00:00<00:00, 496.72it/s]
Saving users: 100%|██████████| 2/2 [00:00<00:00, 23.99it/s]
Successfully merged 2 into 2 users with 0 authors for 1 projects. Created 2 new users.
Second pass...
Matching by: EmailNameMatcher: 67%|██████▋ | 2/3 [00:00<00:00, 3128.91it/s]
Saving users: 100%|██████████| 2/2 [00:00<00:00, 40329.85it/s]
Successfully merged 2 into 2 users with 0 authors for 1 projects. Created 0 new users.
Merge identities for a company
To merge identities on all projects for a company, use:
For Mac/Linux run:
~/git/kedematcher/dist/mac_dist/kedematcher identity-merge
For Windows run:
C:\Users\dimit\git> .\kedematcher\dist\win_dist\kedematcher identity-merge
More information here.
Successful Installation
The final result can be seen when you go to the "Organization" tab on the KEDEHub web interface. You should see the image below:
In the header menu, notice that:
- There is one project count on the 'Projects' tab
- There is one active user count on the "People" tab
Appendix
Git's handling of directory ownership within a container
When mapping volumes from a Windows host to a Linux container, Windows filesystem permissions do not translate directly to Linux permissions.
When using KEDEGit and KEDEMatcher in a containerized environment like Docker or Podman, you may encounter issues related to Git's security feature that checks the ownership of the repository directory. This feature is designed to prevent unauthorized modifications by ensuring that the user executing Git commands matches the owner of the repository directory. In environments where volumes are mapped from Windows to Linux containers, discrepancies in user or group IDs can trigger these security checks, resulting in errors such as "detected dubious ownership" or "unsafe repository."
Here are some strategies for handling permission issues in volume mapping:
-
Change Repository Ownership:
On Windows, use the takeown command to change the owner of the repository folder to the user running the Git command:
takeown /f path_to_the_repository /r /d y
On Linux, use the chown command:
chown -R username:group path_to_the_repository
This ensures that the user executing KEDEGit and KEDEMatcher commands has ownership of the repository directory.
-
Examine and Adjust Permissions on your repository folder from Windows:
Adjust them to be as permissive as possible to test if it affects how they are perceived inside the container.
Here’s how you can check the permissions on Windows:
- Right-click on the repository folder.
- Select "Properties".
- Go to the "Security" tab to view or modify the permissions.
-
Read-Only Volume Mounting:
Using the ro (read-only) option with the -v or --volume flag when mounting directories in Podman or Docker is a viable way to manage some types of access and security issues, including problems like the "detected dubious ownership" or "unsafe repository" errors with Git. By mounting the directory as read-only, you ensure that the Git repository cannot be modified by any processes running inside the container, thus maintaining the integrity of your data. This approach doesn't require any changes to Git configurations or user permissions settings inside the container, making it simpler to manage and less error-prone.
Given that KEDEGit's Git usage inside the container is limited to read-only operations like git log, git show, and git diff, using the ro (read-only) mount option is an effective and straightforward solution to the "detected dubious ownership" or "unsafe repository" issues.
Here's how you could adjust your Podman or Docker command to mount the volume as read-only:
podman run -v path_to_the_repository:/usr/data:ro ...
docker run -v path_to_the_repository:/usr/data:ro ...
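A wrapper script can recognize these errors and retry with a read-only mount. The following is a minimal sketch that only matches the two messages quoted above and adjusts a volume specification; the names OWNERSHIP_MARKERS, is_ownership_error and as_read_only are hypothetical.

```python
# Substrings of the Git error messages quoted above.
OWNERSHIP_MARKERS = ("detected dubious ownership", "unsafe repository")


def is_ownership_error(stderr: str) -> bool:
    """Return True if the text contains one of Git's directory-ownership errors."""
    text = stderr.lower()
    return any(marker in text for marker in OWNERSHIP_MARKERS)


def as_read_only(volume: str) -> str:
    """Append the ro option to a 'host_path:container_path' volume spec if missing."""
    return volume if volume.endswith(":ro") else volume + ":ro"
```

For example, if a run fails and is_ownership_error(result.stderr) is true, the same command can be repeated with as_read_only applied to the repository volume.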
Fix DNS resolution in WSL2
-
Turn off generation of /etc/resolv.conf
Using your Linux prompt, modify (or create) /etc/wsl.conf with the following content:
[network]
generateResolvConf = false
using:
cd /etc
printf '[network]\ngenerateResolvConf = false\n' | sudo tee -a wsl.conf
-
Restart the WSL2 Virtual Machine
Exit all of your Linux prompts and run the following PowerShell command:
wsl --shutdown
-
Create a custom /etc/resolv.conf
Open a new Linux prompt and cd to /etc. If resolv.conf is soft linked to another file, remove the link with:
sudo rm resolv.conf
Create a new resolv.conf with the following content:
nameserver X.X.X.X
using:
sudo touch resolv.conf
echo "nameserver X.X.X.X" | sudo tee -a resolv.conf
-
Restart the WSL2 Virtual Machine:
wsl --shutdown
-
Start a new Linux prompt and check.
cat /etc/resolv.conf
cat /etc/wsl.conf
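The final check can also be done by a script that reads the contents of both files. A minimal sketch using Python's configparser for the INI-style wsl.conf; the function names are hypothetical. Note that parsing raises an error if wsl.conf contains an option without a preceding section header such as [network].

```python
import configparser


def resolv_conf_generation_disabled(wsl_conf_text: str) -> bool:
    """Return True if [network] generateResolvConf is set to false in wsl.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(wsl_conf_text)
    return (
        parser.has_option("network", "generateResolvConf")
        and not parser.getboolean("network", "generateResolvConf")
    )


def nameservers(resolv_conf_text: str) -> list:
    """Return the addresses from 'nameserver' lines in resolv.conf text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers
```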
Cloning Repositories on Windows from Azure DevOps
This PowerShell script uses a Personal Access Token (PAT) to authenticate with Azure DevOps and retrieve the list of repositories. It keeps the repositories whose URLs contain "SEARCH_WORD_1" or "SEARCH_WORD_2" (excluding those containing "SEARCH_WORD_1%") and then clones each matching repository. The PAT is embedded in each clone URL so that the clones can run unattended.
# Set PAT token
$PAT = "YOUR TOKEN"
$uri = "https://dev.azure.com/YOUR_ORG/_apis/git/repositories?api-version=6.0"
$headers = @{
    Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$($PAT)"))
}
# Make the API request to get the list of repositories
$response = (Invoke-RestMethod -Uri $uri -Method Get -Headers $headers)
# Keep URLs matching either search word, excluding those containing "SEARCH_WORD_1%"
$filteredResponse = ($response.value.remoteUrl | Where-Object {(($_ -like "*/SEARCH_WORD_1*") -or ($_ -like "*/SEARCH_WORD_2*")) -and ($_ -notlike "*SEARCH_WORD_1%*") })
foreach ($url in $filteredResponse) {
    # Replace the optional "user@" part of the URL with the PAT
    $gitURL = $url -replace '^https://([^@/]+@)?', "https://${PAT}@"
    git clone $gitURL 2> $null
}
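For readers who prefer Python, the three building blocks of the script (the Basic authorization header, the wildcard filter and the clone URL with the PAT) can be sketched as below. The function names are hypothetical, and the wildcard matching is made case-insensitive to approximate PowerShell's -like operator. Keep in mind that Git stores the remote URL in the repository's configuration, so a PAT embedded in the URL ends up on disk.

```python
import base64
import re
from fnmatch import fnmatchcase


def basic_auth_header(pat: str) -> dict:
    """Build the Basic authorization header: empty user name, PAT as password."""
    encoded = base64.b64encode(f":{pat}".encode("ascii")).decode("ascii")
    return {"Authorization": "Basic " + encoded}


def select_urls(urls, include_patterns, exclude_patterns):
    """Keep URLs matching any include wildcard pattern and no exclude pattern."""
    def matches(url, patterns):
        return any(fnmatchcase(url.lower(), p.lower()) for p in patterns)

    return [
        url for url in urls
        if matches(url, include_patterns) and not matches(url, exclude_patterns)
    ]


def clone_url_with_pat(remote_url: str, pat: str) -> str:
    """Replace the optional 'user@' part of an https URL with the PAT."""
    return re.sub(r"^https://([^@/]+@)?", lambda _m: f"https://{pat}@", remote_url, count=1)
```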
Updating Repositories on Windows from Azure DevOps
Here’s a PowerShell script to update the repositories that were initially cloned using the first script. This script assumes that each repository was cloned into a separate folder with a name that matches the repository's name.
# Set PAT token
$PAT = "YOUR TOKEN"
$uri = "https://dev.azure.com/YOUR_ORG/_apis/git/repositories?api-version=6.0"
$headers = @{
    Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$($PAT)"))
}
# Make the API request to get the list of repositories
$response = (Invoke-RestMethod -Uri $uri -Method Get -Headers $headers)
$filteredResponse = ($response.value.remoteUrl | Where-Object {(($_ -like "*/SEARCH_WORD_1*") -or ($_ -like "*/SEARCH_WORD_2*")) -and ($_ -notlike "*SEARCH_WORD_1%*") })
foreach ($url in $filteredResponse) {
    # Extract the repository name: the last URL segment, without an optional ".git" suffix
    $repoName = ($url -split '/')[-1] -replace '\.git$', ''
    # Define the local path of the repository
    $localPath = ".\$repoName"
    if (Test-Path $localPath) {
        Write-Host "Updating repository: $repoName"
        Set-Location -Path $localPath
        git pull 2> $null
        Set-Location -Path ..
    } else {
        Write-Host "Repository '$repoName' not found locally. Skipping update."
    }
}
Explanation
- Authentication: The script uses a PAT for authentication to access Azure DevOps.
- Filtering Repositories: Filters repositories to include only those with "SEARCH_WORD_1" or "SEARCH_WORD_2" in their URL.
- Updating Repositories: For each filtered repository, it attempts to navigate to the respective local directory and run git pull to update it. If the local folder doesn’t exist, it skips that repository.
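The update-or-skip decision described above can be sketched in Python as follows. The function names repo_name and plan_updates are hypothetical; the existence check is passed in as a parameter so the logic can be tried without touching the file system.

```python
from pathlib import Path


def repo_name(remote_url: str) -> str:
    """Return the last path segment of the URL, without a trailing '.git'."""
    name = remote_url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


def plan_updates(remote_urls, base_dir, exists=Path.is_dir):
    """Split URLs into (to_update, missing) local paths under base_dir.

    exists decides whether a local clone is present; it defaults to a directory check.
    """
    to_update, missing = [], []
    for url in remote_urls:
        path = Path(base_dir) / repo_name(url)
        (to_update if exists(path) else missing).append(path)
    return to_update, missing
```

Each path in to_update can then be refreshed with subprocess.run(["git", "pull"], cwd=path), which avoids changing the current directory back and forth.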