Authorization Type

Carbon uses access key-based authentication (a GitHub Personal Access Token) to connect to GitHub repositories.

Authorization Flow

Configuration

Step 1: Generate an Access Token

To let Carbon sync your repositories, create a GitHub Personal Access Token with the appropriate permissions (see the screenshot below). Follow the steps in the GitHub documentation on Managing Personal Access Tokens.

Step 2: Connect Your GitHub Account to Carbon

  • Submit your GitHub username and the generated Personal Access Token to our integration endpoint at /integrations/github or via the Carbon Connect UI.
  • Carbon then starts the synchronization process by listing your repositories and their content.

If your token expires, generate a new one with the necessary permissions and post it to the same endpoint (/integrations/github). Carbon automatically updates the data source with the new token, so synchronization continues without interruption.
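
The connection step can be sketched in Python as follows. This is a minimal sketch, not an official Carbon client: the base URL and header names are copied from the curl example in Step 4, while the body field names username and access_token are assumptions that should be checked against the API reference. The request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.carbon.ai"  # taken from the curl example in this guide


def build_github_connect_request(carbon_token, customer_id, username, access_token):
    """Build (but do not send) a POST request to /integrations/github."""
    # ASSUMPTION: "username" and "access_token" are hypothetical field names.
    body = {"username": username, "access_token": access_token}
    return urllib.request.Request(
        url=f"{BASE_URL}/integrations/github",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {carbon_token}",
            "customer-id": customer_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Replacing an expired token would reuse the same function with the newly generated token.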

Step 3: List and Sync Repositories

  • To retrieve the repositories your token has access to, use the /integrations/github/repos endpoint.
  • To sync the repositories, make a request to /integrations/github/sync_repos. This will store the repository content within Carbon, which can be accessed through the /integrations/items/list endpoint.
  • A maximum of 25 repositories can be synced per request.
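
Because of the 25-repository limit, a longer list of repositories has to be split across several requests. The sketch below shows one way to do that batching. The repos field name in the payload is an assumption, not a confirmed part of the /integrations/github/sync_repos request body.

```python
from typing import Dict, Iterator, List, Sequence

MAX_REPOS_PER_REQUEST = 25  # limit stated in this guide


def batch_repos(repos: Sequence[str], size: int = MAX_REPOS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` repositories."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(repos), size):
        yield list(repos[start:start + size])


def build_sync_payloads(repos: Sequence[str]) -> List[Dict[str, List[str]]]:
    """Return one request body per batch of repositories."""
    # ASSUMPTION: "repos" is a hypothetical field name for the repository list.
    return [{"repos": batch} for batch in batch_repos(repos)]
```

Each returned payload would then be posted to /integrations/github/sync_repos as a separate request.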

Step 4: Sync GitHub Files

After successfully connecting your GitHub account, you can use the following endpoints to browse and sync files from your repositories:

  • /integrations/items/list: Lists the files and folders within a repository.
  • /integrations/files/sync: Syncs specific files and folders from a repository.

Carbon currently supports syncing individual folders and files within repositories, rather than entire repositories. To list the items within a repository, pass the external_id (the GitHub repository ID) as the parent_id parameter.

Example request to list items within a repository:

curl --location 'https://api.carbon.ai/integrations/items/list' \
--header 'customer-id: ddfdc805-be49-441e-af8b-bb08cf50957e' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{
    "data_source_id": 19517,
    "parent_id": "carbon/hello-world"
}'
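
For callers who prefer Python, the same request could be built roughly as shown below. This is a sketch using only the standard library. The URL, headers, and body fields mirror the curl example above, and the function name is purely illustrative.

```python
import json
import urllib.request

LIST_ITEMS_URL = "https://api.carbon.ai/integrations/items/list"  # from the curl example


def build_list_items_request(carbon_token, customer_id, data_source_id, parent_id):
    """Build (but do not send) the list-items request for one repository."""
    body = {
        "data_source_id": data_source_id,
        # parent_id carries the repository's external_id, as described above.
        "parent_id": parent_id,
    }
    return urllib.request.Request(
        url=LIST_ITEMS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {carbon_token}",
            "customer-id": customer_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the request. The response format is not described in this guide, so parsing it is left out.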

Alternatively, you can use the pre-built Carbon Connect file picker to easily browse and sync files.

Step 5: Sync Additional GitHub Data (Optional)

In addition to syncing files from repositories, you can fetch data directly from GitHub via the following endpoints:

  • /integrations/data/github/pull_requests: Lists all pull requests for a repository
  • /integrations/data/github/pull_requests/{pull_number}: Retrieves a specific pull request
  • /integrations/data/github/pull_requests/comments: Fetches comments on a pull request
  • /integrations/data/github/pull_requests/files: Retrieves files that were changed
  • /integrations/data/github/pull_requests/commits: Retrieves a list of commits on a pull request
  • /integrations/data/github/issues: Lists repository issues
  • /integrations/data/github/issues/{issue_number}: Retrieves a specific issue

By default, responses are returned with Carbon's mappings applied. Every endpoint also accepts an include_remote_data option that includes the entire GitHub response.
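
The endpoint paths and the include_remote_data option can be assembled with small helpers like the ones below. This is a sketch under stated assumptions: the helper names are hypothetical, and sending include_remote_data as a field in the request body (rather than, say, a query parameter) is an assumption that should be verified against the API reference.

```python
from typing import Any, Dict, Optional

DATA_PREFIX = "/integrations/data/github"  # common prefix of the endpoints listed above


def pull_request_path(pull_number: Optional[int] = None) -> str:
    """Path for listing pull requests, or for one pull request if a number is given."""
    if pull_number is None:
        return f"{DATA_PREFIX}/pull_requests"
    return f"{DATA_PREFIX}/pull_requests/{int(pull_number)}"


def issue_path(issue_number: Optional[int] = None) -> str:
    """Path for listing issues, or for one issue if a number is given."""
    if issue_number is None:
        return f"{DATA_PREFIX}/issues"
    return f"{DATA_PREFIX}/issues/{int(issue_number)}"


def with_remote_data(payload: Dict[str, Any], include: bool = True) -> Dict[str, Any]:
    """Return a copy of `payload` with include_remote_data set."""
    # ASSUMPTION: the option is passed in the request body.
    return {**payload, "include_remote_data": include}
```

For example, with_remote_data({"data_source_id": 19517}) produces a body that asks for the raw GitHub response alongside the mapped fields, without modifying the original dictionary.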

Functionality

Carbon syncs GitHub files from all private and public repositories. When a file extension is unknown, Carbon attempts to parse the file as text.

Synchronization

You can use the resync_file API endpoint to programmatically resync specific GitHub files. To delete GitHub files from Carbon, use the delete_files endpoint directly.

To sync GitHub files on a 24-hour schedule (more frequent schedules are available upon request), use the /update_users endpoint. This endpoint lets organizations customize sync settings to fit their requirements, and passing the string 'ALL' enables syncing for all data sources. Each request supports up to 100 customer IDs.

Here’s an example that enables automatic syncing of updated GitHub content for the specified users:

{
    "customer_ids": ["team@carbon.ai", "sam@openai.com"],
    "auto_sync_enabled_sources": ["GITHUB"]
}
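
When more than 100 customers need auto-sync enabled, the IDs have to be spread over several /update_users requests. The sketch below builds one request body per batch, using the field names from the JSON example above. The function name and the batching approach are illustrative only.

```python
from typing import Any, Dict, List, Sequence

MAX_CUSTOMER_IDS_PER_REQUEST = 100  # limit stated in this guide


def build_update_users_payloads(
    customer_ids: Sequence[str],
    sources: Sequence[str] = ("GITHUB",),
) -> List[Dict[str, Any]]:
    """Split customer IDs into request bodies of at most 100 IDs each."""
    ids = list(customer_ids)
    step = MAX_CUSTOMER_IDS_PER_REQUEST
    return [
        {
            "customer_ids": ids[start:start + step],
            "auto_sync_enabled_sources": list(sources),
        }
        for start in range(0, len(ids), step)
    ]
```

Each payload in the returned list corresponds to one request to /update_users.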