Authorization Type

Carbon uses Access Key based authentication to connect to Azure Blob Storage.

Functionality

Carbon allows users to upload text, image, video. and audio file formats directly from Spaces Object Storage.

Synchronization

Step 1: Retrieve Account Name and Account Key

  1. Log into your Azure Portal and navigate to the “Storage Account” service.
  2. Click the name of the Storage Account you wish to sync.
  3. Under the selected account, navigate to “Security + networking” > “Access keys” in the left-hand menu.
  4. There will be two pairs of “Account Name” and “Account Key” under “Key1” and “Key2”. You can use either pair.

Step 2: Provide Credentials to Carbon

Via API: POST the generated credentials to the /integrations/azure_blob_storage endpoint in this format:

{
    "account_name": "carbontest",
    "account_key": "SOME_KEY"
}

Via Carbon Connect: Optionally you can input the credentials directly via the Carbon Connect component.

Step 3: List Files and Buckets

Optionally load all available Azure Blob Storage buckets and objects by using POST /integrations/items/sync and list them via POST /integrations/items/list.

Step 4: Sync Files or Buckets

Use the /integrations/azure_blob_storage/files endpoint to sync specific file objects or blobs.

For syncing specific files (multiple files allowed):

{
   "ids": [{"bucket": "container-name", "id": "file-name"}],
   "tags": "...",
   "chunk_size": "...",
   "chunk_overlap": "...",
   "skip_embedding_generation": "..."
}

For syncing all files from buckets (multiple buckets allowed):

{
   "ids": [{"bucket": "carbon-test"}],
   "tags": "...",
   "chunk_size": "...",
   "chunk_overlap": "...",
   "skip_embedding_generation": "..."
}

You can also use the resync_file API endpoint to programmatically resync specific Azure Blob Storage files. To delete Azure Blob Storage files from Carbon, you can use the delete_file endpoint directly.

To sync Azure Blob Storage files on a 24-hour schedule (more frequent schedules available upon request), you can use the /update_users endpoint. This endpoint allows organizations to customize syncing settings according to their requirements, with the option to enable syncing for all data sources using the string ‘ALL’. It’s important to note that each request supports up to 100 customer IDs.

Here’s an example illustrating how to automatically enable syncing for updated Azure Blob Storage content for specified users:

{
    "customer_ids": ["team@carbon.ai", "sam@openai.com"],
    "auto_sync_enabled_sources": ["AZURE_BLOB_STORAGE"]
}