Metadata and Filtering
Custom Metadata
A tag is a key-value pair that can be added to a file to provide custom metadata. These pairs can be used for searches or document retrieval to narrow down search results. A file can have any number of tags.
The following keys are reserved and cannot be used:
db_embedding_id
organization_id
user_id
organization_user_file_id
chunk_number
Carbon currently supports two data types for tag values: string and list of strings. Keys can only be strings. If a value is neither a string nor a list of strings, it is automatically converted to a string (e.g. 4 will become “4”).
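The key and value rules above can be checked on the client before tags are sent. The following is a minimal sketch, not part of Carbon's API: normalize_tags is a hypothetical helper, and converting list elements one by one is an assumption, since the text only gives the scalar example.

```python
# Keys listed above as reserved.
RESERVED_KEYS = {
    "db_embedding_id",
    "organization_id",
    "user_id",
    "organization_user_file_id",
    "chunk_number",
}


def normalize_tags(tags):
    """Return a copy of `tags` with values coerced to str or list[str].

    Hypothetical client-side helper; Carbon performs its own conversion.
    """
    normalized = {}
    for key, value in tags.items():
        if not isinstance(key, str):
            raise TypeError(f"tag key must be a string, got {type(key).__name__}")
        if key in RESERVED_KEYS:
            raise ValueError(f"tag key {key!r} is reserved")
        if isinstance(value, list):
            # Assumption: each list element is converted individually.
            normalized[key] = [str(item) for item in value]
        else:
            # Strings pass through unchanged; other types are stringified.
            normalized[key] = str(value)
    return normalized
```

For example, normalize_tags({"year": 4}) returns {"year": "4"}, matching the conversion described above, while a key such as "user_id" raises ValueError.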
Tags can be added directly via the API (through file upload or OAuth URL) or through Carbon Connect.
Filtering
When filtering based on tags, customers can construct complex filters using “AND”, “OR”, and negation logic.
Take the below input as an example:
{
"OR": [
{
"key": "subject",
"value": "holy-bible",
"negate": false
},
{
"key": "person-of-interest",
"value": "jesus christ",
"negate": false
},
{
"key": "genre",
"value": "religion",
"negate": true
},
{
"AND": [
{
"key": "subject",
"value": "tao-te-ching",
"negate": false
},
{
"key": "author",
"value": "lao-tzu",
"negate": false
}
]
}
]
}
In this case, files will be filtered such that:
- “subject” = “holy-bible” OR
- “person-of-interest” = “jesus christ” OR
- “genre” != “religion” OR
- “subject” = “tao-te-ching” AND “author” = “lao-tzu”
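To make the logic above concrete, here is a small local evaluator that applies a filter to one file's tags. It is only an illustration of the semantics, not how Carbon runs the query. The function matches and the constant EXAMPLE_FILTER are hypothetical names, and two behaviours are assumptions: a list-valued tag matches when it contains the requested value, and a file without the tag counts as "not equal".

```python
# The example filter above, written as a Python dict.
EXAMPLE_FILTER = {
    "OR": [
        {"key": "subject", "value": "holy-bible", "negate": False},
        {"key": "person-of-interest", "value": "jesus christ", "negate": False},
        {"key": "genre", "value": "religion", "negate": True},
        {
            "AND": [
                {"key": "subject", "value": "tao-te-ching", "negate": False},
                {"key": "author", "value": "lao-tzu", "negate": False},
            ]
        },
    ]
}


def matches(node, file_tags):
    """Return True if `file_tags` (tag key -> str or list[str]) satisfies `node`."""
    if "OR" in node:
        return any(matches(child, file_tags) for child in node["OR"])
    if "AND" in node:
        return all(matches(child, file_tags) for child in node["AND"])

    # Otherwise this is a tag block with "key", "value" and optional "negate".
    actual = file_tags.get(node["key"])
    wanted = node["value"]
    if isinstance(actual, list) and not isinstance(wanted, list):
        # Assumption: a list-valued tag matches if it contains the value.
        hit = wanted in actual
    else:
        # Assumption: a missing tag (None) is simply "not equal".
        hit = actual == wanted
    # "negate" defaults to False; when True it flips the result.
    return hit != node.get("negate", False)
```

With this sketch, a file tagged {"subject": "tao-te-ching", "author": "lao-tzu", "genre": "religion"} matches through the nested “AND” block, while {"subject": "quran", "genre": "religion"} matches none of the four branches.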
Note that the top level of the query must be either an “OR” or “AND” array. Currently, nesting is limited to 3 levels. For tag blocks (those with “key”, “value”, and “negate” keys), the following typing rules apply:
- “key” is required and must be a string
- “value” is required and can be any or list[any]
- “negate” is optional and must be true or false. If present and true, the filter block is negated in the resulting query. It is false by default.
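The structural and typing rules above can be checked before a filter is sent. The sketch below is a hypothetical client-side validator, not Carbon's own validation. In particular, how the nesting limit is counted is an assumption: here each “AND”/“OR” block counts as one level, with the top-level block as level 1.

```python
MAX_DEPTH = 3  # nesting limit stated above; the counting scheme is an assumption


def validate_filter(node, depth=1):
    """Raise ValueError if `node` breaks the filter rules described above."""
    if not isinstance(node, dict):
        raise ValueError("each filter node must be an object")

    operators = [op for op in ("AND", "OR") if op in node]
    if operators:
        if len(node) != 1:
            raise ValueError("an operator block must hold only 'AND' or 'OR'")
        if depth > MAX_DEPTH:
            raise ValueError(f"nesting is limited to {MAX_DEPTH} levels")
        children = node[operators[0]]
        if not isinstance(children, list):
            raise ValueError(f"'{operators[0]}' must be an array")
        for child in children:
            validate_filter(child, depth + 1)
        return

    # Anything else must be a tag block, which is not allowed at the top level.
    if depth == 1:
        raise ValueError("the top level must be an 'AND' or 'OR' array")
    if not isinstance(node.get("key"), str):
        raise ValueError("'key' is required and must be a string")
    if "value" not in node:
        raise ValueError("'value' is required")
    if "negate" in node and not isinstance(node["negate"], bool):
        raise ValueError("'negate' must be true or false")
```

Calling validate_filter on a well-formed filter returns None, and any rule violation raises ValueError with a message naming the rule.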