Uploading Documents
Upload documents to your new custom data source
Listed below are the required and optional fields needed to upload documents to your new custom data source. A document
in this case is a Mention and its metadata, such as the Mention's text, author or date.
"items" array
All of the following required and optional fields for each document are placed within an array called items, as seen in the example calls below.
Required Fields
Parameter | Definition | Accepted Values |
---|---|---|
contents | Main body text of the mention. Max length is 16k characters | |
date | Date associated with the document. Must be a valid ISO-style date string | yyyy-MM-dd['T'HH:mm:ss] |
Optional Fields
Parameter | Definition |
---|---|
url | Optional if guid is supplied, otherwise required. URL associated with the document (should be unique within data source). Must be a valid URL. |
guid | Optional if url is supplied, otherwise required. User-supplied unique identifier for document (should be unique within data source). Max length of 1k characters. If the guid is re-used, then the original document will be replaced. If date is different, but guid is the same, there is some undefined behavior for real-time matching (documents may end up appearing multiple times in a dashboard). Backfilling should fix this issue. |
author | Max length of 200 characters |
title | Max length of 200 characters |
gender | m, f, male or female |
language | Must be a valid language code. If this value is not set, then we will attempt to identify it automatically based on the contents field. |
geolocation | id: A valid BCR geolocation ID (please contact [email protected] for a list of location IDs). latitude/longitude: In degrees. zipcode: A valid US zipcode (5 digit number as a string). |
parentGuid | Max length of 1000 characters. It can be queried via the engagingWithGuid: operator. |
engagementType | Values can be: comment, reply, retweet. It can be queried via the engagementType: operator. |
pageId | Max length of 1000 characters |
authorProfileId | Max length of 1000 characters |
batch | The identifier for the set of documents being uploaded; it can be specified when uploading new custom documents. If it's not specified, it will be automatically assigned. |
categories | An array of user defined categories to upload with the document. See below for more details of the format. |
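The limits in the two tables above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function name validate_document and the error messages are hypothetical, "16k" and "1k" are read as 16,000 and 1,000 characters (an assumption), and the date check only approximates the yyyy-MM-dd['T'HH:mm:ss] pattern.

```python
from datetime import datetime
from urllib.parse import urlparse

# Limits taken from the field tables; "16k"/"1k" are read as
# 16,000/1,000 characters, which is an assumption.
MAX_CONTENTS = 16_000
MAX_GUID = 1_000
MAX_AUTHOR_TITLE = 200
GENDERS = {"m", "f", "male", "female"}
DATE_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d")


def _parses(value: str) -> bool:
    """True if the value matches one of the accepted date layouts."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def validate_document(doc: dict) -> list:
    """Return a list of problems for one item; an empty list means none found."""
    problems = []

    contents = doc.get("contents")
    if not contents:
        problems.append("contents is required")
    elif len(contents) > MAX_CONTENTS:
        problems.append("contents exceeds 16k characters")

    date = doc.get("date")
    if not date:
        problems.append("date is required")
    elif not _parses(date):
        problems.append("date must match yyyy-MM-dd['T'HH:mm:ss]")

    # url and guid are each optional only when the other one is supplied.
    if not doc.get("url") and not doc.get("guid"):
        problems.append("either url or guid is required")
    if doc.get("url"):
        parts = urlparse(doc["url"])
        if not (parts.scheme and parts.netloc):
            problems.append("url must be a valid URL")
    if len(doc.get("guid") or "") > MAX_GUID:
        problems.append("guid exceeds 1k characters")

    for field in ("author", "title"):
        if len(doc.get(field) or "") > MAX_AUTHOR_TITLE:
            problems.append(f"{field} exceeds 200 characters")

    if "gender" in doc and doc["gender"] not in GENDERS:
        problems.append("gender must be m, f, male or female")

    return problems
```

The server remains the source of truth for validation; a check like this only catches obvious mistakes before a request is sent.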
Example Call
The following call uploads two documents. The DATA_SOURCE_NUMERIC_ID value comes from the id field we noted when creating a custom data source and is entered in the contentSource field when making this request. You may also upload individual documents to a specific batch (group) of documents by adding the batch id:
curl -X POST 'https://api.brandwatch.com/content/upload' \
  -H 'authorization: bearer xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxx' \
  -H 'Content-Type: application/json' \
  -d '
{
  "items": [
    {
      "guid": "3d101fd9b2004a11a76ba1ea637eb9f2",
      "geolocation": {
        "id": "USA.fl"
      },
      "date": "2020-03-25T15:04:00",
      "contents": "testing the data upload API",
      "custom": {
        "myfield": "testmetric"
      }
    },
    {
      "guid": "3d101fd9b2004a11a76ba1ea637eb9f3",
      "geolocation": {
        "id": "USA.pa"
      },
      "date": "2020-03-25T15:05:00",
      "contents": "testing the data upload API..again",
      "custom": {
        "npsCategory": "Promoter"
      }
    }
  ],
  "contentSource": 34354220140,
  "batch": "yourBatchIdHere-12345"
}'
Here is what the response looks like:
{
"uploadCount":2,
"Batch":"yourBatchIdHere-12345"
}
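The same request can be assembled in Python. This is a sketch using only the standard library; the helper name build_upload_request is hypothetical, while the URL, headers, and body fields are the ones shown in the curl example. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoint copied from the curl example.
UPLOAD_URL = "https://api.brandwatch.com/content/upload"


def build_upload_request(token, content_source, items, batch=None):
    """Build (but do not send) a POST request for the upload endpoint."""
    body = {"items": items, "contentSource": content_source}
    if batch is not None:
        # batch is optional; when omitted it is assigned automatically.
        body["batch"] = batch
    return urllib.request.Request(
        UPLOAD_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the JSON response can then be parsed to read the uploadCount value.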
Limits & Usage Reporting
You are limited to uploading 1,000 documents per request. The uploadCount value in the JSON response represents the number of documents you've just uploaded. There is also a limit to how many documents can be uploaded in a 30 day/24 hour period. There are two options to monitor usage, which you can find in the article Usage Reporting.
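To stay within the per-request limit, a larger set of documents can be split into groups of at most 1,000 before uploading. A minimal sketch follows; the helper name chunk_documents is hypothetical.

```python
from typing import Iterator

# Per-request limit stated above.
MAX_DOCS_PER_REQUEST = 1000


def chunk_documents(documents: list, size: int = MAX_DOCS_PER_REQUEST) -> Iterator[list]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]
```

Each yielded list can be used as the items array of one upload request. This does not account for the separate 30 day/24 hour limits.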
Custom Fields
Each document can have a set of custom fields associated with it. These custom fields are arbitrary key-value pairs that you can use to upload your own categorization for the uploaded custom data. They are mostly used for filtering documents (at the query and/or dashboard level), but can also be used in rules and tags. For example, if you upload product review data, you can use a custom field to upload the product rating for each review. You can use alphanumeric characters and _ in the names of your new custom fields.
{
"items": [
{
"guid": "...",
"custom": {
"myField1": "this is some text",
"anotherField1": "with some different text"
}
},
{
"guid": "...",
"custom": {
"myField2": "this is some text again",
"anotherField2": "with some different text again"
}
}
],
"contentSource": DATA_SOURCE_NUMERIC_ID
}
There is a 100 character limit for the names of custom fields, and a 10,000 character limit for the contents.
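The naming and length rules for custom fields can be checked before upload. The sketch below is illustrative only: check_custom_fields is a hypothetical helper, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric characters and _" means ASCII letters, digits, underscore.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")
MAX_NAME_LENGTH = 100
MAX_VALUE_LENGTH = 10_000


def check_custom_fields(custom: dict) -> list:
    """Return a list of problems found in one document's "custom" object."""
    problems = []
    for name, value in custom.items():
        if not FIELD_NAME_PATTERN.fullmatch(name):
            problems.append(f"{name!r}: name may only contain letters, digits and _")
        if len(name) > MAX_NAME_LENGTH:
            problems.append(f"{name!r}: name exceeds 100 characters")
        if len(str(value)) > MAX_VALUE_LENGTH:
            problems.append(f"{name!r}: contents exceed 10,000 characters")
    return problems
```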
Text uploaded in custom fields is tokenized, but not in the same way as regular contents. The text is simply lowercased and split by white space. This means that punctuation characters (and any other special characters) will be retained and can be searched for.
To filter your data using custom fields, you can use the custom_CustomFieldName: operator. Replace CustomFieldName with the name of your custom field and add the value of your custom field after the operator, as seen in the example below:
custom_NPSCategory:Promoter
This would match documents that have a custom field "NPSCategory" with a value of "Promoter", "Promoter Something", etc.
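The tokenization described above (lowercase, then split on white space) explains which values a term matches. The following sketch is a rough local model, not the actual search implementation; both function names are hypothetical, and lowercasing the search term is an assumption.

```python
def tokenize_custom_value(text: str) -> list:
    """Lowercase the text and split on white space, keeping punctuation."""
    return text.lower().split()


def term_matches(term: str, field_value: str) -> bool:
    """True if the (lowercased) term equals one of the field's tokens."""
    return term.lower() in tokenize_custom_value(field_value)
```

Under this model, "Promoter" matches the values "Promoter" and "Promoter Something", but not "Promoters" or "Promoter!", because punctuation stays attached to the token.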
Numeric Custom Fields
If you upload data that looks like a number (e.g. 123 or 10.5) into a custom field, then you can treat it as a numeric custom field. This unlocks a few things:
- Sorting mentions by numeric values
- Sum and average of field in charts
- Filtering by numeric ranges
The first two should be available in the BCR UI itself.
Filtering requires the customNumeric_CustomFieldName special operator (which is different from the custom_CustomFieldName operator). Again, replace CustomFieldName with your field name (case matters). You can then use a "range" syntax to specify the range of data you want to match:
customNumeric_NPS:[* TO 3]
- filter to find docs with NPS less than or equal to 3
customNumeric_NPS:[2 TO *]
- filter to find docs with NPS greater than or equal to 2
customNumeric_NPS:[2 TO 3]
- filter to find docs with NPS between 2 and 3
NB. the numbers are always assumed to be decimals, so depending on your data you might need to allow for that when filtering.
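The range syntax can be generated from optional lower and upper bounds. This is a small sketch with a hypothetical helper name, numeric_range_filter; it only builds the filter string shown in the examples above and does not call any API.

```python
def numeric_range_filter(field: str, low=None, high=None) -> str:
    """Build a customNumeric_ range filter; None means an open bound (*)."""
    if low is None and high is None:
        raise ValueError("at least one bound is required")
    lower = "*" if low is None else str(low)
    upper = "*" if high is None else str(high)
    # The field name is inserted as-is because case matters.
    return f"customNumeric_{field}:[{lower} TO {upper}]"
```

For example, numeric_range_filter("NPS", high=3) produces the first filter in the list above.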
Custom Fields Limit
There is a limit of 10 custom fields per data source. As documents are uploaded, the custom fields used are tracked. If the existing field names used and the new field names in an upload would exceed 10, then you will receive an error (HTTP 400) when trying to upload the documents.
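A client can anticipate this error by comparing the field names already used in the data source with those in a pending upload. The sketch below assumes you keep your own record of existing field names; both helper names are hypothetical.

```python
# Per-data-source limit stated above.
MAX_CUSTOM_FIELDS = 10


def field_names_in(items: list) -> set:
    """Collect every custom field name used across the items."""
    names = set()
    for item in items:
        names.update(item.get("custom", {}).keys())
    return names


def would_exceed_limit(existing: set, items: list) -> bool:
    """True if existing plus newly introduced field names would exceed 10."""
    return len(existing | field_names_in(items)) > MAX_CUSTOM_FIELDS
```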
Preassigned Categories
It can be useful to upload documents with some categories pre-assigned. This can work well in conjunction with numeric custom fields, as it unlocks the ability to segment your own data in an arbitrary manner. For example, you could upload review data with a category for the product type and a numeric custom field for the rating. You would then be able to show custom charts of average review scores broken down by product type.
You will need to have created your categories first, either in BCR itself or by using the API to manage categories.
To upload a document with categories pre-assigned, add a categories array to the document. Each element in the array is an object with two fields:
Field | Definition |
---|---|
id | The ID of the category (can be discovered via the API) |
projectId | The ID of the project the category is part of. |
An example call to upload a document with a category assigned:
curl -X POST 'https://api.brandwatch.com/content/upload' \
  -H 'authorization: bearer xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxx' \
  -H 'Content-Type: application/json' \
  -d '
{
  "items": [
    {
      "guid": "3d101fd9b2004a11a76ba1ea637eb9f2",
      "date": "2020-03-25T15:04:00",
      "contents": "testing the data upload API",
      "categories": [
        {"id": 1234, "projectId": 7890}
      ]
    }
  ],
  "contentSource": 34354220140
}'
The category and project IDs will be validated to ensure they exist and that you have access to them. There are also other validation steps to ensure you are not assigning multiple categories that would conflict and so on.
Categories Limit
There is a limit of 10 categories per uploaded document. The limit applies to each document separately, so each document can have its own set of up to 10 different categories.
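The categories array and its per-document limit can be handled with a small helper. This is a sketch under the assumption that you already know the category and project IDs; the function name with_categories is hypothetical, and server-side validation (existence, access, conflicts) is not reproduced here.

```python
# Per-document limit stated above.
MAX_CATEGORIES_PER_DOCUMENT = 10


def with_categories(doc: dict, assignments: list) -> dict:
    """Return a copy of doc with a categories array.

    `assignments` is a list of (category_id, project_id) pairs.
    """
    if len(assignments) > MAX_CATEGORIES_PER_DOCUMENT:
        raise ValueError("a document can have at most 10 categories")
    categories = [
        {"id": category_id, "projectId": project_id}
        for category_id, project_id in assignments
    ]
    return {**doc, "categories": categories}
```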
Categories Are Project-Scoped
If you view your uploaded data in two different projects, you will not see the categories in both of those projects. You will only see the categories that relate to the specific project your query belongs to.
Interaction With Rules And Manually Assigned Categories
There are some non-obvious things to bear in mind when you upload documents with categories.
- When a document is uploaded with categories, this is only the initial set of categories. Rules and manual category assignment can and will change those categories further.
- If you want to opt out your content source from a rule, you can use NOT pubType:<CONTENT-SOURCE-NAME> in your rule.
- If you re-upload a document a user had previously assigned a category to, then the user's category will be re-applied (potentially replacing the category initially uploaded).
Once you've uploaded all of your documents to your new custom data source, you must create a query to retrieve this data.