10. 🧠 Custom embeddings
By default, Tenyks generates embeddings (vector representations of data) after you upload your dataset.
On Tenyks' web platform, you can navigate through a feature called Embedding Viewer 🖼️, which enables you to identify patterns or clusters (groups of similar data points) in your dataset 🔍.
However, you can also bring your own embeddings. Here's how to do it!
- Define the location of your embeddings (e.g., Amazon S3, Azure, etc.)
custom_embedding_location = {
    "type": "aws_s3",
    "s3_uri": "S3_URI_TO_CUSTOM_EMBEDDINGS_FOLDER",
    "credentials": {
        "aws_access_key_id": "XXXXXXXX",
        "aws_secret_access_key": "XXXXXXXXXX",
        "region_name": "XXXXXXXX",
    },
}
- Upload the embeddings
dataset.upload_custom_embeddings(
    embedding_name="my_embeddings",
    embedding_location=custom_embedding_location,
)
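The location dictionary from the first step contains AWS credentials, which you may prefer not to hard-code in a script. Below is a minimal sketch that builds the same dictionary from environment variables. The helper `build_s3_embedding_location` is hypothetical (it is not part of the Tenyks SDK), and the `s3://` prefix check plus the use of the standard AWS variable names `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION` are assumptions about your setup.

```python
import os


def build_s3_embedding_location(s3_uri, env=None):
    """Build the location dict shown above from environment variables.

    Hypothetical helper, not part of the Tenyks SDK.
    """
    env = os.environ if env is None else env

    # Assumption: credentials live in the standard AWS environment variables.
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    # Assumption: the embeddings folder is addressed with an s3:// URI.
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"Expected an s3:// URI, got: {s3_uri!r}")

    return {
        "type": "aws_s3",
        "s3_uri": s3_uri,
        "credentials": {
            "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
            "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
            "region_name": env["AWS_DEFAULT_REGION"],
        },
    }
```

The returned dictionary has the same keys as `custom_embedding_location`, so it could be passed as `embedding_location` in the upload call.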
Here is an example of the expected JSON format:
{
    "image_embeddings": [
        {
            "file_name": "000005.png",
            "embeddings": [
                0.0941977963438363,
                -0.7633055149197727,
                0.2745402182841452,
                -0.3159639551745177,
                0.8309993806478222,
                0.1842604575241451,
                -0.17650862339911622,
                .....,
                0.9767816827921139,
                -0.9716360654450817,
                -0.4943288128254306,
                -0.917719311642059
            ]
        },
        {
            "file_name": "000002.png",
            "embeddings": [
                -0.3892604112721678,
                -0.18501907737463008,
                -0.3945027835161057,
                -0.8961710284212987,
                -0.7449565108954637,
                ....,
                -0.1312412920235193,
                0.8147100024219016,
                0.9368189951449468,
                -0.23959517885780435,
                0.8640233210706019,
                -0.24393422152741917
            ]
        }
    ]
}
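If your embeddings live in memory (for example, as the output of your own model), a small script can produce a file in this format. The following is a minimal sketch using only the Python standard library. The function names, the output path, and the checks (equal vector lengths, finite values) are assumptions added for illustration; the format above does not state those constraints.

```python
import json
import math


def build_embeddings_payload(vectors):
    """Convert {file_name: sequence of floats} into the JSON structure above.

    Hypothetical helper, not part of the Tenyks SDK.
    """
    if not vectors:
        raise ValueError("No embeddings provided")

    # Assumption: every image uses the same embedding dimension.
    dimensions = {len(vector) for vector in vectors.values()}
    if len(dimensions) != 1:
        raise ValueError(f"Inconsistent embedding lengths: {sorted(dimensions)}")

    entries = []
    for file_name, vector in vectors.items():
        values = [float(x) for x in vector]
        # NaN and infinity are not valid JSON numbers, so reject them early.
        if not all(math.isfinite(x) for x in values):
            raise ValueError(f"Non-finite value in embedding for {file_name}")
        entries.append({"file_name": file_name, "embeddings": values})

    return {"image_embeddings": entries}


def write_embeddings_file(vectors, path):
    """Write the payload to a JSON file (the path is chosen by the caller)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_embeddings_payload(vectors), f)
```

For example, `write_embeddings_file({"000005.png": [0.09, -0.76], "000002.png": [-0.38, -0.18]}, "embeddings.json")` writes a two-image file that you could then place in the folder referenced by your embeddings location.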