
10. 🧠 Custom embeddings

By default, Tenyks generates embeddings (vector representations of data) after you upload your dataset.

On Tenyks' web platform, the Embedding Viewer 🖼️ lets you explore these embeddings and identify patterns or clusters (groups of similar data points) in your dataset 🔍.

However, you can also bring your own embeddings. Here's how to do it!

  1. Define the location of your embeddings (e.g., Amazon S3, Azure, etc.)
custom_embedding_location = {
    "type": "aws_s3",
    "s3_uri": "S3_URI_TO_CUSTOM_EMBEDDINGS_FOLDER",
    "credentials": {
        "aws_access_key_id": "XXXXXXXX",
        "aws_secret_access_key": "XXXXXXXXXX",
        "region_name": "XXXXXXXX",
    },
}
  2. Upload the embeddings
dataset.upload_custom_embeddings(
    embedding_name="my_embeddings",
    embedding_location=custom_embedding_location
)
)
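
To avoid hard-coding credentials in step 1, the location dictionary could be built from environment variables. The following is a minimal sketch, not part of the Tenyks SDK: the helper name `build_s3_embedding_location` is hypothetical, and it assumes the credentials are exported under the standard AWS variable names `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION`. The dictionary keys are the ones shown in step 1.

```python
import os

# Standard AWS environment variable names (assumed to be set in your shell).
REQUIRED_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def build_s3_embedding_location(s3_uri, env=os.environ):
    """Hypothetical helper: build the step 1 dictionary from environment variables."""
    # Fail early with a clear message if any credential is missing or empty.
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))

    # Same structure and keys as custom_embedding_location in step 1.
    return {
        "type": "aws_s3",
        "s3_uri": s3_uri,
        "credentials": {
            "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
            "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
            "region_name": env["AWS_DEFAULT_REGION"],
        },
    }
```

The returned dictionary can then be passed as `embedding_location` to `dataset.upload_custom_embeddings`, exactly as in step 2.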

Here is an example of the expected JSON format (the dotted entries stand for omitted values and are not part of an actual file):

{
    "image_embeddings": [
        {
            "file_name": "000005.png",
            "embeddings": [
                0.0941977963438363,
                -0.7633055149197727,
                0.2745402182841452,
                -0.3159639551745177,
                0.8309993806478222,
                0.1842604575241451,
                -0.17650862339911622,
                .....,
                0.9767816827921139,
                -0.9716360654450817,
                -0.4943288128254306,
                -0.917719311642059
            ]
        },
        {
            "file_name": "000002.png",
            "embeddings": [
                -0.3892604112721678,
                -0.18501907737463008,
                -0.3945027835161057,
                -0.8961710284212987,
                -0.7449565108954637,
                ....,
                -0.1312412920235193,
                0.8147100024219016,
                0.9368189951449468,
                -0.23959517885780435,
                0.8640233210706019,
                -0.24393422152741917
            ]
        }
    ]
}
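
A file in this format could be produced from your own vectors along the following lines. This is a minimal sketch under stated assumptions: the helper names `build_embeddings_payload` and `write_embeddings_file` are hypothetical, the output file name is up to you, and the check that all vectors share the same length is a sanity check assumed here rather than a documented Tenyks requirement. Only the keys `image_embeddings`, `file_name` and `embeddings` come from the example above.

```python
import json
import math
from pathlib import Path


def build_embeddings_payload(vectors):
    """Convert {file_name: vector} into the JSON structure shown above."""
    entries = []
    expected_dim = None
    for file_name, vector in vectors.items():
        # Plain Python floats serialize cleanly (this also accepts NumPy scalars).
        values = [float(v) for v in vector]
        if not values or not all(math.isfinite(v) for v in values):
            raise ValueError(f"{file_name}: vector is empty or has non-finite values")
        # Assumption: every image uses the same embedding dimension.
        if expected_dim is None:
            expected_dim = len(values)
        elif len(values) != expected_dim:
            raise ValueError(
                f"{file_name}: expected {expected_dim} values, got {len(values)}"
            )
        entries.append({"file_name": file_name, "embeddings": values})
    return {"image_embeddings": entries}


def write_embeddings_file(vectors, path):
    """Write the payload to a JSON file and return it for inspection."""
    payload = build_embeddings_payload(vectors)
    Path(path).write_text(json.dumps(payload, indent=4))
    return payload
```

For example, `write_embeddings_file({"000005.png": vec_a, "000002.png": vec_b}, "my_embeddings.json")` would write one entry per image, where `vec_a` and `vec_b` are your own vectors. The resulting file then needs to be placed in the folder referenced by `s3_uri`.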