3. ⬆️ Uploading and processing your first video
This step assumes that you completed the previous two steps:
- Signing up for a Tenyks account
- Configuring your video source
Let's verify that we are in the correct workspace:
workspace_key = "gabriel_data_workspace_976264f4"
tenyks.set_workspace(workspace_key)
""" Output
2024-09-26 17:08:27,748 - Tenyks - INFO - Workspace set to 'gabriel_data_workspace_976264f4'.
INFO:Tenyks:Workspace set to 'gabriel_data_workspace_976264f4'.
"""
We create a dataset:
- In Tenyks, a dataset is composed of images and annotations.
- For videos, we don't require annotations: the system extracts images (i.e., frames) from the video itself during the upload phase.
dataset = tenyks.create_dataset("paris_train_station")
We set our AWS credentials based on the user we described in Section 2.1.1.
aws_access_key_id = "***************"
aws_secret_access_key = "****************************"
region_name = "us-east-2"
my_credentials = AWSCredentials(
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)
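Rather than pasting keys into a notebook, the three values above can be read from the environment. The following is a minimal sketch, not part of the Tenyks SDK: the helper name aws_settings_from_env is hypothetical, and it assumes the standard AWS variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and (optionally) AWS_DEFAULT_REGION have been exported.

```python
import os


def aws_settings_from_env(default_region="us-east-2"):
    """Collect AWS settings from environment variables (hypothetical helper)."""
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": os.environ["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": os.environ["AWS_SECRET_ACCESS_KEY"],
        # Fall back to the region used in this guide if none is exported.
        "region_name": os.environ.get("AWS_DEFAULT_REGION", default_region),
    }
```

The returned dictionary uses the same keys as the keyword arguments shown above, so it could be passed along as AWSCredentials(**aws_settings_from_env()).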
In s3_uri, make sure you only define the address up to the folder that contains the video.
- For instance, for a video at s3://mytenyksbucket/datasets/surveillance.mp4, we only require: s3://mytenyksbucket/datasets/
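If you start from the full address of a video, the folder part can be derived rather than typed by hand. This is a small illustrative sketch using only the Python standard library; the function name s3_folder_uri is hypothetical and not part of the Tenyks SDK.

```python
import posixpath
from urllib.parse import urlparse


def s3_folder_uri(video_uri):
    """Return the S3 URI of the folder that contains video_uri (hypothetical helper)."""
    parsed = urlparse(video_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {video_uri!r}")
    folder = posixpath.dirname(parsed.path.lstrip("/"))
    # Keep the trailing slash so the URI clearly points at a folder.
    if folder:
        return f"s3://{parsed.netloc}/{folder}/"
    return f"s3://{parsed.netloc}/"


print(s3_folder_uri("s3://mytenyksbucket/datasets/surveillance.mp4"))
# s3://mytenyksbucket/datasets/
```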
aws_video_location = AWSLocation(
    type="aws_s3",
    s3_uri="s3://mytenyksbucket/paris_train_station/",
    credentials=my_credentials,
)
Now, let's ingest our video:
- subsampling frequency: How many frames per second (fps) are sampled or processed from the video.
- frames to subsample: The total number of frames kept after the initial sampling.
- prompts (e.g., ['person', 'train']): The list of categories or labels that you want the object detection algorithm to look for within the video frames.
- confidence threshold: The minimum confidence level for considering a detected object valid. A confidence threshold of 0.005 means that if the object detection algorithm predicts an object (e.g., a person) with a confidence score lower than 0.005, that detection is ignored.
Note that the product subsampling frequency * frames to subsample should be equal to or less than the total number of seconds in your video.
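The rule above can be checked before starting an upload. The sketch below simply encodes the rule as stated in this guide; the function name fits_in_video is hypothetical, and the video durations are made-up values for illustration.

```python
def fits_in_video(subsampling_frequency, frames_to_subsample, video_seconds):
    """Check the rule stated above: frequency * frames <= video length in seconds."""
    return subsampling_frequency * frames_to_subsample <= video_seconds


# Values used in this guide: 1 * 100 = 100, so the video should last at least 100 s.
print(fits_in_video(1, 100, video_seconds=120))  # True
print(fits_in_video(1, 100, video_seconds=60))   # False
```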
dataset.upload_videos_from_cloud_and_ingest(
    aws_video_location,   # AWS credentials and s3_uri
    1,                    # subsampling frequency
    100,                  # frames to subsample
    ['person', 'train'],  # prompts for object detection
    0.005,                # object detection confidence threshold
)
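To make the confidence threshold concrete, here is a minimal sketch of the filtering it describes. It is not the SDK's internal implementation: the function name keep_confident, the dictionary layout, and the scores are all hypothetical and chosen only for illustration.

```python
def keep_confident(detections, threshold=0.005):
    """Drop detections whose score is lower than threshold."""
    return [d for d in detections if d["score"] >= threshold]


# Made-up detections, for illustration only.
detections = [
    {"label": "person", "score": 0.62},
    {"label": "train", "score": 0.004},   # below the threshold, ignored
    {"label": "person", "score": 0.005},  # exactly at the threshold, kept
]
print([d["label"] for d in keep_confident(detections)])
# ['person', 'person']
```

A threshold as low as 0.005 keeps almost every candidate detection; raising it trades recall for fewer spurious boxes.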
We can verify the status of our dataset with get_dataset():
- As soon as the status is DONE, we can continue.
dataset = tenyks.get_dataset("paris_train_station")
dataset
""" Output
Dataset(client=<tenyks_sdk.sdk.client.Client object at 0x79a6e7ec4490>, workspace_name='gabriel_data_workspace_976264f4', key='paris_train_station', name='paris_train_station', owner='789f7e4c-2a89-45ef-86f2-abf7633332b1', owner_email='[email protected]', created_at=datetime.datetime(2024, 9, 26, 17, 21, 44), images_location=AWSLocation(type='aws_s3', s3_uri='s3://tenyks-prod-storage/gabriel_data_workspace_976264f4/paris_train_station/images/', credentials=None), metadata_location=AWSLocation(type='aws_s3', s3_uri='s3://tenyks-prod-storage/gabriel_data_workspace_976264f4/paris_train_station/metadata/', credentials=None), categories=[Category(name='suggested_annotation', color='#1F77B4', id=0)], models=[], status='DONE', n_images=100, iou_threshold=0.5)
"""
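Instead of re-running get_dataset() by hand, the check can be wrapped in a small polling loop. This is a sketch under assumptions: wait_until_done is a hypothetical helper, not an SDK function, and it assumes the returned object exposes the status field shown in the output above as a readable attribute.

```python
import time


def wait_until_done(fetch_dataset, poll_seconds=30, timeout_seconds=3600):
    """Call fetch_dataset() repeatedly until the returned object's status is 'DONE'."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        dataset = fetch_dataset()
        if dataset.status == "DONE":
            return dataset
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Dataset still in status {dataset.status!r}")
        time.sleep(poll_seconds)
```

With the client from this guide, it could be used as dataset = wait_until_done(lambda: tenyks.get_dataset("paris_train_station")). Passing the fetch step in as a callable keeps the loop independent of the SDK and easy to try out with a stand-in object.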