Endpoints are the actual deployments of your models on your specified GPU resource.
Charges are tied to the endpoint's INITIALIZING status. Specifically, charges begin after the Initializing GPU phase, during which the endpoint waits to acquire the GPU.
The endpoint then downloads and loads the model onto the GPU, which usually takes less than a minute.
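The billing rule described here can be illustrated with a small calculation. This is only a sketch: `EndpointTimeline`, its fields, and both helper functions are hypothetical names made up for illustration, not part of any real API, and it assumes that charges stop when the endpoint is terminated, which the text above does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EndpointTimeline:
    """Hypothetical timestamps for one endpoint (not a real API object)."""
    requested_at: datetime     # endpoint requested; waiting for a GPU starts
    gpu_acquired_at: datetime  # Initializing GPU phase ends; model download/load starts
    terminated_at: datetime    # endpoint stopped (assumed here to end billing)


def unbilled_wait(t: EndpointTimeline) -> timedelta:
    """Time spent waiting to acquire the GPU, which is not charged."""
    return t.gpu_acquired_at - t.requested_at


def billable_duration(t: EndpointTimeline) -> timedelta:
    """Time charged: everything after the GPU has been acquired,
    including the model download and load."""
    return t.terminated_at - t.gpu_acquired_at


timeline = EndpointTimeline(
    requested_at=datetime(2024, 1, 1, 12, 0, 0),
    gpu_acquired_at=datetime(2024, 1, 1, 12, 3, 0),
    terminated_at=datetime(2024, 1, 1, 13, 3, 0),
)
print("Unbilled GPU wait:", unbilled_wait(timeline))
print("Billable time:", billable_duration(timeline))
```

In this made-up timeline, the three minutes spent waiting for a GPU are excluded, while the model download and load time counts toward the billed hour.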