Create a real-time inference endpoint in Amazon SageMaker with a Friendli Container backend. By using Friendli Container in your SageMaker pipeline, you’ll benefit from the Friendli Engine’s speed and resource efficiency.
AmazonSageMakerFullAccess policy.

--hf-model-name option of the Friendli Container.

- FRIENDLI_CONTAINER_SECRET: Your Friendli Container Secret. Refer to Preparing Container Secret to learn how to get the container secret.
- SAGEMAKER_MODE: This should be set to True.
- SAGEMAKER_NUM_DEVICES: Number of devices to use (the tensor parallelism degree).
- SAGEMAKER_USE_S3: This should be set to True.
- SAGEMAKER_HF_MODEL_NAME: The Hugging Face model name (e.g., mistralai/Mistral-7B-Instruct-v0.2).
- HF_TOKEN: The Hugging Face secret access token.
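
As a minimal sketch, the environment variables listed above could be collected into a single dictionary before creating the SageMaker model. The helper name build_friendli_env and its parameters are hypothetical, chosen for illustration only. The text does not say whether SAGEMAKER_USE_S3 and SAGEMAKER_HF_MODEL_NAME are meant to be set together, so the sketch leaves both optional.

```python
import os


def build_friendli_env(container_secret, num_devices,
                       hf_model_name=None, hf_token=None, use_s3=False):
    """Assemble the container environment from the variables listed above.

    Hypothetical helper, not part of any Friendli or AWS SDK.
    SageMaker expects every environment value to be a string.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")

    env = {
        "FRIENDLI_CONTAINER_SECRET": container_secret,
        "SAGEMAKER_MODE": "True",
        # Tensor parallelism degree, passed as a string.
        "SAGEMAKER_NUM_DEVICES": str(num_devices),
    }
    # Assumption: the variables below are only set when they apply.
    if use_s3:
        env["SAGEMAKER_USE_S3"] = "True"
    if hf_model_name is not None:
        env["SAGEMAKER_HF_MODEL_NAME"] = hf_model_name
    if hf_token is not None:
        env["HF_TOKEN"] = hf_token
    return env


# Placeholder values; read real secrets from the environment, not from source code.
env = build_friendli_env(
    container_secret=os.environ.get("FRIENDLI_CONTAINER_SECRET", "<your-container-secret>"),
    num_devices=1,
    hf_model_name="mistralai/Mistral-7B-Instruct-v0.2",
    hf_token=os.environ.get("HF_TOKEN", "<your-hf-token>"),
)
print(sorted(env))
```

With boto3, a dictionary like this would typically be passed as the Environment field of PrimaryContainer in the SageMaker client's create_model call, alongside your own container image URI and execution role ARN.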