Searching for the Optimal Policy and Running Friendli Container
To serve MoE models efficiently, you need to run a policy search to find the optimal execution policy.
Learn how to run the policy search at Running Policy Search.
Once the search completes successfully, the optimal policy is compiled into a policy file, which can then be used to create serving endpoints.
The engine then serves the endpoint using that optimal policy.
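The search-once, reuse-later workflow described above can be illustrated with a small conceptual sketch. It does not use the Friendli Container CLI or any Friendli API: the candidate list, the cost values, the JSON file format, and the function names `search_policy` and `load_or_search` are all hypothetical, chosen only to show why persisting the search result to a file lets later starts skip the search.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical candidate execution policies with made-up costs.
# A real engine would benchmark a much larger search space.
CANDIDATES = [
    {"name": "policy_a", "cost": 3.2},
    {"name": "policy_b", "cost": 1.7},
    {"name": "policy_c", "cost": 2.4},
]


def search_policy(candidates):
    """Return the candidate with the lowest cost (stand-in for benchmarking)."""
    return min(candidates, key=lambda c: c["cost"])


def load_or_search(policy_path, candidates):
    """Reuse a saved policy file if present; otherwise search and save one.

    Returns (policy, searched), where searched is True if a search ran.
    """
    if policy_path.exists():
        return json.loads(policy_path.read_text()), False
    best = search_policy(candidates)
    policy_path.write_text(json.dumps(best))
    return best, True


def demo():
    """Run the workflow twice against the same (temporary) policy file."""
    with tempfile.TemporaryDirectory() as d:
        path = Path(d) / "policy.json"  # hypothetical file name
        first, searched_first = load_or_search(path, CANDIDATES)
        second, searched_second = load_or_search(path, CANDIDATES)
        return first, searched_first, second, searched_second
```

In this sketch the first call runs the search and writes the file, while the second call only reads it, which mirrors how a compiled policy file can back a serving endpoint without repeating the search.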