Tuesday, 6 March 2018

Amazon SageMaker now supports Auto Scaling

Auto Scaling can now be configured for your endpoints from the Amazon SageMaker console, the AWS SDKs, and the AWS Auto Scaling API, which makes capacity management easier. With Amazon SageMaker, you define the type and number of instances per endpoint to deliver the scale that your inferences require. If the inference volume changes, you can change the type and number of instances that back each endpoint to accommodate the change. Auto Scaling adjusts the inference capacity automatically to maintain predictable performance at low cost.
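As a rough illustration of the SDK route, here is a minimal sketch in Python that assumes boto3's `application-autoscaling` client is used to register an endpoint variant and attach a target-tracking policy. The endpoint name, variant name, capacity limits, target value, and cooldowns are hypothetical placeholders, and the helper functions `build_scaling_requests` and `apply_scaling` are made up for this example. The request dictionaries are built separately so they can be inspected without touching AWS.

```python
def build_scaling_requests(endpoint_name, variant_name, min_instances,
                           max_instances, invocations_per_instance):
    """Build the two request payloads; makes no AWS calls."""
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")

    # Fields shared by both requests: they identify the endpoint variant to scale.
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }

    # Bounds on how many instances may back the variant.
    target = {**common, "MinCapacity": min_instances, "MaxCapacity": max_instances}

    # Target tracking on invocations per instance.
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-{variant_name}-invocations",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": 300,   # seconds; illustrative value
            "ScaleOutCooldown": 60,   # seconds; illustrative value
        },
    }
    return target, policy


def apply_scaling(target, policy):
    """Send both requests to AWS; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)
```

For example, `apply_scaling(*build_scaling_requests("my-endpoint", "AllTraffic", 1, 4, 100))` would, under these assumptions, let the variant scale between one and four instances while aiming for about 100 invocations per instance. Suitable values depend on the model and instance type, so treat these numbers as placeholders.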

