ML Ops

Meenakshisundaram Thandavarayan
3 min read · Jul 7, 2020

AI models are seen as the next generation of software. Organizations are adopting artificial intelligence (AI) solutions at an increasing rate and growing more dependent on algorithms for decisions that affect core business processes.

However, most AI initiatives struggle to move from development to production, more so than traditional software engineering projects. Gartner predicted that by 2021 at least 50% of AI projects would not be fully deployed. This stems from the lack of a framework and architecture to support model building, deployment, and monitoring, which in turn undermines the continuous delivery that AI projects require. Operationalizing machine learning, often called the "last mile", is an imperative step in aligning AI investments with strategic business objectives.

AI projects are seen as experimentation: Artificial intelligence (AI) projects are viewed as a machine learning (ML) model development exercise [build, train, and validate]. Far less time and budget are spent on model operationalization and on integration with core enterprise infrastructure and applications. This leads to a significant gap between pilot and production environments.

Need for ML Ops

Multiple types of Models: Models differ along several dimensions: how they are developed [custom models, pre-trained models, trainable models], their deployment runtime environment [Cloud, Edge], how they interact with other applications [API, Batch AI], and how they are consumed. This results in a wide variety of model operationalization scenarios, so there is a need for a standard way to address this dynamic nature of models.
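As a minimal sketch of why these scenarios differ, the same model can be consumed either offline over a whole dataset or online one record per request. The scoring function, feature names, and wrapper functions below are illustrative assumptions, not a specific platform's API:

```python
from typing import Callable, Iterable, List

def score(record: dict) -> float:
    """Hypothetical scoring function standing in for any trained model
    (custom-built, pre-trained, or fine-tuned)."""
    # toy rule: probability rises with the 'usage' feature
    return min(1.0, record.get("usage", 0) / 100.0)

def batch_predict(records: Iterable[dict], model: Callable[[dict], float]) -> List[float]:
    """Batch scenario: score a whole dataset offline."""
    return [model(r) for r in records]

def api_predict(record: dict, model: Callable[[dict], float]) -> dict:
    """Online/API scenario: score one record per request."""
    return {"prediction": model(record)}

offline = batch_predict([{"usage": 30}, {"usage": 250}], score)
online = api_predict({"usage": 30}, score)
```

The operational needs diverge from here: the batch path cares about throughput and scheduling, while the API path cares about latency and availability, which is why a uniform operationalization approach is needed above both.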

Regulatory & Compliance Requirements: The impact of AI models on core business processes has surfaced risk and regulatory implications, requiring a framework to govern ML-based solutions.

Focus on Model Fairness: Adopting AI requires resolving concerns about the ethics, fairness, security, privacy, accountability, and transparency of AI models. Explaining and interpreting model predictions is no longer optional.

Need for Model Security: Models can be exploited by individuals seeking to adversely affect the operation of AI systems. In addition, because AI systems are not explicitly programmed, it is impossible to guarantee predictable behavior.

The lag between Investment & Return: AI initiatives need adaptable, flexible approaches that allow for highly iterative development. They face technical challenges distinct from those of traditional software systems, such as standardizing end-to-end workflows for repeatability and automation. Teams often struggle to architect a solution that automates end-to-end ML pipelines across data preparation, model building, deployment, and production, due to a lack of process and tooling know-how.
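The end-to-end workflow described above can be sketched as an ordered list of stages run the same way on every execution, which is what makes it repeatable and automatable. The stage names and toy logic here are illustrative assumptions, not any particular pipeline tool:

```python
# Each stage is a plain function taking and returning a shared context dict.
def prepare_data(ctx):
    ctx["data"] = [x for x in ctx["raw"] if x is not None]  # drop missing values
    return ctx

def train_model(ctx):
    data = ctx["data"]
    ctx["model"] = sum(data) / len(data)  # toy "model": the mean of the data
    return ctx

def validate_model(ctx):
    ctx["approved"] = ctx["model"] is not None  # placeholder validation gate
    return ctx

def deploy_model(ctx):
    if ctx["approved"]:
        ctx["endpoint"] = "deployed"  # placeholder for a real deployment step
    return ctx

# The pipeline itself is just the ordered list of stages.
PIPELINE = [prepare_data, train_model, validate_model, deploy_model]

def run(raw):
    ctx = {"raw": raw}
    for stage in PIPELINE:  # same ordered stages every run => repeatable
        ctx = stage(ctx)
    return ctx

result = run([1.0, None, 3.0])
```

Once the stages are standardized like this, automation is a matter of triggering `run` on a schedule or on new data, rather than re-engineering the workflow each time.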

AI is Multi-disciplinary: Model lifecycle management requires strong collaboration among software engineers, data scientists, data engineers, machine learning architects, and business experts.

Multiple AI platforms: Multiple platforms are used to build AI/ML models, and each offers a differing level of maturity in model lifecycle management.

The objective of ML Ops is to blend disparate data engineering, model engineering, deployment, and inferencing architectures to support AI use cases through:

Standardization in developing AI models

Consistency in evaluating AI models

Automation and Orchestration of AI/ML models through different environments

Monitoring, auditing, and reuse of models

Visibility, transparency, and explainability of AI outputs in an operational environment
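To make the monitoring goal above concrete, a minimal sketch is to compare live prediction statistics against a training-time baseline and raise an alert when they drift apart. The mean statistic and the 0.1 threshold are illustrative assumptions; production monitoring would track richer distribution metrics per feature and per model:

```python
from statistics import mean

def drift_alert(baseline_preds, live_preds, threshold=0.1):
    """Return True when the mean of live predictions drifts from the
    baseline mean by more than the threshold."""
    return abs(mean(live_preds) - mean(baseline_preds)) > threshold

baseline = [0.4, 0.5, 0.6]    # predictions observed at validation time
stable   = [0.45, 0.5, 0.55]  # similar distribution -> no alert
shifted  = [0.8, 0.9, 0.85]   # distribution has moved -> alert
```

Feeding such alerts back into the pipeline (retrain, revalidate, redeploy) is what closes the loop between monitoring and the automation goals listed above.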

Model Dev and Ops
