Effortless MLOps: Streamline your ML workflows with Azure Machine Learning

This article is part of a series of posts about the Azure Machine Learning service. I invite you to read the first part of this series, which covers enabling Responsible AI dashboards.
In this second part, we will see how to build the backbone of an MLOps platform using the Azure Machine Learning service and the MLOps v2 accelerator.
Note: some details are omitted to stay within scope; you will find useful links in each section.
ML Engineering
The field of Machine Learning (ML) is complex and requires a diverse set of skills, knowledge, and experience to build successful production-grade solutions. With all the existing specialties (generative AI, NLP, computer vision, forecasting, deep learning, classical ML, etc.), added to the complexity of model development, finding a single profile that covers every subject is simply impossible.
The goal of ML engineering is not to master all of these specialties, but to combine good knowledge of ML with enough skills in software engineering, data engineering, and infrastructure to increase the chances of getting ML projects to production and delivering reliable solutions.
According to Ben Wilson of Databricks, ML projects typically fail because an inexperienced data science team takes on a large-scale, production-grade model to solve a particular problem, or simply fails to understand the desired outcome from the business. The picture below illustrates the main reasons why ML projects fail.
Source: the ML Engineering ebook from Databricks
In the coming sections of this article, we will tackle the infrastructure problem and see how to deploy production-ready ML platforms using the MLOps v2 accelerator on Azure Machine Learning.
Machine Learning operations
MLOps is an extension of DevOps: it applies DevOps principles and practices to machine learning projects. More precisely, it covers data management, deployment environments, model versioning, and model monitoring, in order to enable reproducible experimentation.
We distinguish 5 levels of maturity:
- No MLOps: manual builds and deployments.
- DevOps but no MLOps: automated app code build and test.
- Automated Training: automated training pipelines and model management.
- Automated Model Deployment: automated deployment and ML endpoints catalog.
- Full MLOps: e2e automation.
For more information about MLOps, please refer to the Microsoft documentation.
MLOps v2 accelerator
We can find many articles, tools, and repos dealing with MLOps. But still… setting everything up from scratch is overwhelming: it requires skills in software engineering, infrastructure, and DevOps, without forgetting the data and ML parts, the core of ML projects.
Some popular tools and platforms for MLOps, picture from NimbleBox.ai
Through the Azure cloud, Microsoft delivers the Azure Machine Learning (AML) service, an MLOps platform that offers no-code, low-code, and full-code experiences. The service can be operated from the portal, but we recommend the SDK and the CLI for data scientists/ML engineers and DevOps/MLOps engineers. The platform is equipped with the essential components for ML operations tasks, as shown below:
Azure Machine Learning assets
Recently, Microsoft released an MLOps framework called the MLOps v2 accelerator, which sets up e2e ML projects on the AML service in just a few minutes. Here is the project’s repo, where you can find deployment guidance for Azure DevOps and GitHub Actions.
To start building your MLOps platform, the first thing to do is clone the repo, then go to sparse_checkout.sh. Executing this shell script will generate and bootstrap an MLOps template.
As you can see, the first lines are arguments that we can set to customize our MLOps project:
- Set the infrastructure-as-code tool (Terraform or Bicep)
- Set the ML project type (classical, computer vision, or natural language processing)
- Set the MLOps version (CLI, SDK, or Responsible AI-CLI)
- Set the orchestration tool (Azure DevOps or GitHub Actions)
After that, we select where to save the project, its name, the GitHub organization name, and finally the project template (which we can leave as is).
After setting all the desired arguments, we create an empty GitHub repo and authorize authentication to it using SSH; please refer to the GitHub documentation.
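For illustration, here is a hedged sketch of what the top of sparse_checkout.sh looks like once customized. The variable names are indicative, so check the script itself for the exact ones:

```bash
# Indicative variable names -- the real ones are at the top of sparse_checkout.sh.
infrastructure_version=terraform    # IaC tool: terraform or bicep
project_type=classical              # classical, cv (computer vision) or nlp
mlops_version=aml-cli-v2            # CLI, SDK or Responsible AI-CLI flavor
orchestration=azure-devops          # azure-devops or github-actions
git_folder=~/projects               # where the generated project is saved
project_name=my-mlops-project       # name of the generated project
github_org_name=my-github-org       # your GitHub organization name
```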
After executing sparse_checkout.sh, a template is generated as shown below:
In this example, we’ve generated a classical template based on Terraform for IaC and Azure DevOps for orchestration, using the AML CLI.
Note: once you have the core template, you can customize it and tailor it to fit your ML project’s needs.
The template contains a few modules:
- Data, which contains a sample dataset.
- Data-science, which contains a conda environment configuration and the source code for each step of ML development.
- Infrastructure, which contains the DevOps pipeline configuration for IaC and the Terraform modules that create the related AML services.
- MLOps, which contains the deployment configurations (batch and online) and the DevOps pipelines for training, batch inferencing, and online inferencing.
At this stage, note the most important YAML configuration files, which build:
- The infrastructure pipeline
- The MLOps pipelines (training, batch inference, and online inference)
To proceed on Azure DevOps, import the GitHub repo into your Azure DevOps project. Then go to “Manage repositories”, select “Security”, and set the build service permissions as below:
To use Terraform as the IaC tool, we need its Azure DevOps extension. To create Azure pipelines from the previous YAML files, we must create service principals with the Contributor role; please refer to the Microsoft documentation for this task.
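As a reminder, a service principal with the Contributor role can be created with the Azure CLI. A minimal sketch, where the name and subscription ID are placeholders:

```bash
# Create a service principal scoped to the subscription, with the Contributor
# role; note the appId, password and tenant returned, they are needed for the
# Azure DevOps service connection.
az ad sp create-for-rbac \
  --name "mlops-v2-sp" \
  --role contributor \
  --scopes "/subscriptions/<your-subscription-id>"
```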
Once it’s done, go to Azure DevOps, open Project settings at the bottom of the page, then under Pipelines select Service connections, click New service connection, choose Service principal (manual), and fill in the required information for your service principal.
One last thing to configure is Pipelines: click the three dots next to the New pipeline button, select Manage security, and allow Edit build pipeline.
With all these details in place, we’re ready to create our pipelines.
Infrastructure as code
We start by implementing Terraform IaC pipeline to create our infrastructure.
Go to Pipelines, New pipeline, select Azure Repos Git, select the repo, then Existing Azure Pipelines YAML file, and finally select the IaC configuration file (tf-ado-deploy-infra.yml).
The result below is our IaC pipeline:
Before jumping to create another pipeline, let’s understand the IaC yaml template.
We load the appropriate variables according to the branch: two sets of environment variables are defined, prod on the main branch and dev on all other branches. We set manual triggering and an Ubuntu VM for the hosted agent (where the pipeline will run).
The config-infra-x files are where we define the environment variables for each environment.
We have two stages. First, we create a blob storage in which Terraform’s terraform.tfstate file is saved, to simplify collaboration within the team, among other benefits. Then we deploy our infrastructure using the Terraform workflow: init, validate, plan, and apply.
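In the second stage, the pipeline essentially runs the classic Terraform workflow against that remote state. A minimal sketch, assuming placeholder backend values:

```bash
# Point Terraform at the blob storage backend created in the first stage,
# so terraform.tfstate is shared by the whole team.
terraform init \
  -backend-config="storage_account_name=<state-storage-account>" \
  -backend-config="container_name=<state-container>" \
  -backend-config="key=terraform.tfstate"

terraform validate                   # check that the configuration is valid
terraform plan -out=deploy.tfplan    # preview the changes to be applied
terraform apply deploy.tfplan        # create the AML workspace and related services
```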
For more details about the code used in these steps, click here.
As you can see, it’s pretty straightforward, and you don’t need deep experience in Terraform, because the heavy configuration is already done for us. To check the details of this configuration, go to infrastructure, then modules. Note that Azure Machine Learning relies on four services: Storage Account, Key Vault, Azure Container Registry, and Application Insights.
MLOps pipelines
Once the infrastructure is deployed and our workspace is created, we can proceed to the creation of the ML training, batch inferencing, and online inferencing pipelines.
Training pipeline
This pipeline contains several tasks (a hedged CLI sketch follows the list):
- Install az cli
- Install aml cli
- Connect to workspace
- Register environment
- Create compute
- Register data
- Run pipeline
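Under the hood, these tasks map to AML CLI (v2) commands. Here is a minimal sketch of the core ones, where resource names and template file paths are placeholders (the exact paths live in the generated template):

```bash
# Install the Azure ML CLI v2 extension on the build agent.
az extension add -n ml -y

# Connect to the workspace by setting it as the default for later commands.
az configure --defaults group=<resource-group> workspace=<aml-workspace>

# Register the conda-based training environment defined in the template.
az ml environment create --file <path-to-train-env>.yml

# Create the compute cluster on which the training pipeline will run.
az ml compute create --name cpu-cluster --type amlcompute \
  --min-instances 0 --max-instances 4 --size Standard_DS3_v2

# Register the sample dataset as a data asset.
az ml data create --name training-data --type uri_file --path data/<sample-dataset>.csv

# Submit the training pipeline job.
az ml job create --file <path-to-training-pipeline>.yml
```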
Batch inferencing pipeline
This pipeline contains several tasks (again sketched with CLI commands after the list):
- Install az cli
- Install aml cli
- Connect to workspace
- Create compute
- Create endpoint
- Create deployment
- Test deployment
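As a rough sketch, the equivalent AML CLI commands look like this (endpoint and file names are placeholders):

```bash
# Create the compute cluster used by the batch deployment.
az ml compute create --name batch-cluster --type amlcompute --max-instances 5

# Create the batch endpoint that exposes the registered model.
az ml batch-endpoint create --name <batch-endpoint-name>

# Create a deployment under the endpoint and make it the default one.
az ml batch-deployment create --file <path-to-batch-deployment>.yml \
  --endpoint-name <batch-endpoint-name> --set-default

# Test the deployment by invoking the endpoint on a sample input.
az ml batch-endpoint invoke --name <batch-endpoint-name> \
  --input data/<sample-batch-input>.csv
```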
Online inferencing pipeline
This pipeline contains several tasks (see the CLI sketch after the list):
- Install az cli
- Install aml cli
- Connect to workspace
- Create endpoint
- Create deployment
- Allocate traffic
- Test deployment
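A comparable sketch for the online case, with a single deployment (named blue here) taking all the traffic:

```bash
# Create the managed online endpoint.
az ml online-endpoint create --name <online-endpoint-name>

# Create the "blue" deployment under the endpoint.
az ml online-deployment create --file <path-to-online-deployment>.yml \
  --endpoint-name <online-endpoint-name>

# Allocate 100% of the traffic to the new deployment.
az ml online-endpoint update --name <online-endpoint-name> --traffic "blue=100"

# Test the deployment with a sample scoring request.
az ml online-endpoint invoke --name <online-endpoint-name> \
  --request-file data/<sample-request>.json
```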
We followed the same steps as before, selecting the proper YAML configuration to create each of these pipelines.
As you can see, everything is done for us, and at the end of the journey we will have these four pipelines, each performing a specific task to enable the e2e MLOps cycle.
At this stage, we’ve automated all the workflows from infrastructure to batch and online inference, and we’ve created the essential foundations of our MLOps platform.
Now, what you need to do is wrap your existing ML code into this template and wire things up properly (for instance, make use of existing data on your data lake). In other words, go to the data-science folder and edit its files with your own code!
Here are some useful links to properly wrap your code and start adopting Azure Machine Learning as your MLOps platform:
Note: you can find all the details (how-to guides, concepts, reference, etc.) in the AML documentation.
To get another point of view on what we’ve just achieved, look at the global architecture enabled by the MLOps v2 accelerator. It’s just amazing!
MLOps v2 accelerator by Microsoft (* new components to come)
Conclusion
To conclude, starting an MLOps project has never been easier: you just need to rely on the generated template and customize it to fit your needs.
Would you like to be supported on your MLOps platform projects? Contact us!