<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>EDA Archives - OVHcloud Blog</title>
	<atom:link href="https://blog.ovhcloud.com/tag/eda/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.ovhcloud.com/tag/eda/</link>
	<description>Innovation for Freedom</description>
	<lastBuildDate>Fri, 03 Mar 2023 16:40:25 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.ovhcloud.com/wp-content/uploads/2019/07/cropped-cropped-nouveau-logo-ovh-rebranding-32x32.gif</url>
	<title>EDA Archives - OVHcloud Blog</title>
	<link>https://blog.ovhcloud.com/tag/eda/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Deploy a custom Docker image for Data Science project – Streamlit app for EDA and interactive prediction (Part 2)</title>
		<link>https://blog.ovhcloud.com/deploy-a-custom-docker-image-for-data-science-project-streamlit-app-for-eda-and-interactive-prediction-part-2/</link>
		
		<dc:creator><![CDATA[Eléa Petton]]></dc:creator>
		<pubDate>Tue, 11 Oct 2022 07:38:35 +0000</pubDate>
				<category><![CDATA[OVHcloud Engineering]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Deploy]]></category>
		<category><![CDATA[AI Solutions]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Docker]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OVHcloud]]></category>
		<category><![CDATA[Public Cloud]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[Streamlit]]></category>
		<guid isPermaLink="false">https://blog.ovhcloud.com/?p=23479</guid>

					<description><![CDATA[A guide to deploy a custom Docker image for a Streamlit app with AI Deploy. Welcome to the second article concerning custom Docker image deployment. If you haven&#8217;t read the previous one, you can read it on the following link. It was about Gradio and sketch recognition. When creating code for a Data Science project, [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p><em>A guide to deploying a custom Docker image for a <a href="https://streamlit.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Streamlit</a> app with <strong>AI Deploy</strong>.</em></p>



<figure class="wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex">
<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="817" data-id="23517" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image3-1024x817.jpeg" alt="streamlit app for eda and interactive prediction" class="wp-image-23517" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image3-1024x817.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image3-300x239.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image3-768x613.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image3-1536x1225.jpeg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image3.jpeg 1620w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</figure>



<p><em>Welcome to the second article concerning <strong>custom Docker image deployment</strong>. If you haven&#8217;t read the previous one, you can read it on the following <a href="https://blog.ovhcloud.com/deploy-a-custom-docker-image-for-data-science-project-gradio-sketch-recognition-app-part-1/" data-wpel-link="internal">link</a>. It was about <a href="https://gradio.app/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Gradio</a> and sketch recognition.</em></p>



<p>When creating code for a <strong>Data Science project</strong>, you probably want it to be as portable as possible. In other words, it should run as many times as you like, even on different machines.</p>



<p>Unfortunately, Data Science code that works fine on one machine often fails at runtime on another. A common cause is that different versions of the libraries are installed on the host machine.</p>



<p>To deal with this problem, you can use <a href="https://www.docker.com/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Docker</a>.</p>



<p><strong>The article is organized as follows:</strong></p>



<ul class="wp-block-list">
<li>Objectives</li>



<li>Concepts</li>



<li>Load the trained PyTorch model </li>



<li>Build the Streamlit app with Python</li>



<li>Containerize your app with Docker</li>



<li>Launch the app with AI Deploy</li>
</ul>



<p><em>All the code for this blog post is available in our dedicated <a href="https://github.com/ovh/ai-training-examples/tree/main/apps/streamlit/eda-classification-iris" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">GitHub repository</a>. You can test it with the OVHcloud <strong>AI Deploy</strong> tool; please refer to the <a href="https://docs.ovh.com/gb/en/publiccloud/ai/deploy/tuto-streamlit-eda-iris/" data-wpel-link="exclude">documentation</a> to boot it up.</em></p>



<h2 class="wp-block-heading">Objectives</h2>



<p>In this article, you will learn how to develop a Streamlit app for two Data Science tasks: Exploratory Data&nbsp;Analysis (<strong>EDA</strong>) and prediction based on an ML model.</p>



<p>Once your app is up and running locally, it will be a matter of containerizing it, then deploying the custom Docker image with AI Deploy.</p>



<figure class="wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-2 is-layout-flex wp-block-gallery-is-layout-flex">
<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="466" data-id="23521" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image4-1024x466.jpeg" alt="objective of streamlit app deployment" class="wp-image-23521" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image4-1024x466.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image4-300x137.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image4-768x350.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image4-1536x700.jpeg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image4.jpeg 1620w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</figure>



<h2 class="wp-block-heading">Concepts</h2>



<p>In Artificial Intelligence, you have probably heard of the famous <a href="https://archive.ics.uci.edu/ml/datasets/iris" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Iris dataset</a> use case. <strong>How about learning more about the Iris dataset?</strong></p>



<h3 class="wp-block-heading">Iris dataset</h3>



<p>The <strong>Iris Flower Dataset</strong> is considered the <em>Hello World</em> of Data Science. The <a href="https://en.wikipedia.org/wiki/Iris_flower_data_set" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Iris Flower Dataset</a> contains <strong>four features</strong> (length and width of sepals and petals) for <strong>50 samples</strong> of each of <strong>three species</strong> of Iris (150 samples in total):</p>



<ul class="wp-block-list">
<li>Iris setosa</li>



<li>Iris virginica</li>



<li>Iris versicolor</li>
</ul>



<p>The dataset is available in <code>csv</code> format, and you can also load it directly as a <code>dataframe</code>. It contains five columns, namely: </p>



<ul class="wp-block-list">
<li>Petal length</li>



<li>Petal width</li>



<li>Sepal length</li>



<li>Sepal width</li>



<li>Species type</li>
</ul>



<p>The objective of the models based on this dataset is to classify the three <strong>Iris species</strong>. The measurements of petals and sepals are used to create, for example, a <strong>linear discriminant model</strong> to classify species.</p>
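


<p>To make the five-column layout concrete, here is a minimal sketch that parses a few rows with the Python standard library only. The header names and the three sample rows are illustrative assumptions; a real export of the dataset would contain 150 rows and may use different column names.</p>

```python
import csv
import io
from collections import Counter

# Illustrative rows in the five-column layout described above.
# Header names are assumptions; a real CSV export may name them differently.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""


def read_iris_rows(text):
    """Parse CSV text into (features, species) pairs."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        species = record.pop("species")
        # The four remaining columns are the numeric measurements.
        features = [float(value) for value in record.values()]
        rows.append((features, species))
    return rows


rows = read_iris_rows(SAMPLE_CSV)
species_counts = Counter(species for _, species in rows)
print(species_counts)
```

<p>Each row is split into four numeric inputs and one label, which is the same separation used later in the app between the input dataframe and the output dataframe.</p>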



<figure class="wp-block-image size-large is-resized"><img decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image0-1024x864.jpeg" alt="iris dataset" class="wp-image-23522" width="646" height="545" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image0-1024x864.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image0-300x253.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image0-768x648.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image0-1536x1297.jpeg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image0.jpeg 1592w" sizes="(max-width: 646px) 100vw, 646px" /></figure>



<p>❗ <code><strong>A model to classify Iris species was trained in a previous tutorial, in notebook form, which you can find and test <a href="https://github.com/ovh/ai-training-examples/blob/main/notebooks/getting-started/pytorch/notebook_classification_iris.ipynb" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a>.</strong></code></p>



<p>This model is stored in an OVHcloud&nbsp;<a href="https://docs.ovh.com/gb/en/publiccloud/ai/cli/data-cli/" data-wpel-link="exclude">Object Storage container</a>.</p>



<p>In this article, the first objective is to create an app for Exploratory Data&nbsp;Analysis (<strong>EDA</strong>). Then you will see how to obtain interactive prediction.</p>



<h3 class="wp-block-heading">EDA</h3>



<p><strong>What is EDA in Data Science?</strong></p>



<p><a href="https://en.wikipedia.org/wiki/Exploratory_data_analysis" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Exploratory Data Analysis</a> (<strong>EDA</strong>) is an approach to analyzing data, mainly through visual techniques. It gives you a detailed view of the statistical summary of the data. </p>



<p>In addition, <strong>EDA</strong> helps you spot duplicate values and outliers so that they can be dealt with, and reveals trends or patterns present in the dataset.</p>
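


<p>Before any plotting, the checks mentioned above can be sketched with the standard library alone. The measurements below are hypothetical (they are not taken from the Iris dataset), and the outlier test uses Tukey's interquartile-range rule, which is one common choice among several.</p>

```python
import statistics
from collections import Counter

# Hypothetical sepal-length measurements in cm (not from the real dataset).
values = [5.1, 4.9, 5.0, 5.1, 5.4, 9.9]


def summarize(data):
    """Return a small statistical summary of a list of numbers."""
    return {
        "count": len(data),
        "mean": round(statistics.mean(data), 2),
        "median": statistics.median(data),
        "min": min(data),
        "max": max(data),
    }


def find_duplicates(data):
    """Return the values that appear more than once."""
    return sorted(v for v, n in Counter(data).items() if n > 1)


def find_outliers(data, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in data if v > high or low > v]


print(summarize(values))
print(find_duplicates(values))
print(find_outliers(values))
```

<p>On a real dataframe, Pandas offers equivalent helpers; this sketch only illustrates what a statistical summary, a duplicate check and an outlier check mean.</p>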



<p>For the Iris dataset, the aim is to explore the source data through graphs using the <strong>Streamlit</strong> tool.</p>



<h3 class="wp-block-heading">Streamlit</h3>



<p><strong>What is Streamlit?</strong></p>



<p><a href="https://streamlit.io/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Streamlit</a>&nbsp;allows you to transform data scripts into quickly shareable web applications using only the <strong>Python</strong> language. Moreover, this framework does not require front-end skills.</p>



<p>This is a time saver for data scientists who want to deploy a data app!</p>



<p>To make this app accessible, you need to containerize it using&nbsp;<strong>Docker</strong>.</p>



<h3 class="wp-block-heading">Docker</h3>



<p>The <a href="https://www.docker.com/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Docker</a>&nbsp;platform allows you to build, run and manage isolated applications. The principle is to build an application that contains not only the written code but also all the context needed to run it: libraries and their versions, for example.</p>



<p>When you wrap your application with all its context, you build a Docker image, which can be saved in your local repository or in the Docker Hub.</p>



<p>To get started with Docker, please, check this&nbsp;<a href="https://www.docker.com/get-started" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">documentation</a>.</p>



<p>To build a Docker image, you will define 2 elements:</p>



<ul class="wp-block-list">
<li>the application code (<em>Streamlit app</em>)</li>



<li>the&nbsp;<a href="https://docs.docker.com/engine/reference/builder/" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">Dockerfile</a></li>
</ul>



<p>In the next steps, you will see how to develop the Python code for your app, but also how to write the Dockerfile.</p>



<p>Finally, you will see how to deploy your custom Docker image with the&nbsp;<strong>OVHcloud AI Deploy</strong>&nbsp;tool.</p>



<h3 class="wp-block-heading">AI Deploy</h3>



<p><strong>AI Deploy</strong>&nbsp;enables AI models and managed applications to be started via Docker containers.</p>



<p>To know more about AI Deploy, please refer to this&nbsp;<a href="https://docs.ovh.com/gb/en/publiccloud/ai/deploy/getting-started/" data-wpel-link="exclude">documentation</a>.</p>



<h2 class="wp-block-heading">Load the trained PyTorch model </h2>



<p>❗ <strong><code>To develop an app that uses a machine learning model, you must first load the model in the correct format. For this tutorial, a PyTorch model is used and the Python file utils.py is used to load it</code></strong>.</p>



<p>The first step is to import the&nbsp;<strong>Python libraries</strong>&nbsp;needed to load a PyTorch model in the <code>utils.py</code> file.</p>



<pre class="wp-block-code"><code class="">import torch
import torch.nn as nn
import torch.nn.functional as F</code></pre>



<p>To load your <strong>PyTorch model</strong>, it is first necessary to define its model architecture by using the <code>Model</code> class defined previously in the part &#8220;<em>Step 2 &#8211; Define the neural network model</em>&#8221; of the <a href="https://github.com/ovh/ai-training-examples/blob/main/notebooks/getting-started/pytorch/notebook_classification_iris.ipynb" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">notebook</a>.</p>



<pre class="wp-block-code"><code class="">class Model(nn.Module):
    def __init__(self):

        super().__init__()
        self.layer1 = nn.Linear(in_features=4, out_features=16)
        self.layer2 = nn.Linear(in_features=16, out_features=12)
        self.output = nn.Linear(in_features=12, out_features=3)

    def forward(self, x):

        x = F.relu(self.layer1(x))
        x = F.relu(self.layer2(x))
        x = self.output(x)

        return x</code></pre>



<p>In a second step, you fill in the access path to the model. To save this model in <code>pth</code> format, refer to the part &#8220;<em>Step 6 &#8211; Save the model for future inference</em>&#8221; of the <a href="https://github.com/ovh/ai-training-examples/blob/main/notebooks/getting-started/pytorch/notebook_classification_iris.ipynb" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">notebook</a>.</p>



<pre class="wp-block-code"><code class="">path = "model_iris_classification.pth"</code></pre>



<p>Then, the <code>load_checkpoint</code> function is used to load the model&#8217;s checkpoint.</p>



<pre class="wp-block-code"><code class="">def load_checkpoint(path):

    model = Model()
    print("Model display: ", model)
    model.load_state_dict(torch.load(path))
    model.eval()

    return model</code></pre>



<p>Finally, the <code>load_model</code> function loads the model and uses it to obtain the prediction result: the raw output scores and the index of the predicted class.</p>



<pre class="wp-block-code"><code class="">def load_model(X_tensor):

    model = load_checkpoint(path)
    predict_out = model(X_tensor)
    _, predict_y = torch.max(predict_out, 1)

    return predict_out.squeeze().detach().numpy(), predict_y.item()</code></pre>
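


<p>To see what <code>torch.max(predict_out, 1)</code> returns in <code>load_model</code>, here is a pure-Python sketch of the same idea: for each row of scores, keep the highest value and its position. The helper name <code>argmax_per_row</code> and the scores are hypothetical and only serve the illustration.</p>

```python
def argmax_per_row(scores):
    """For each row, return the max value and its index, like torch.max(t, 1)."""
    best_values, best_indices = [], []
    for row in scores:
        index = max(range(len(row)), key=row.__getitem__)
        best_values.append(row[index])
        best_indices.append(index)
    return best_values, best_indices


# Hypothetical raw scores for one flower: one value per species.
predict_out = [[-3.2, 1.5, 4.1]]
_, predict_y = argmax_per_row(predict_out)
print(predict_y[0])  # 2
```

<p>The index found this way is the class label: with the ordering used in this article, <code>0</code> is setosa, <code>1</code> is versicolor and <code>2</code> is virginica.</p>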



<p>Find the full Python code <a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/utils.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a>.</p>



<p>Have you successfully loaded your model? Good job 🥳 !</p>



<p>Let&#8217;s go for the creation of the Streamlit app!</p>



<h2 class="wp-block-heading">Build the Streamlit app with Python </h2>



<p>❗ <code><strong>All the code below is available in the <em>app.py</em> file. The key functions are explained in this article.<br>However, the "<em>main</em>" part of the <em>app.py</em> file is not described. You can find the complete Python code of the <em>app.py</em> file <a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a>.</strong></code></p>



<p>To begin, import the dependencies for the Streamlit app.</p>



<ul class="wp-block-list">
<li>Numpy</li>



<li>Pandas</li>



<li>Seaborn</li>



<li><code>load_model</code> function from utils.py</li>



<li>Torch</li>



<li>Streamlit</li>



<li>Scikit-Learn</li>



<li>Plotly</li>



<li>PIL</li>
</ul>



<pre class="wp-block-code"><code class="">import numpy as np
import pandas as pd
import seaborn as sns
from utils import load_model
import torch
import streamlit as st
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import plotly.graph_objects as go
import plotly.express as px
from PIL import Image</code></pre>



<p>Then, you must load the source dataset of <strong>Iris flowers</strong> to be able to extract the characteristics and thus visualize the data. Scikit-Learn lets you load this dataset without having to download it!</p>



<p>Next, you can separate the dataset into an <strong>input dataframe</strong> and an <strong>output dataframe</strong>.</p>



<p>Finally, this <code>load_data</code> function is cached so that you don&#8217;t have to load the dataset again.</p>



<pre class="wp-block-code"><code class="">@st.cache
def load_data():
    dataset_iris = load_iris()
    df_inputs = pd.DataFrame(dataset_iris.data, columns=dataset_iris.feature_names)
    df_output = pd.DataFrame(dataset_iris.target, columns=['variety'])

    return df_inputs, df_output</code></pre>
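


<p>The idea behind caching can be illustrated with the standard library. The sketch below uses <code>functools.lru_cache</code> as an analogy only: Streamlit&#8217;s own cache has additional behaviour that is not modelled here, and <code>load_data_cached</code> is a hypothetical stand-in for an expensive loading step.</p>

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def load_data_cached():
    """Hypothetical stand-in for an expensive loading step."""
    global call_count
    call_count += 1
    return ("inputs", "outputs")


first = load_data_cached()
second = load_data_cached()
print(call_count)  # 1: the function body ran only once
```

<p>The second call returns the stored result instead of running the body again, which is the benefit you get from decorating <code>load_data</code>.</p>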



<p>The creation of this Streamlit app is separated into two parts.</p>



<p>Firstly, you can look into the creation of the EDA part. Then you will see how to create an interactive prediction tool using the PyTorch model.</p>



<h3 class="wp-block-heading">EDA on Iris Dataset</h3>



<p>As a first step, you can look at the source dataset by displaying different graphs using the Python <strong>Seaborn</strong> library.</p>



<p><strong>Seaborn Pairplot</strong> shows the pairwise relationships between the variables of a <strong>Pandas</strong> dataframe. </p>



<p><code>sns.pairplot</code> plots every pair of features against each other in a grid.</p>



<pre class="wp-block-code"><code class="">@st.cache(allow_output_mutation=True)
def data_visualization(df_inputs, df_output):

    df = pd.concat([df_inputs, df_output['variety']], axis=1)
    eda = sns.pairplot(data=df, hue="variety", palette=['#0D0888', '#CB4779', '#F0F922'])

    return eda</code></pre>



<p>Later, this function will display the following graph thanks to a call in the &#8220;<code><em>main</em></code>&#8221; of <code><a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">app.py</a></code> file.</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-1024x956.png" alt="iris data visualization / eda with sns.pairplot" class="wp-image-23487" width="756" height="706" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-1024x956.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-300x280.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-768x717.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image.png 1460w" sizes="auto, (max-width: 756px) 100vw, 756px" /></figure>



<p>Here it can be seen that the&nbsp;<code><strong>setosa</strong> 0</code>&nbsp;variety is easily separated from the other two (<code><strong>versicolor</strong>&nbsp;1</code>&nbsp;and&nbsp;<code><strong>virginica</strong>&nbsp;2</code>).</p>



<p>Were you able to display your graph? Well done 🎉 !</p>



<p>So, let&#8217;s go to the <strong>interactive prediction</strong> tool 🔜 !</p>



<h3 class="wp-block-heading">Create an interactive prediction tool</h3>



<p>To create an interactive prediction tool, you will need several elements:</p>



<ul class="wp-block-list">
<li>Firstly, you need <strong>four sliders</strong> to play with the input parameters</li>



<li>Secondly, you have to create a function to display the <strong>Principal Component Analysis</strong> (<strong>PCA</strong>) graph to visualize the point corresponding to the output of the model</li>



<li>Thirdly, you can build a <strong>histogram</strong> representing the result of the prediction</li>



<li>Fourthly, you will have a function to <strong>display the image</strong> of the predicted Iris species</li>
</ul>



<p>Ready to go? Let&#8217;s start creating <strong>sliders</strong>!</p>



<h4 class="wp-block-heading">Create a sidebar with sliders for input data</h4>



<p>In order to facilitate the visual reading of the Streamlit app, sliders are added in a <strong>sidebar</strong>.</p>



<p>In this sidebar, four sliders are added so that users can choose the length and width of petals and sepals.</p>



<p><strong>How to create a slider?</strong> Well, nothing could be easier than with Streamlit!</p>



<p>You need to call the <code>st.sidebar.slider()</code> function to <strong>add a slider to the sidebar</strong>. Then you can specify arguments such as the <strong>minimum</strong> and <strong>maximum</strong> values, and the default value (here, the mean of the feature). Finally, you can specify the <strong>step</strong> of your slider.</p>



<p>❗ <code><strong>Here you can see the example for a single slider. Find the complete code of the other sliders on the GitHub repo <a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a>.</strong></code></p>



<pre class="wp-block-code"><code class="">def create_slider(df_inputs):

    sepal_length = st.sidebar.slider(
        label='Sepal Length',
        min_value=float(df_inputs['sepal length (cm)'].min()),
        max_value=float(df_inputs['sepal length (cm)'].max()),
        value=float(round(df_inputs['sepal length (cm)'].mean(), 1)),
        step=0.1)

    sepal_width = st.sidebar.slider(
        ...
        )

    petal_length = st.sidebar.slider(
        ...
        )

    petal_width = st.sidebar.slider(
        ...
        )

    return sepal_length, sepal_width, petal_length, petal_width</code></pre>
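


<p>The arguments passed to each slider follow the same pattern, so they can be computed by a small helper. This is a sketch under assumptions: <code>slider_params</code> is a hypothetical function that is not part of the original <code>app.py</code>, and the sample values are made up.</p>

```python
def slider_params(values, step=0.1):
    """Compute min, max and default (rounded mean) for a numeric slider."""
    minimum = float(min(values))
    maximum = float(max(values))
    default = float(round(sum(values) / len(values), 1))
    return {
        "min_value": minimum,
        "max_value": maximum,
        "value": default,
        "step": step,
    }


# Hypothetical sepal widths in cm.
params = slider_params([3.5, 3.0, 3.2, 3.1, 3.6])
print(params)
```

<p>The keys match the keyword arguments used in <code>create_slider</code>, so the dictionary could be unpacked into a slider call, for example <code>st.sidebar.slider(label='Sepal Width', **params)</code>.</p>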



<p>Later, this function will be called in the &#8220;<code><em>main</em></code>&#8221; of the <code><a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">app.py</a></code> file. Afterwards, you will see the following interface:</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-1.png" alt="Streamlit sidebar with sliders" class="wp-image-23493" width="625" height="665"/></figure>



<p>Thanks to these <strong>sliders</strong>, you can now obtain the result of the prediction in an interactive way by adjusting <strong>one or more parameters</strong>.</p>



<h4 class="wp-block-heading">Display PCA graph</h4>



<p>Once your sliders are up and running, you can create a function to display the graph of the <strong>Principal Component Analysis</strong> (<strong>PCA</strong>).</p>



<p><strong>PCA</strong> is a technique that transforms <strong>high-dimensional</strong> data into <strong>lower dimensions</strong> while retaining as much information as possible.</p>



<p><strong>What about the Iris dataset?</strong> The aim is to be able to display the point resulting from the model prediction on a<strong> two-dimensional graph</strong>.</p>



<p>The <code>run_pca</code> function below fits the PCA and returns the <strong>two-dimensional</strong> projection of the irises from the source dataset, which is then plotted.</p>



<pre class="wp-block-code"><code class="">@st.cache
def run_pca():

    # df_inputs and df_output are expected to exist at module level (see load_data)
    pca = PCA(2)
    X = df_inputs.iloc[:, :4]
    # fit the PCA and project the source data onto the first two components
    df_pca = pd.DataFrame(pca.fit_transform(X))
    df_pca.columns = ['PC1', 'PC2']
    df_pca = pd.concat([df_pca, df_output['variety']], axis=1)

    return pca, df_pca</code></pre>



<p>Thereafter, the black point corresponding to the result of the prediction is placed on the same graph in the &#8220;<code><em>main</em></code>&#8221; of the Python <a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><code>app.py</code></a> file.</p>
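


<p>Placing a new point on the PCA graph means projecting it with the axes already fitted on the source data. For a scikit-learn <code>PCA</code> without whitening, <code>transform</code> subtracts the fitted mean and takes the dot product with each component. The sketch below reproduces that arithmetic in plain Python; the mean, the two axes and the flower measurements are hypothetical values chosen for readability, not the ones fitted on the Iris data.</p>

```python
def project(point, mean, components):
    """Project one sample onto principal axes: (point - mean) . axis."""
    centered = [p - m for p, m in zip(point, mean)]
    return [sum(c * x for c, x in zip(axis, centered)) for axis in components]


# Hypothetical values for illustration only (not fitted on the Iris data).
mean = [5.8, 3.0, 3.8, 1.2]
components = [
    [0.5, 0.5, 0.5, 0.5],    # first principal axis (unit length)
    [0.5, -0.5, 0.5, -0.5],  # second axis (unit length, orthogonal to the first)
]
new_flower = [6.8, 3.0, 4.8, 1.2]

pc1, pc2 = project(new_flower, mean, components)
print(round(pc1, 3), round(pc2, 3))
```

<p>The two resulting coordinates are what gets drawn as the extra point on the <code>PC1</code>/<code>PC2</code> scatter plot.</p>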



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="700" height="450" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-1-1.png" alt="Principal Component Analysis (PCA) Iris dataset" class="wp-image-23498" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-1-1.png 700w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-1-1-300x193.png 300w" sizes="auto, (max-width: 700px) 100vw, 700px" /></figure>



<p>With this method you can visualize your point in space. However, the numerical result of the prediction is not shown.</p>



<p>Therefore, you can also display the results as a histogram.</p>



<h4 class="wp-block-heading">Return predictions histogram</h4>



<p>At the output of the neural network, the results can be <strong>positive or negative</strong> and the highest value corresponds to the iris species predicted by the model.</p>



<p>To create a histogram, negative values have to be discarded. To do this, each prediction is checked: <strong>positive values</strong> are kept as they are and collected in a list, which is then transformed into a dataframe.</p>



<p>The negative values are all replaced by zero.</p>



<p>To summarize, the <code>extract_positive_value</code> function can be translated into the following mathematical formula: <br><code>f(prediction) = max(0, prediction)</code></p>



<pre class="wp-block-code"><code class="">def extract_positive_value(prediction):

    prediction_positive = []
    for p in prediction:
        if p &lt; 0:
            p = 0
        prediction_positive.append(p)

    return pd.DataFrame({'Species': ['Setosa', 'Versicolor', 'Virginica'], 'Confidence': prediction_positive})</code></pre>
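


<p>The formula <code>f(prediction) = max(0, prediction)</code> can also be checked on its own, without Pandas. This is a minimal sketch with hypothetical scores; <code>clamp_negative_to_zero</code> is an illustrative name that does not exist in <code>app.py</code>.</p>

```python
def clamp_negative_to_zero(prediction):
    """Apply f(p) = max(0, p) to every score."""
    return [max(0.0, float(p)) for p in prediction]


scores = [-3.2, 1.5, 4.1]  # hypothetical raw outputs, one per species
print(clamp_negative_to_zero(scores))  # [0.0, 1.5, 4.1]
```

<p>Note that the clamped values are raw scores, not probabilities: they do not sum to one, so the bar heights should be read as relative confidence only.</p>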



<p>This function is then called to build the histogram in the &#8220;<code><em>main</em></code>&#8221; of the Python <a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer"><code>app.py</code></a> file. The <code>plotly</code> library is used to build this <strong>bar chart</strong> as follows.</p>



<pre class="wp-block-code"><code class="">fig = px.bar(extract_positive_value(prediction), x='Species', y='Confidence', width=400, height=400, color='Species', color_discrete_sequence=['#0D0888', '#CB4779', '#F0F922'])</code></pre>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-2.png" alt="Histogram prediction iris species" class="wp-image-23499" width="388" height="388" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-2.png 400w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-2-300x300.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-2-150x150.png 150w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/newplot-2-70x70.png 70w" sizes="auto, (max-width: 388px) 100vw, 388px" /></figure>



<h4 class="wp-block-heading">Show Iris species image</h4>



<p>The final step is to display the predicted iris image using a <strong>Streamlit button</strong>. Therefore, you can define the <code>display_img</code> function to select the correct image based on the prediction.</p>



<pre class="wp-block-code"><code class="">def display_img(species):

    list_img = ['setosa.png', 'versicolor.png', 'virginica.png']

    return Image.open(list_img[species])</code></pre>



<p>Finally, in the main Python code <code><a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">app.py</a></code>, <code>st.image()</code> displays the image when the user requests it by pressing the &#8220;<code>Show flower image</code>&#8221; button.</p>



<pre class="wp-block-code"><code class="">if st.button('Show flower image'):
    st.image(display_img(species), width=300)
    st.write(df_pred.iloc[species, 0])</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="347" height="327" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-2.png" alt="Streamlit button and image displayed" class="wp-image-23500" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-2.png 347w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-2-300x283.png 300w" sizes="auto, (max-width: 347px) 100vw, 347px" /></figure>



<p><code><strong>❗ Again, you can find the full code <a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/app.py" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">here</a></strong></code>.</p>



<p>Before deploying your Streamlit app, you can test it locally using the following command:</p>



<pre class="wp-block-code"><code class="">streamlit run app.py</code></pre>



<p>Then, you can open your app locally at the address printed in the terminal, which by default is:&nbsp;<strong>http://localhost:8501/</strong></p>



<p>Your app works locally? Congratulations&nbsp;🎉 !</p>



<p>Now it’s time to move on to containerization!</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="883" height="975" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-5.png" alt="overview streamlit app for eda and prediction on iris data" class="wp-image-23508" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-5.png 883w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-5-272x300.png 272w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image-5-768x848.png 768w" sizes="auto, (max-width: 883px) 100vw, 883px" /></figure>



<h2 class="wp-block-heading">Containerize your app with Docker</h2>



<p>First of all, you have to create the file that lists the Python modules to install, along with their versions.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="545" src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image5-1024x545.jpeg" alt="docker image data science" class="wp-image-23518" srcset="https://blog.ovhcloud.com/wp-content/uploads/2022/10/image5-1024x545.jpeg 1024w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image5-300x160.jpeg 300w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image5-768x409.jpeg 768w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image5-1536x818.jpeg 1536w, https://blog.ovhcloud.com/wp-content/uploads/2022/10/image5.jpeg 1591w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">Create the requirements.txt file</h3>



<p>The&nbsp;<code><a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/requirements.txt" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">requirements.txt</a></code>&nbsp;file lists all the modules needed to make our application work.</p>



<pre class="wp-block-code"><code class="">pandas==1.4.4
numpy==1.23.2
torch==1.12.1
streamlit==1.12.2
scikit-learn==1.1.2
plotly==5.10.0
Pillow==9.2.0
seaborn==0.12.0</code></pre>



<p>This file will be useful when writing the&nbsp;<code>Dockerfile</code>.</p>
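


<p>Before building the image, you may want to check that your local environment matches these pinned versions. The sketch below is only an illustration and is not part of the original app: <code>parse_pins</code> and <code>find_mismatches</code> are hypothetical helper names, and it assumes every requirement uses the simple <code>name==version</code> form shown above.</p>



<pre class="wp-block-code"><code class="">from importlib import metadata


def parse_pins(text):
    """Return a dict mapping each package name to its pinned version."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ locally."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed locally
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches</code></pre>



<p>For example, <code>find_mismatches(parse_pins(open('requirements.txt').read()))</code> should return an empty dictionary when every pinned module is installed at the expected version. The comparison is an exact string match, so builds that carry a local version suffix would be reported as mismatches.</p>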



<h3 class="wp-block-heading">Write the Dockerfile</h3>



<p>Your&nbsp;<code><a href="https://github.com/ovh/ai-training-examples/blob/main/apps/streamlit/eda-classification-iris/Dockerfile" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Dockerfile</a></code>&nbsp;should start with the&nbsp;<code>FROM</code>&nbsp;instruction indicating the parent image to use. In our case, we start from a standard Python image.</p>



<p>For this Streamlit app, you can use version&nbsp;<strong>3.8</strong>&nbsp;of Python.</p>



<pre class="wp-block-code"><code class="">FROM python:3.8</code></pre>



<p>Next, you have to set the working directory and add all the&nbsp;files to it.</p>



<p><code><strong>❗&nbsp;Here you must be in the /workspace directory. This is the base directory for launching an OVHcloud AI Deploy app.</strong></code></p>



<pre class="wp-block-code"><code class="">WORKDIR /workspace
ADD . /workspace</code></pre>



<p>Install the Python modules listed in the&nbsp;<code>requirements.txt</code>&nbsp;file using a&nbsp;<code>pip install…</code>&nbsp;command:</p>



<pre class="wp-block-code"><code class="">RUN pip install -r requirements.txt</code></pre>



<p>Then, you can give the correct access rights to the OVHcloud user (<code>42420:42420</code>).</p>



<pre class="wp-block-code"><code class="">RUN chown -R 42420:42420 /workspace
ENV HOME=/workspace</code></pre>



<p>Finally, you have to define your default launching command to start the application.</p>



<pre class="wp-block-code"><code class="">CMD [ "streamlit", "run", "/workspace/app.py", "--server.address=0.0.0.0" ]</code></pre>



<p>Once your&nbsp;<code>Dockerfile</code>&nbsp;is defined, you will be able to build your custom Docker image.</p>



<h3 class="wp-block-heading">Build the Docker image from the Dockerfile</h3>



<p>First, you can launch the following command from the&nbsp;<code>Dockerfile</code>&nbsp;directory to build your application image.</p>



<pre class="wp-block-code"><code class="">docker build . -t streamlit-eda-iris:latest</code></pre>



<p>⚠️&nbsp;<strong><code>The dot . argument indicates that your build context (the location of the Dockerfile and other needed files) is the current directory.</code></strong></p>



<p>⚠️&nbsp;<code><strong>The -t argument allows you to choose the identifier to give to your image. Usually image identifiers are composed of a name and a version tag &lt;name&gt;:&lt;version&gt;. For this example we chose streamlit-eda-iris:latest.</strong></code></p>
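


<p>To make the <code>&lt;name&gt;:&lt;version&gt;</code> convention concrete, here is a small Python sketch that splits an image identifier into its two parts. <code>split_image_ref</code> is a hypothetical helper written for illustration, not a Docker API, and it deliberately ignores digest references. It assumes the <code>latest</code> tag when none is given, which is what Docker does by default.</p>



<pre class="wp-block-code"><code class="">def split_image_ref(ref):
    """Split an image identifier into (name, tag), defaulting the tag to 'latest'."""
    name, separator, tag = ref.rpartition(":")
    # No colon at all, or the colon belongs to a registry port such as
    # localhost:5000/image: in both cases there is no explicit tag.
    if not separator or "/" in tag:
        return ref, "latest"
    return name, tag


print(split_image_ref("streamlit-eda-iris:latest"))</code></pre>



<p>The same function returns <code>latest</code> as the tag for a bare name such as <code>streamlit-eda-iris</code>.</p>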



<h3 class="wp-block-heading">Test it locally</h3>



<p>Now, you can run the following&nbsp;<strong>Docker command</strong>&nbsp;to launch your application locally on your computer.</p>



<pre class="wp-block-code"><code class="">docker run --rm -it -p 8501:8501 --user=42420:42420 streamlit-eda-iris:latest</code></pre>



<p>⚠️&nbsp;<code><strong>The -p 8501:8501 argument maps port 8501 of your local machine to port 8501 of the Docker container.</strong></code></p>



<p>⚠️<code><strong>&nbsp;Don't forget the --user=42420:42420 argument if you want to simulate the exact same behaviour that will occur on AI Deploy. It executes the Docker container as the specific OVHcloud user (user 42420:42420).</strong></code></p>



<p>Once started, your application should be available on&nbsp;<strong>http://localhost:8501</strong>.<br><br>Your Docker image seems to work? Good job&nbsp;👍 !<br><br>It’s time to push it and deploy it!</p>
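


<p>If the page does not load, a quick first check is whether anything is listening on the port published by the <code>docker run</code> command. The following is a minimal sketch using only the Python standard library. <code>is_port_open</code> is a hypothetical helper name, and it only tests TCP reachability, not that Streamlit itself is healthy.</p>



<pre class="wp-block-code"><code class="">import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


print(is_port_open("localhost", 8501))</code></pre>



<p>A <code>False</code> result usually means the container is not running or the <code>-p</code> mapping is missing.</p>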



<h3 class="wp-block-heading">Push the image into the shared registry</h3>



<p>❗&nbsp;The shared registry of AI Deploy should only be used for testing purposes. Please consider attaching your own Docker registry. More information about this can be found&nbsp;<a href="https://docs.ovh.com/asia/en/publiccloud/ai/training/add-private-registry/" data-wpel-link="exclude">here</a>.</p>



<p>First, you have to find the address of your&nbsp;<code>shared registry</code>&nbsp;by launching this command.</p>



<pre class="wp-block-code"><code class="">ovhai registry list</code></pre>



<p>Next, log in on the shared registry with your usual&nbsp;<code>OpenStack</code>&nbsp;credentials.</p>



<pre class="wp-block-code"><code class="">docker login -u &lt;user&gt; -p &lt;password&gt; &lt;shared-registry-address&gt;</code></pre>



<p>To finish, you need to push the created image into the shared registry.</p>



<pre class="wp-block-code"><code class="">docker tag streamlit-eda-iris:latest &lt;shared-registry-address&gt;/streamlit-eda-iris:latest
docker push &lt;shared-registry-address&gt;/streamlit-eda-iris:latest</code></pre>
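


<p>If you push images often, you might want to script the two commands above. The sketch below only builds the argument lists and does not run anything. <code>build_push_commands</code> is a hypothetical helper, and the registry address remains a placeholder that you must replace with the value returned by <code>ovhai registry list</code>.</p>



<pre class="wp-block-code"><code class="">def build_push_commands(registry, image="streamlit-eda-iris:latest"):
    """Return the docker tag and docker push commands as argument lists."""
    # Avoid a double slash if the registry address ends with one.
    target = f"{registry.rstrip('/')}/{image}"
    return [
        ["docker", "tag", image, target],
        ["docker", "push", target],
    ]</code></pre>



<p>Each returned list can then be passed to <code>subprocess.run(command, check=True)</code> once you are logged in to the registry.</p>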



<p>Once you have pushed your custom Docker image into the shared registry, you are ready to launch your app 🚀 !</p>



<h2 class="wp-block-heading">Launch the AI Deploy app</h2>



<p>The following command starts a new AI Deploy app running your Streamlit application.</p>



<pre class="wp-block-code"><code class="">ovhai app run \
      --default-http-port 8501 \
      --cpu 12 \
      &lt;shared-registry-address&gt;/streamlit-eda-iris:latest</code></pre>



<h3 class="wp-block-heading">Choose the compute resources</h3>



<p>First, you can choose the number of CPUs or GPUs for your app.</p>



<p><code><strong>--cpu 12</strong></code>&nbsp;indicates that we request 12 CPUs for that app.</p>



<p>If you want, you can also launch this app with one or more&nbsp;<strong>GPUs</strong>.</p>



<h3 class="wp-block-heading">Make the app public</h3>



<p>Finally, if you want your app to be reachable without any authentication, add the&nbsp;<code><strong>--unsecure-http</strong></code>&nbsp;attribute to the launch command.</p>
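


<p>The launch options described in this section can be gathered in a small Python sketch. It only assembles the command from the flags shown in this article (<code>--default-http-port</code>, <code>--cpu</code> and <code>--unsecure-http</code>). <code>build_app_run_command</code> is a hypothetical helper, not part of the <code>ovhai</code> CLI, and its default values simply mirror the example above.</p>



<pre class="wp-block-code"><code class="">def build_app_run_command(image, port=8501, cpu=12, public=False):
    """Return the ovhai app run command as an argument list."""
    command = [
        "ovhai", "app", "run",
        "--default-http-port", str(port),
        "--cpu", str(cpu),
    ]
    if public:
        # Only add this flag when the app must be reachable without authentication.
        command.append("--unsecure-http")
    command.append(image)  # the image reference comes last, as in the example
    return command</code></pre>



<p>Keeping <code>public=False</code> as the default means that an app is only exposed without authentication when you ask for it explicitly.</p>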



<figure class="wp-block-video"><video height="998" style="aspect-ratio: 1917 / 998;" width="1917" controls src="https://blog.ovhcloud.com/wp-content/uploads/2022/10/Enregistrement-de-lécran-2022-10-05-à-11.52.19-1.mov"></video></figure>



<h2 class="wp-block-heading">Conclusion</h2>



<p>Well done 🎉&nbsp;! You have learned how to build your&nbsp;<strong>own Docker image</strong>&nbsp;for a dedicated&nbsp;<strong>EDA and interactive prediction app</strong>!</p>



<p>You have also been able to deploy this app thanks to&nbsp;<strong>OVHcloud’s AI Deploy</strong>&nbsp;tool.</p>



<p><em>In a third article, you will see how it is possible to deploy a Data Science project with an API for&nbsp;Spam classification.</em></p>



<h3 class="wp-block-heading" id="want-to-find-out-more">Want to find out more?</h3>



<h5 class="wp-block-heading"><strong>Notebook</strong></h5>



<p>You want to access the notebook? Refer to the&nbsp;<a href="https://github.com/ovh/ai-training-examples/blob/main/notebooks/getting-started/pytorch/notebook_classification_iris.ipynb" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">GitHub repository</a>.</p>



<h5 class="wp-block-heading"><strong>App</strong></h5>



<p>Want to access the full code to create the Streamlit app? Refer to the&nbsp;<a href="https://github.com/ovh/ai-training-examples/tree/main/apps/streamlit/eda-classification-iris" target="_blank" rel="noreferrer noopener nofollow external" data-wpel-link="external">GitHub repository</a>.<br><br>To launch and test this app with&nbsp;<strong>AI Deploy</strong>, please refer to&nbsp;our&nbsp;<a href="https://docs.ovh.com/gb/en/publiccloud/ai/deploy/tuto-streamlit-eda-iris/" data-wpel-link="exclude">documentation</a>.</p>



<h2 class="wp-block-heading">References</h2>



<ul class="wp-block-list">
<li><a href="https://towardsdatascience.com/how-to-run-a-data-science-project-in-a-docker-container-2ab1a3baa889" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">How to Run a Data Science Project in a Docker Container</a></li>



<li><a href="https://medium.com/geekculture/create-a-machine-learning-web-app-with-streamlit-f28c75f9f40f" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">Create a Machine Learning Web App with Streamlit</a></li>
</ul>
]]></content:encoded>
					
		
		<enclosure url="https://blog.ovhcloud.com/wp-content/uploads/2022/10/Enregistrement-de-lécran-2022-10-05-à-11.52.19-1.mov" length="6370587" type="video/quicktime" />

			</item>
	</channel>
</rss>
