Deep Learning: COVID-19 Diagnosis from X-ray Images

Eva Andres
10 min read · Aug 31, 2020

Hi!

Deep Learning has a huge range of applications including health and diagnostics.

I would like to explain how we can deploy a complete cloud solution that uses CNNs to classify chest X-ray scans into COVID-19 pneumonia, non-COVID pneumonia and normal (no pneumonia).

For this solution, we’ll use transfer learning (ResNet50) and Convolutional Networks.

Which model do you think will have the best metrics?

Choosing a good dataset

Let’s start by choosing a good dataset. On Kaggle there are many datasets on different subjects, and I found the COVID-19 Radiography Database, which contains chest X-rays labelled as COVID-19, pneumonia and normal.

I downloaded it to my notebook environment. To create a new notebook, click the “Jupyter” link on your Azure Compute Instance.

Preparing the Data

The first thing you need to do is connect to your workspace and upload the images into the default datastore, so they are ready for new experiments and you can register different versions.

import azureml.core
from azureml.core import Workspace

# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))

# Upload the images to the workspace's default datastore
datastore_covid = ws.get_default_datastore()
datastore_covid.upload('./covid-img/COVID-19_Radiography_Database',
                       target_path='covid-img/',
                       overwrite=True,
                       show_progress=True)
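The upload alone doesn’t register anything; if you also want the versioned data mentioned above, a minimal sketch (the dataset name is illustrative) is to register the uploaded folder as a FileDataset:

from azureml.core import Dataset

# Sketch: register the uploaded folder as a versioned FileDataset (name is illustrative)
covid_ds = Dataset.File.from_files(path=(datastore_covid, 'covid-img'))
covid_ds = covid_ds.register(workspace=ws,
                             name='covid-xray-images',
                             description='COVID-19 radiography chest X-ray images',
                             create_new_version=True)
print(covid_ds.name, 'version', covid_ds.version)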

The following code gets a reference to the covid-img folder where you uploaded the image files. It is a pointer into the datastore, and it can be used to download the contents of the folder to the compute context where the data reference is used.

data_ref = datastore_covid.path('covid-img').as_download(path_on_compute='covid-img')
print(data_ref)
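Creating the reference doesn’t move any data by itself; the download happens when the reference is handed to a run. If you later train on a remote cluster instead of the compute instance, the reference would typically be passed to the estimator through script_params, roughly like this (a sketch; the --data-folder argument is illustrative and the training script would have to read it):

# Sketch only: hand the data reference to the run so the folder is downloaded
# onto the compute target before the training script starts
script_params = {
    '--data-folder': data_ref   # resolves to the 'covid-img' folder on the compute target
}
# ...and later pass script_params=script_params when building the TensorFlow estimator,
# reading the folder path from an '--data-folder' argument inside covid_training.py.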

Let’s run the following cell to see how many normal, pneumonia and COVID-19 chest X-rays we have.

import tensorflow as tf

filenames = tf.io.gfile.glob('./covid-img/train/*/*')
print("TRAINING")
COUNT_NORMAL = len([filename for filename in filenames if "NORMAL" in filename])
print("Normal images count in training set: " + str(COUNT_NORMAL))
COUNT_PNEUMONIA = len([filename for filename in filenames if "PNEUMONIA" in filename])
print("Pneumonia images count in training set: " + str(COUNT_PNEUMONIA))
COUNT_COVID19 = len([filename for filename in filenames if "COVID19" in filename])
print("Covid19 images count in training set: " + str(COUNT_COVID19))
print("TESTING")
filenames = tf.io.gfile.glob('./covid-img/test/*/*')
COUNT_NORMAL = len([filename for filename in filenames if "NORMAL" in filename])
print("Normal images count in test set: " + str(COUNT_NORMAL))
COUNT_PNEUMONIA = len([filename for filename in filenames if "PNEUMONIA" in filename])
print("Pneumonia images count in test set: " + str(COUNT_PNEUMONIA))
COUNT_COVID19 = len([filename for filename in filenames if "COVID19" in filename])
print("Covid19 images count in test set: " + str(COUNT_COVID19))

In the original dataset there is some imbalance between the pneumonia and normal images. This can be addressed by setting aside a separate set for final validation, or by passing class_weight to fit():

# Weight each class inversely to its frequency so all three classes contribute equally
count_total = COUNT_COVID19 + COUNT_NORMAL + COUNT_PNEUMONIA
weight_for_0 = (1 / COUNT_COVID19) * (count_total) / 3.0
weight_for_1 = (1 / COUNT_NORMAL) * (count_total) / 3.0
weight_for_2 = (1 / COUNT_PNEUMONIA) * (count_total) / 3.0
class_weight = {0: weight_for_0, 1: weight_for_1, 2: weight_for_2}

history = model.fit(
    training_dataset,
    steps_per_epoch = training_dataset.samples // batch_size,
    validation_data = validation_dataset,
    validation_steps = validation_dataset.samples // batch_size,
    epochs = hparam_epochs,
    class_weight = class_weight
)

In this example we solve it by moving several images into the validation dataset.

Let’s see several images:

import os
import matplotlib.pyplot as plt

# Read the dataset folders
TrainImage = "./covid-img/train"
TestImage = "./covid-img/test"

# Get all image names in the train folder
Pneumonaimages = os.listdir(TrainImage + "/PNEUMONIA")
Normalimages = os.listdir(TrainImage + "/NORMAL")
COVID19images = os.listdir(TrainImage + "/COVID19")

# Plot PNEUMONIA
plt.figure(figsize=(9,9))
for i in range(3):
    plt.subplot(3, 3, i + 1)
    plt.imshow(
        plt.imread(os.path.join(TrainImage + "/PNEUMONIA", Pneumonaimages[i])),
        cmap='gray')
    plt.title("PNEUMONIA")
plt.show()

# Plot NORMAL
plt.figure(figsize=(9,9))
for i in range(3):
    plt.subplot(3, 3, i + 1)
    plt.imshow(
        plt.imread(os.path.join(TrainImage + "/NORMAL", Normalimages[i])),
        cmap='gray')
    plt.title("NORMAL")
plt.show()

# Plot COVID19
plt.figure(figsize=(9,9))
for i in range(3):
    plt.subplot(3, 3, i + 1)
    plt.imshow(
        plt.imread(os.path.join(TrainImage + "/COVID19", COVID19images[i])),
        cmap='gray')
    plt.title("COVID19")
plt.show()

Come on! We are ready to prepare the training, testing and validation sets:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# These values must match the ones used later in the training script
pretrained_size = (440, 440)
batch_size = 16
shuffle = True  # assumption: shuffle the training and testing batches

datagen = ImageDataGenerator(
    samplewise_center=True,
    samplewise_std_normalization=True,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    fill_mode='nearest'
)

print("Preparing training dataset...")
training_dataset = datagen.flow_from_directory(
    './covid-img/train',
    target_size=pretrained_size,  # resize to match model expected input
    batch_size=batch_size,
    shuffle=shuffle)

print("Preparing testing dataset...")
testing_dataset = datagen.flow_from_directory(
    './covid-img/test',
    target_size=pretrained_size,  # resize to match model expected input
    batch_size=batch_size,
    shuffle=shuffle)

print("Preparing validation dataset...")
validation_dataset = datagen.flow_from_directory(
    './covid-img/validation',
    target_size=pretrained_size,  # resize to match model expected input
    batch_size=batch_size)

classnames = list(training_dataset.class_indices.keys())
print("class names", classnames)

Prepare a Compute Target

One of the benefits of cloud compute is that it scales on-demand, enabling you to provision enough compute resources to process multiple runs of an experiment in parallel, each with different hyperparameter values.

You’ll create an Azure Machine Learning compute cluster in your workspace.

In my case, the evaluation subscription I’m using only allows one instance in the cluster, so I’m going to use the compute instance instead.

from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

target_name = "ci-eva3"
try:
    # Check for an existing compute target
    compute_target = ComputeTarget(workspace=ws, name=target_name)
    print('Found existing compute target, using it.')
except ComputeTargetException:
    # If it doesn't already exist, create it
    try:
        compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS11_V2', max_nodes=2)
        compute_target = ComputeTarget.create(ws, target_name, compute_config)
        compute_target.wait_for_completion(show_output=True)
    except Exception as ex:
        print(ex)

Preparing the environment

Now we are going to create a folder for the training script:

import os, shutil

# Create a folder for the experiment files
training_folder = 'covid-training'
os.makedirs(training_folder, exist_ok=True)
print("training folder created")

Preparing the Training Scripts

This Python script will be written into the training_folder.

The following code builds the topology of either ResNet50 or a regular CNN, trains the model and evaluates it, based on the rest of the defined hyperparameters.

In this case we choose learning rate, number of epochs and model type. You could also add the number of layers or neurons, etc.

%%writefile $training_folder/covid_training.py
<...imports>

# Get parameters
parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, dest='hp_epochs', default=10, help='epoch hyperparam')
parser.add_argument('--model', type=str, dest='hp_model', default='Conv', help='model hyperparam')
parser.add_argument('--lr', type=float, dest='hp_lr', default=0.0001, help='learning rate hyperparam')
args = parser.parse_args()
hparam_epochs = args.hp_epochs
hparam_lr = args.hp_lr
hparam_model = args.hp_model

run = Run.get_context()

pretrained_size = (440,440)
batch_size = 16

print("preparing data")
# Preparing the data
<previous cell with data processing>

# Preparing the model
if hparam_model == "ResNet50":
    <ResNet50 code below>
else:
    <designed CNN code below>

metricas = [
    'accuracy',
    tf.keras.metrics.Precision(name='precision'),
    tf.keras.metrics.Recall(name='recall')
]
hp_optimizer = tf.keras.optimizers.Adam(learning_rate=hparam_lr)

# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=hp_optimizer,
              metrics=metricas)
# Now print the full model
model.summary()
# Train the model over hparam_epochs
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    "xray_model.h5",
    save_best_only=True)
early_stopping_cb = tf.keras.callbacks.EarlyStopping(
    patience=10,
    restore_best_weights=True)
history = model.fit(
    training_dataset,
    steps_per_epoch = training_dataset.samples // batch_size,
    validation_data = testing_dataset,
    validation_steps = testing_dataset.samples // batch_size,
    epochs = hparam_epochs,
    callbacks=[checkpoint_cb, early_stopping_cb])
# Plot accuracy and loss
fig, ax = plt.subplots(1, 4, figsize=(20, 3))
ax = ax.ravel()
for i, met in enumerate(['precision', 'recall', 'accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])
run.log_image("metrics", plot=plt)

# Use the model to predict the class
Y_pred = model.predict(testing_dataset)
# The model returns a probability value for each class
# The one with the highest probability is the predicted class
y_pred = np.argmax(Y_pred, axis=1)
loss, acc, prec, rec = model.evaluate(testing_dataset)
# Log the primary metric so Hyperdrive can compare runs
run.log('Accuracy', float(acc))

# Plot the confusion matrix
cm = confusion_matrix(testing_dataset.classes, y_pred)
fig, ax = plt.subplots(figsize=(10,10))
heatmap = sns.heatmap(cm, annot=True, fmt='d', cmap="Greens")
plt.title('Y Actual, X predicted')
plt.show()
print("confusion matrix plot generated")
run.log_image("CM", plot=plt)
print("written to the run log")
run.complete()

The ResNet50 code:

base_model = ResNet50(weights=None, include_top=False, input_shape=training_dataset.image_shape)
# Note: weights=None trains from scratch; use weights='imagenet' to start from pretrained ImageNet weights

# Freeze the already-trained layers in the base model
for layer in base_model.layers:
    layer.trainable = False

# Create layers for classification of our images
x = base_model.output
x = Flatten()(x)
# Fully connected layers
x = tf.keras.layers.Dense(128, activation='relu', bias_regularizer=regularizers.l2(1e-4))(x)
# Add dropout to avoid overfitting
x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(64, activation='relu', bias_regularizer=regularizers.l2(1e-4))(x)
x = Dense(len(classnames), activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=x)

The CNN code:

model = Sequential()
# Step 1: convolution
model.add(Conv2D(filters=16, kernel_size=(3,3),   # 16 filters, 3x3 kernel
                 input_shape=(180,180,3),         # 180x180 images with 3 channels
                 activation="relu"))              # introduce non-linearity
# Step 2: max pooling
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=32, kernel_size=(3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(Conv2D(filters=64, kernel_size=(3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(tf.keras.layers.Dropout(0.2))
# Step 3: flattening
model.add(Flatten())
# Step 4: fully connected layers
model.add(Dense(units=128, activation="relu", bias_regularizer=regularizers.l2(1e-4)))
model.add(Dense(units=3, activation="softmax"))

Notice we’re using softmax because we are classifying into 3 classes, and the usual recommendation for CNNs is to use ReLU as the activation in the hidden layers and Adam as the optimizer.
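As a tiny illustration, softmax turns the network’s outputs into a probability distribution over the three classes, and the predicted class is simply the argmax (the numbers below are made up):

import numpy as np

# Hypothetical softmax output for one X-ray, ordered as ['COVID19', 'NORMAL', 'PNEUMONIA']
probs = np.array([0.08, 0.12, 0.80])
print(probs.sum())       # softmax probabilities sum to 1.0
print(np.argmax(probs))  # 2 -> 'PNEUMONIA'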

Run a Hyperdrive Experiment

Given that we’re working with CNNs, the most suitable estimator is the TensorFlow estimator. To reduce execution time, we try only 12 and 15 epochs.

from azureml.core import Experiment
from azureml.train.dnn import TensorFlow
from azureml.train.hyperdrive import GridParameterSampling, HyperDriveConfig, PrimaryMetricGoal, choice
from azureml.train.hyperdrive import MedianStoppingPolicy
from azureml.widgets import RunDetails

# Define the hyperparameter search space
params = GridParameterSampling(
    {
        '--epochs': choice(12, 15),
        '--lr': choice(0.000001, 0.00001),
        '--model': choice('Conv', 'ResNet50')
    }
)

hyper_estimator = TensorFlow(source_directory=training_folder,
                             entry_script='covid_training.py',
                             compute_target=compute_target,
                             conda_packages=['keras', 'matplotlib', 'scikit-learn'],
                             framework_version='2.0')

early_termination_policy = MedianStoppingPolicy()

# Configure the Hyperdrive settings
hyperdrive = HyperDriveConfig(estimator=hyper_estimator,
                              hyperparameter_sampling=params,
                              primary_metric_name='Accuracy',
                              primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                              policy=early_termination_policy,
                              max_total_runs=3,
                              max_concurrent_runs=2)

# Run the experiment
experiment = Experiment(workspace=ws, name='covid_hyperdrive')
run = experiment.submit(config=hyperdrive)

# Show the widget with the status of the experiment run
RunDetails(run).show()
run.wait_for_completion()

After executing the previous cell, you can see its status and a link to Azure ML Studio for more details.

And this is the information in Azure ML Studio for the execution.

2 parallel executions, 3 total executions.

The best model is CNN:

And the Confusion Matrix:

Determine the Best Performing Run

When all of the runs have finished, you can see the metrics of all executions:

for child_run in run.get_children():
    print(child_run.id, child_run.get_metrics())
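The registration step in the next section assumes a best_run handle and its metrics; with a HyperDrive run they can typically be obtained like this (a short sketch):

# Get the child run with the best value of the primary metric ('Accuracy')
best_run = run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
print('Best run:', best_run.id)
print('Accuracy:', best_run_metrics.get('Accuracy'))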

With good results on the validation dataset:

import tensorflow as tf
from tensorflow import keras
import matplotlib.image as mpimg
from tensorflow.keras.preprocessing import image
from matplotlib import pyplot as plt
import numpy as np
from colored import fg, bg, attr
import seaborn as sns
from sklearn.metrics import confusion_matrix

classnames = ['COVID19','NORMAL','PNEUMONIA']

Y_pred = model.predict(validation_dataset)
y_pred = np.argmax(Y_pred, axis=1)
y_actual = validation_dataset.classes
cm = confusion_matrix(y_actual, y_pred)

fig, ax = plt.subplots(figsize=(5,5))
heatmap = sns.heatmap(cm, annot=True, fmt='d', cmap="Greens")
plt.title('Y Actual, X predicted')
plt.show()
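Beyond the confusion matrix, a per-class precision/recall summary is handy with imbalanced medical data; a quick sketch using scikit-learn:

from sklearn.metrics import classification_report

# Per-class precision, recall and F1 for the validation predictions
print(classification_report(y_actual, y_pred, target_names=classnames))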

filenames = tf.io.gfile.glob(str('./covid-img/validation/*/*'))
for filename in filenames:
    img = image.load_img(filename, target_size=(img_height, img_width))
    img_array = image.img_to_array(img)
    img_array = tf.expand_dims(img_array, 0)  # Create a batch
    predictions = model.predict(img_array)
    score = tf.nn.softmax(predictions[0])
    if classnames[np.argmax(score)] not in filename:
        print("%s %s belongs to %s." % (fg(1), filename, classnames[np.argmax(score)]))
    else:
        print("%s %s belongs to %s." % (fg(28), filename, classnames[np.argmax(score)]))

Red lines are wrong predictions of the model and green ones are correct.

Register and Deploy

Now that you’ve found the best run, you can register the trained model.

from azureml.core import Model

# Register the model
best_run.register_model(
    model_path='outputs/covid_model.pkl',
    model_name='covid_model',
    tags={'Training context': 'Hyperdrive'},
    properties={'Accuracy': best_run_metrics['Accuracy']}
)

# List registered models
for model in Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print('\t', tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print('\t', prop_name, ':', prop)
    print('\n')

We’re going to create a web service to be deployed. This scoring script loads the registered model and returns predictions.

%%writefile score_covid.py
import tensorflow as tf
import json
import joblib
import numpy as np
from azureml.core.model import Model
from tensorflow.keras.models import load_model

# Called when the service is loaded
def init():
    global model
    # Get the path to the deployed model file and load it
    model_path = Model.get_model_path('covid_model')
    model = load_model(model_path)

# Called when a request is received
def run(raw_data):
    # Get the input data as a numpy array
    data = np.array(json.loads(raw_data)['data'])
    data = tf.expand_dims(data, 0)
    # Make a prediction
    y_hat = model.predict(data)
    classnames = ['COVID19','NORMAL','PNEUMONIA']
    score = tf.nn.softmax(y_hat[0])
    return classnames[np.argmax(score)]

We also have to prepare the environment:

from azureml.core.conda_dependencies import CondaDependencies

# Add the dependencies for our model
myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")
myenv.add_conda_package("tensorflow")
myenv.add_conda_package("keras")
myenv.add_conda_package("matplotlib")

# Save the environment config as a .yml file
env_file = "./covid_env.yml"
with open(env_file, "w") as f:
    f.write(myenv.serialize_to_string())
print("Saved dependency info in", env_file)

# Print the .yml file
with open(env_file, "r") as f:
    print(f.read())

And finally, deploy it:

from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig

# Configure the scoring environment
inference_config = InferenceConfig(runtime="python",
                                   source_directory=folder_name,
                                   entry_script="score_covid.py",
                                   conda_file="covid_env.yml")
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                       memory_gb=1)

service_name = "covid-service"
service = Model.deploy(ws, service_name, [model],
                       inference_config, deployment_config)
service.wait_for_deployment(True)
print(service.state)

And this is the Azure ML Studio info:
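Before calling the REST endpoint, you can optionally sanity-check the deployed service through the SDK (a sketch; the sample image path is illustrative):

import json
from tensorflow.keras.preprocessing import image

# Load one validation image and send it to the service through the SDK
img = image.load_img('./covid-img/validation/COVID19/sample.png', target_size=(180, 180))
payload = json.dumps({"data": image.img_to_array(img).tolist()})
print(service.run(input_data=payload))
# service.get_logs() helps with debugging if the call fails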

Using the Trained Model

import requests
import tensorflow as tf
import json
from tensorflow.keras.preprocessing import image

path = './covid-img/validation/COVID1936.jpg'
img = image.load_img(path, target_size=(180, 180))
img_array = image.img_to_array(img)
input_json = json.dumps({"data": img_array.tolist()})

endpoint = service.scoring_uri
print('endpoint:', endpoint)

# Set the content type
headers = {'Content-Type': 'application/json'}
predictions = requests.post(endpoint, input_json, headers=headers)
print("prediction:", predictions.json())

That’s all, I hope it helps.

See you soon !!


Eva Andres

Senior Manager, Full-Stack Architect, specialized in AI