How to fix this while integrating Machine Learning UDF function to Azure Stream Analytics

Pradyumn Joshi 40 Reputation points
2024-06-04T15:28:29.2+00:00

Hi,

I am trying to integrate my ML endpoint with Azure Stream Analytics, but I cannot save the function with the details below:

User's image

Here is my scoring script for the endpoint:

import os
import logging
import json
import joblib
from azureml.core.model import Model


from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType

def init():
    global model
    global vectorizer
    logging.info(os.getenv("AZUREML_MODEL_DIR"))
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "trainingModel/sentimentPredictionModel.pkl")
    vectorizer_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "trainingModel/sentimentVectorizer.pkl")
    
    # Load the models from the registered location
    model = joblib.load(model_path)
    vectorizer = joblib.load(vectorizer_path)
    logging.info("Init complete")
    
# Define the input schema
input_sample = "I like this"
input_schema_type = StandardPythonParameterType({'data':input_sample})

# Define the output schema
output_sample = {"prediction": [1]}
output_schema_type = StandardPythonParameterType(output_sample)

@input_schema('data', input_schema_type)
@output_schema(output_schema_type)
def run(data):
    print(data)
    logging.info(data)
    #value = json.loads(data)['data']
    #transformed_text = vectorizer.transform([value])
    transformed_text = vectorizer.transform([data])
    prediction = model.predict(transformed_text)
    return json.dumps({"prediction": prediction.tolist()})


According to my script, the schema it requires is {"data":"I like this"} as input and {"prediction":[1]} as output.

How can I correct this and save the function so that I can invoke it in the Stream Analytics job?

Azure Machine Learning
An Azure machine learning service for building and deploying models.
Azure Stream Analytics
An Azure real-time analytics service designed for mission-critical workloads.

1 answer

  1. Sander van de Velde | MVP 30,711 Reputation points MVP
    2024-06-04T18:03:34.5233333+00:00

    Hello @Pradyumn Joshi,

    Welcome to this moderated Azure community forum.

    The first possible issue is that the following call may not produce valid JSON:

    json.dumps({"prediction": prediction.tolist()})
    

    It's not clear to me what 'prediction' looks like or what 'prediction.tolist()' produces.

    Perhaps the problem is just missing quotation marks:

    {"data": I like this}
    

    This is not correct JSON.
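
This can be checked quickly with Python's standard json module. The following is a minimal sketch, not part of the original scoring script; the helper name is_valid_json is hypothetical and used only for illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The string value is missing its quotation marks, so the parser rejects it.
unquoted = '{"data": I like this}'
# The same payload with the value quoted is accepted.
quoted = '{"data": "I like this"}'

print("unquoted is valid:", is_valid_json(unquoted))
print("quoted is valid:", is_valid_json(quoted))
```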

    Please add some debug lines and check the output of 'json.dumps()'.

    Then I noticed that you mention '{"data":"I like this"}', while the code outputs '{"prediction": prediction.tolist()}'.

    Finally, Stream Analytics is complaining about yet another format, one containing 'inputName' or 'Inputs'.
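
Such debug lines could look roughly like the sketch below. It uses only the standard library; FakePrediction is a hypothetical stand-in for whatever model.predict() returns, so the real values and types may differ. The second half rests on an assumption: if the serving layer serializes the return value of run() a second time, an already-encoded string ends up as a JSON string rather than a JSON object.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)


class FakePrediction:
    """Hypothetical stand-in for the array returned by model.predict()."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


prediction = FakePrediction([1])

# Debug line 1: inspect what json.dumps() actually produces.
encoded_once = json.dumps({"prediction": prediction.tolist()})
logging.info("type=%s value=%s", type(encoded_once).__name__, encoded_once)

# Assumption: the hosting layer may JSON-serialize the return value of run() again.
# Encoding an already-encoded string yields a quoted JSON string, not an object.
encoded_twice = json.dumps(encoded_once)
logging.info("double-encoded value=%s", encoded_twice)

# Debug line 2: check what a consumer would get back after decoding each variant.
decoded_once = json.loads(encoded_once)
decoded_twice = json.loads(encoded_twice)
logging.info("decoded once -> %s", type(decoded_once).__name__)
logging.info("decoded twice-encoded -> %s", type(decoded_twice).__name__)
```

If the logged response from the endpoint resembles the double-encoded variant, returning the dictionary itself from run() instead of the json.dumps() string may be worth trying.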


    If the response helped, do "Accept Answer". If it doesn't work, please let us know the progress. All community members with similar issues will benefit by doing so. Your contribution is highly appreciated.