Relational Database for Automated Machine Learning

Will Spagnoli 6 Reputation points
2020-07-17T13:30:43.703+00:00

I'm trying to build a time-series Machine Learning experiment in Azure Machine Learning. However, my inputs are the outputs of previous functions that analyze multiple factors sharing the same timestamp. For example, I extract all key phrases from customer surveys and use them to forecast future sales. This creates a new row for each key phrase found, each with all of the other survey data points and the same timestamp. That causes an error, because multiple rows with duplicate timestamps are forecasting the same target value. I need to either put each timestamp/survey on one row, convert the key phrase column to a list/array, and have the experiment iterate through each key phrase in that column, or use a relational database where the key phrases column is a foreign key to my table of key phrases. Any recommendations on how to solve this? Thanks!
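To make the "one row per timestamp/survey" option concrete, here is a minimal pandas sketch. The column names (`timestamp`, `sales`, `key_phrase`) and the sample values are hypothetical, and this only reshapes the data; it does not call Azure Machine Learning. Option A gathers the key phrases into a list per timestamp, and Option B turns each key phrase into its own count column, which is usually easier for a tabular model to consume.

```python
import pandas as pd

# Hypothetical long-format data: one row per (survey, key phrase),
# so timestamps and the sales target are repeated.
long_df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2020-07-01", "2020-07-01", "2020-07-01", "2020-07-02", "2020-07-02"]
        ),
        "sales": [100.0, 100.0, 100.0, 120.0, 120.0],
        "key_phrase": [
            "fast shipping",
            "great price",
            "fast shipping",
            "great price",
            "slow support",
        ],
    }
)

# Option A: one row per timestamp, key phrases gathered into a list.
as_lists = long_df.groupby("timestamp", as_index=False).agg(
    sales=("sales", "first"),
    key_phrases=("key_phrase", list),
)

# Option B: one row per timestamp, one count column per key phrase.
phrase_counts = pd.crosstab(long_df["timestamp"], long_df["key_phrase"])
wide_df = (
    long_df.groupby("timestamp")[["sales"]]
    .first()
    .join(phrase_counts)
    .reset_index()
)

print(as_lists)
print(wide_df)
```

Either result has unique timestamps, which removes the duplicate-timestamp condition described above. Taking the `first` sales value assumes the target is identical across the repeated rows.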

Azure Machine Learning

1 answer

  1. Ramr-msft 17,736 Reputation points
    2020-07-23T13:27:08.11+00:00

    @WillSpagnoli-5705 Is the idea that these key phrases will be used to help predict the sale, or are they being added with the goal of helping explain the prediction? We believe these phrases, if used for prediction, will not be very useful, because the survey may not be known for future dates, so we'd be predicting both the survey phrases and the sale. If the time delta is small, or the survey keywords are generally the same across time, the feature may provide more value.
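One possible way to work with the concern above, offered as an assumption rather than something the answer recommends, is to use only survey information that is already known when the forecast is made, for example by lagging the survey feature by one period. The sketch below uses pandas with hypothetical column names and values.

```python
import pandas as pd

# Hypothetical daily data, already reshaped to one row per timestamp.
daily = pd.DataFrame(
    {
        "timestamp": pd.date_range("2020-07-01", periods=4, freq="D"),
        "sales": [100.0, 120.0, 90.0, 110.0],
        "fast_shipping_mentions": [2, 0, 1, 3],
    }
).sort_values("timestamp")

# Shift the survey feature by one period so each row only sees
# survey information from the previous day.
daily["fast_shipping_mentions_lag1"] = daily["fast_shipping_mentions"].shift(1)

# The first row has no previous survey, so drop it before training.
train = daily.dropna(subset=["fast_shipping_mentions_lag1"])

print(train)
```

Whether a lagged feature carries any signal depends on how stable the survey keywords are over time, which is the same condition the answer raises.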
