Predictive Modelling Process | The process of using data and statistical algorithms to build models that predict future outcomes or trends. |
Data Collection | The first step in the predictive modelling process: gathering relevant data from various sources. |
Data Cleaning | The process of removing errors and inconsistencies from the collected data before further analysis. |
Feature Selection | The process of choosing the most relevant input variables to use in the predictive model. |
Model Training | The stage where the predictive model is fitted to the training data using the selected features. |
Model Evaluation | The process of assessing the performance of the predictive model on validation data sets. |
Model Tuning | The step where the model is adjusted to improve its predictive accuracy and performance. |
Deployment | The stage where the model is put into practical use for making predictions. |
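As a rough illustration of how data collection, data cleaning, model training, and model evaluation fit together, here is a minimal sketch in plain Python. The data is made up for the example (an exact line, y = 2x + 1, plus one incomplete record), and the helper names `train_validation_split`, `fit_line`, and `mean_squared_error` are hypothetical, not taken from any library. Feature selection and model tuning are left out because the toy data has a single input variable.

```python
import random

def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def fit_line(rows):
    """Model training: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(model, rows):
    """Model evaluation: average squared prediction error on the given rows."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)

# Data collection (made-up data): 20 clean records and one incomplete record.
raw = [(x, 2 * x + 1) for x in range(20)] + [(None, 5)]

# Data cleaning: drop records with missing values.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Model training on one part of the data, evaluation on the held-out part.
train, validation = train_validation_split(clean)
model = fit_line(train)
error = mean_squared_error(model, validation)
print(model, error)
```

Because the toy data lies exactly on a line, the validation error is essentially zero. With real data, this held-out error is the number that model tuning would try to reduce.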
Monitoring | The ongoing process of evaluating the deployed model's performance and updating it as needed to maintain accuracy. |
Data Preparation | The process of cleaning, transforming, and organizing data before building a predictive model. |
Cross-Validation | A technique for assessing a predictive model by splitting the data into multiple subsets, then repeatedly training on some subsets and testing on the remaining one. |
Regression Analysis | A statistical method used to examine the relationship between one dependent variable and one or more independent variables. |
Decision Tree | A flowchart-like structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label. |
Random Forest | An ensemble learning method that constructs a multitude of decision trees at training time and outputs the mode of the trees' predicted classes for classification, or the mean of their predictions for regression. |
Support Vector Machine | A supervised machine learning algorithm that classifies data by finding the hyperplane that separates the classes with the largest margin. |
Neural Network | A network of interconnected nodes, loosely inspired by neurons in the brain, arranged in layers whose connection weights are learned from data. |
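To make the cross-validation card concrete, the following sketch splits the sample indices into k contiguous folds and averages a score over them. It is one possible version under stated assumptions: the data is not shuffled first, and `k_fold_indices`, `cross_validate`, `fit_mean`, and `mean_absolute_error` are hypothetical helper names introduced for illustration. The "model" is just the mean of the training targets, so the example stays self-contained.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validate(xs, ys, k, fit, score):
    """Train on k-1 folds, score on the held-out fold, and average the scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx]))
    return sum(scores) / len(scores)

def fit_mean(xs, ys):
    """Toy model: always predict the mean of the training targets."""
    return sum(ys) / len(ys)

def mean_absolute_error(model, xs, ys):
    """Average absolute difference between the constant prediction and the targets."""
    return sum(abs(model - y) for y in ys) / len(ys)

folds = list(k_fold_indices(10, 3))
xs = list(range(6))
ys = [2.0] * 6  # made-up constant targets
average_error = cross_validate(xs, ys, 3, fit_mean, mean_absolute_error)
print([len(test) for _, test in folds], average_error)
```

Every sample lands in exactly one test fold, so each data point is used for evaluation once and for training k-1 times.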
Logistic Regression | A statistical method used to model binary outcomes by estimating the probability that a given outcome is present. |
K-Nearest Neighbors | A non-parametric method used for classification and regression that classifies a data point by the majority class of its k nearest neighbors, or, for regression, averages their values. |
Gradient Boosting | An ensemble learning method that builds a model in a stage-wise manner, with each new model correcting the errors of the previous models. |
Time Series Forecasting | A technique used to forecast future values from past data points recorded in time order. |
Ensemble Learning | A machine learning technique that combines multiple models to improve the overall performance and accuracy of the prediction. |
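The K-Nearest Neighbors card can be sketched in a few lines: measure the distance from the query point to every training point, keep the k closest, and take a majority vote over their labels. The function name `knn_predict`, the Euclidean distance metric, and the two small clusters of points are assumptions made for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up training data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5))
near_far_cluster = knn_predict(points, labels, (5.5, 5.5))
print(near_origin, near_far_cluster)
```

Using an odd k with two classes avoids tied votes. There is no training step beyond storing the data, which is why the method is described as non-parametric.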