Cross-Validation |
A technique used to assess the performance of a predictive model by splitting the data into multiple subsets, then repeatedly training on some subsets and testing on the one held out. |
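As a minimal sketch of the splitting step, the function below produces k train/test index splits in plain Python. The name `k_fold_indices` and the sample sizes are illustrative assumptions, not part of any particular library.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical data set of 10 samples split into 5 folds.
folds = list(k_fold_indices(n_samples=10, k=5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample lands in the test set exactly once, so averaging the score over the folds uses all of the data for evaluation.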
Data Cleaning |
The process of removing errors and inconsistencies from the collected data before further analysis. |
Data Collection |
The first step in the predictive modelling process, involving gathering relevant data from various sources. |
Data Preparation |
The process of cleaning, transforming, and organizing data before building a predictive model. |
Decision Tree |
A flowchart-like structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label (or, for regression, a predicted value). |
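A small hand-written tree can make the node, branch, and leaf roles concrete. The attributes (`outlook`, `humidity`), the threshold, and the class labels below are made up for illustration; a real tree would learn them from data.

```python
def classify_weather(outlook, humidity):
    """A hand-written decision tree with two internal nodes and three leaves."""
    # Internal node: test on the "outlook" attribute.
    if outlook == "sunny":
        # Internal node: test on the "humidity" attribute.
        if humidity > 70:
            return "stay in"  # leaf: class label
        return "play"         # leaf: class label
    # Branch for every other outlook value.
    return "play"             # leaf: class label


print(classify_weather("sunny", 80))
print(classify_weather("overcast", 80))
```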
Deployment |
The stage where the trained model is put into practical use to make predictions on new data. |
Ensemble Learning |
A machine learning technique that combines multiple models to improve the overall performance and accuracy of the prediction. |
Feature Selection |
The process of choosing the most relevant input variables to be used in the predictive model. |
Gradient Boosting |
An ensemble learning method that builds a model in a stage-wise manner, with each new model addressing the errors of the previous models. |
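The stage-wise idea can be sketched for one-dimensional regression with squared error, where each stage fits a one-split tree (a stump) to the residuals left by the stages before it. The function names, the learning rate, and the toy data are assumptions chosen for illustration; library implementations differ in many details.

```python
def fit_stump(xs, residuals):
    """Return the one-split predictor that best fits the residuals (squared error)."""
    best = None
    # Skip the largest x so the right-hand side of the split is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_stages=20, learning_rate=0.5):
    """Build an additive model: the mean of ys plus a sum of scaled stumps."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_stages):
        # Each new stump is trained on what the current model still gets wrong.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Toy data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = gradient_boost(xs, ys)
print([round(model(x), 2) for x in xs])
```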
K-Nearest Neighbors |
A non-parametric method used for classification and regression that makes a prediction for a data point from its k nearest neighbors: the majority class for classification, or the average of the neighbors' values for regression. |
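The classification case can be sketched in a few lines using Euclidean distance. The function name `knn_classify`, the choice of k = 3, and the toy points are illustrative assumptions.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Two small, well-separated clusters of hypothetical 2-D points.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify(points, labels, query=(2, 2)))
print(knn_classify(points, labels, query=(6, 5)))
```

No model is fitted in advance: all the work happens at prediction time, which is why the method is often described as lazy learning.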
Logistic Regression |
A statistical method used to model binary outcomes by estimating the probability that a given outcome is present. |
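The probability estimate comes from passing a weighted sum of the inputs through the logistic (sigmoid) function. The sketch below shows only that prediction step; the weights and bias are made-up values standing in for parameters that training would normally estimate.

```python
import math


def predict_probability(features, weights, bias):
    """Return the estimated probability that the outcome is present."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    # The logistic function maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted parameters for a two-feature model.
weights = [0.8, -0.4]
bias = -0.2

p = predict_probability([2.0, 1.0], weights, bias)
print(round(p, 3), "-> predicted class:", int(p >= 0.5))
```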
Model Evaluation |
The process of assessing the performance of the predictive model using validation data sets. |
Model Training |
The stage where the predictive model is developed using the selected features and training data. |
Model Tuning |
The step where adjustments are made to the model, such as changing its hyperparameters, to improve its predictive accuracy and performance. |
Monitoring |
The ongoing process of evaluating the model's performance and making updates as needed to maintain accuracy. |
Neural Network |
A model made up of layers of interconnected nodes, loosely inspired by neurons in the brain, that learns by adjusting the weights of the connections between nodes. |
Predictive Modelling Process |
The process of using data and statistical algorithms to create models that predict future outcomes or trends. |
Random Forest |
An ensemble learning method that constructs a multitude of decision trees at training time and outputs the mode of the trees' predicted classes for classification, or the mean of their predictions for regression. |
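The sketch below illustrates only the final voting step for classification, not how the trees are grown. The three "trees" are hand-written stand-ins with invented feature names (`links`, `caps_ratio`, `length`); in a real forest each tree would be trained on a random sample of the data and features.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the mode of the class labels predicted by the individual trees."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees.
trees = [
    lambda s: "spam" if s["links"] > 3 else "ham",
    lambda s: "spam" if s["caps_ratio"] > 0.5 else "ham",
    lambda s: "spam" if s["length"] < 20 else "ham",
]

sample = {"links": 5, "caps_ratio": 0.2, "length": 10}
print(forest_predict(trees, sample))
```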
Regression Analysis |
A statistical method used to examine the relationship between one dependent variable and one or more independent variables. |
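For the simplest case of one independent variable, the relationship can be estimated with ordinary least squares. The function name `fit_line` and the data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The slope estimates how much the dependent variable changes for a one-unit change in the independent variable.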
Support Vector Machine |
A supervised machine learning algorithm that classifies data into different classes by finding the hyperplane that best separates the data points, that is, the one with the largest margin between the classes. |
Time Series Forecasting |
A technique used to forecast future values based on past observations ordered in time. |
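One of the simplest forecasting approaches is a moving average, which predicts the next value as the mean of the most recent observations. The function name, the window size, and the series below are illustrative assumptions; practical forecasting models usually also account for trend and seasonality.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)


# Hypothetical observations, oldest first.
series = [10, 12, 14, 16, 18]
forecast = moving_average_forecast(series, window=3)
print(forecast)
```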