Printable Flashcards

Predictive Modelling Process	Data Collection
Data Cleaning	Feature Selection
Model Training	Model Evaluation
Model Tuning	Deployment

The first step in the predictive modelling process, involving gathering relevant data from various sources.	The process of using data and statistical algorithms to create models that predict future outcomes or trends.
The process of choosing the most relevant input variables to be used in the predictive model.	The process of removing errors and inconsistencies from the collected data before further analysis.
The process of assessing the performance of the predictive model using validation data sets.	The stage where the predictive model is developed using the selected features and training data.
Where the model is put into practical use for making predictions.	The step where adjustments are made to the model to improve its predictive accuracy and performance.

The process of cleaning, transforming, and organizing data before building a predictive model.	The ongoing process of evaluating the model's performance and making updates as needed to maintain accuracy.
A statistical method used to examine the relationship between one dependent variable and one or more independent variables.	A technique used to assess the performance of a predictive model by splitting the data into multiple subsets.
An ensemble learning method that constructs a multitude of decision trees at training time and outputs the mode of the classes as the prediction.	A flowchart-like structure in which each internal node represents a test on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label.
A network of interconnected nodes, similar to neurons in the brain, that processes information by mimicking the way the human brain functions.	A supervised machine learning algorithm that classifies data into different classes by finding the hyperplane that best separates the data points.

A non-parametric method used for classification and regression that classifies a data point based on the majority class of its k nearest neighbors.	A statistical method used to model binary outcomes by estimating the probability that a given outcome is present.
A technique used to forecast future values based on past data points in time order.	An ensemble learning method that builds a model in a stage-wise manner, with each new model addressing the errors of the previous models.
	A machine learning technique that combines multiple models to improve the overall performance and accuracy of the prediction.