# Glossary: Report Column Definitions

Column	Definition
key	A unique ID representing the model identified. This key can be broken apart by `__` to identify the separate pipeline elements and finally which rank it received within the search (e.g., grid or random)
scaler	Identifies the scaler used within the pipeline
feature_selector	Identifies the feature selector used within the pipeline
algorithm	Identifies the algorithm used within the pipeline
searcher	Identifies the hyper parameter search method used within the pipeline
scorer	Identifies the scoring method used to assess models within the search
accuracy	The models accuracy against the generalization dataset
acc_95_ci	The models 95% CI for accuracy against the generalization dataset (reported as an array representing lower and upper bounds respectively)
mcc	Matthews correlation coefficient or the mean square contingency coefficient which measures the quality of the binary classification assessed against the generalization dataset
avg_sn_sp	The models average sensitivity and specificity against the generalization dataset
roc_auc	The models ROC AUC score against the generalization dataset
roc_auc_95_ci	The models 95% CI for ROC AUC against the generalization dataset (reported as an array representing lower and upper bounds respectively)
f1	The models F1 score against the generalization dataset
sensitivity	The models sensitivity against the generalization dataset
sn_95_ci	The models 95% CI for sensitivity against the generalization dataset (reported as an array representing lower and upper bounds respectively)
specificity	The models specificity against the generalization dataset
sp_95_ci	The models 95% CI for specificity against the generalization dataset (reported as an array representing lower and upper bounds respectively)
prevalence	The models prevalence against the generalization dataset
pr_95_ci	The models 95% CI for prevalence against the generalization dataset (reported as an array representing lower and upper bounds respectively)
ppv	The models positive predictive value against the generalization dataset
ppv_95_ci	The models 95% CI for positive predictive value against the generalization dataset (reported as an array representing lower and upper bounds respectively)
npv	The models negative predictive value against the generalization dataset
npv_95_ci	The models 95% CI for negative predictive value against the generalization dataset (reported as an array representing lower and upper bounds respectively)
tn	The number of true negatives identified by the model against the generalization dataset
tp	The number of true positives identified by the model against the generalization dataset
fn	The number of false negatives identified by the model against the generalization dataset
fp	The number of false positives identified by the model against the generalization dataset
selected_features	The features selected by the feature selector of the pipeline
feature_scores	The score or importance of each feature
best_params	The hyper parameters found by the pipeline's search for the model identified
test_fpr	An array representing the false positive rate at various threshold values (used to plot an ROC AUC curve) for the training dataset
test_tpr	An array representing the true positive rate at various threshold values (used to plot an ROC AUC curve) for the training dataset
training_roc_auc	The models ROC AUC score against the training dataset using a train/test split with cross validation
roc_delta	The absolute value of the difference between the generalization ROC AUC score and the training ROC AUC score
generalization_fpr	An array representing the false positive rate at various threshold values (used to plot an ROC AUC curve) for the generalization dataset
generalization_tpr	An array representing the true positive rate at various threshold values (used to plot an ROC AUC curve) for the generalization dataset
brier_score	The models Brier score against the generalization dataset
fop	An array representing the fraction of positives at various threshold values (used to plot a reliability curve) for the generalization dataset
mpv	An array representing the mean predicated probability at various threshold values (used to plot a reliability curve) for the generalization dataset
precision	An array representing the precision (aka positive predictive value) at various threshold values (used to plot a precision recall curve) for the generalization dataset
recall	An array representing the recall (aka sensitivity) at various threshold values (used to plot a precision recall curve) for the generalization dataset

← Glossary: Common machine learning terms in MILO-ML Glossary: Performance Column Definitions →