| key | A unique ID representing the model identified. This key can be broken apart by __ to identify the separate pipeline elements and finally which rank it received within the search (e.g., grid or random) |
| scaler | Identifies the scaler used within the pipeline |
| feature_selector | Identifies the feature selector used within the pipeline |
| algorithm | Identifies the algorithm used within the pipeline |
| searcher | Identifies the hyper parameter search method used within the pipeline |
| scorer | Identifies the scoring method used to assess models within the search |
| accuracy | The models accuracy against the generalization dataset |
| acc_95_ci | The models 95% CI for accuracy against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| mcc | Matthews correlation coefficient or the mean square contingency coefficient which measures the quality of the binary classification assessed against the generalization dataset |
| avg_sn_sp | The models average sensitivity and specificity against the generalization dataset |
| roc_auc | The models ROC AUC score against the generalization dataset |
| roc_auc_95_ci | The models 95% CI for ROC AUC against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| f1 | The models F1 score against the generalization dataset |
| sensitivity | The models sensitivity against the generalization dataset |
| sn_95_ci | The models 95% CI for sensitivity against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| specificity | The models specificity against the generalization dataset |
| sp_95_ci | The models 95% CI for specificity against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| prevalence | The models prevalence against the generalization dataset |
| pr_95_ci | The models 95% CI for prevalence against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| ppv | The models positive predictive value against the generalization dataset |
| ppv_95_ci | The models 95% CI for positive predictive value against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| npv | The models negative predictive value against the generalization dataset |
| npv_95_ci | The models 95% CI for negative predictive value against the generalization dataset (reported as an array representing lower and upper bounds respectively) |
| tn | The number of true negatives identified by the model against the generalization dataset |
| tp | The number of true positives identified by the model against the generalization dataset |
| fn | The number of false negatives identified by the model against the generalization dataset |
| fp | The number of false positives identified by the model against the generalization dataset |
| selected_features | The features selected by the feature selector of the pipeline |
| feature_scores | The score or importance of each feature |
| best_params | The hyper parameters found by the pipeline's search for the model identified |
| test_fpr | An array representing the false positive rate at various threshold values (used to plot an ROC AUC curve) for the training dataset |
| test_tpr | An array representing the true positive rate at various threshold values (used to plot an ROC AUC curve) for the training dataset |
| training_roc_auc | The models ROC AUC score against the training dataset using a train/test split with cross validation |
| roc_delta | The absolute value of the difference between the generalization ROC AUC score and the training ROC AUC score |
| generalization_fpr | An array representing the false positive rate at various threshold values (used to plot an ROC AUC curve) for the generalization dataset |
| generalization_tpr | An array representing the true positive rate at various threshold values (used to plot an ROC AUC curve) for the generalization dataset |
| brier_score | The models Brier score against the generalization dataset |
| fop | An array representing the fraction of positives at various threshold values (used to plot a reliability curve) for the generalization dataset |
| mpv | An array representing the mean predicated probability at various threshold values (used to plot a reliability curve) for the generalization dataset |
| precision | An array representing the precision (aka positive predictive value) at various threshold values (used to plot a precision recall curve) for the generalization dataset |
| recall | An array representing the recall (aka sensitivity) at various threshold values (used to plot a precision recall curve) for the generalization dataset |