POST https://api-lib.bambu.life/api/autoMl/v2/mlp
Description
This endpoint trains a model on the data using an MLP (multi-layer perceptron), implemented as a component of auto-sklearn.
-
It allows defining the hidden-layer depth (default 1), the number of nodes per layer (range 16-216), the activation function (identity, logistic, tanh, relu), and the solver (lbfgs, sgd, adam).
-
Main parameters: a) timeLeftForThisTask: the total time for running this classification task (it can be set to a smaller number to estimate how long a run lasts); b) perRunTimeLimit: the time limit for a single run.
-
The model can be saved for evaluation and prediction (note that the pickle file can be large).
-
Output:
a) training score
b) evaluation: accuracy, classification report, multilabel confusion matrix
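The two timing parameters correspond to auto-sklearn's search budget. Below is a minimal local sketch of an equivalent auto-sklearn call, assuming auto-sklearn >= 0.14 and synthetic placeholder data; the service's actual internals are not documented here, so treat this as illustrative only.

```python
# Minimal sketch of an equivalent local auto-sklearn run (assumption:
# auto-sklearn >= 0.14; the endpoint's real implementation may differ).
from autosklearn.classification import AutoSklearnClassifier
from sklearn.datasets import make_classification

# Synthetic placeholder data, for illustration only.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

automl = AutoSklearnClassifier(
    time_left_for_this_task=60,       # corresponds to timeLeftForThisTask (seconds)
    per_run_time_limit=30,            # corresponds to perRunTimeLimit (seconds)
    include={"classifier": ["mlp"]},  # restrict the search to the MLP component
)
automl.fit(X, y)
print(automl.score(X, y))             # analogous to the trainingScore output
```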
Request Body
Name | Datatype | Description | Mandatory | Sample values | List of possible values | No. of decimal places | Notes |
---|---|---|---|---|---|---|---|
clientId | String | Used for identification purposes, as information is stored under a unique ID value | Yes | | | | |
timeLeftForThisTask | Integer | Total training time: the time limit in seconds for the search for the combination of models with the best performance evaluation. Increasing the time gives a higher chance of finding better models | Yes | 30 | >=30 | 0 | |
perRunTimeLimit | Integer | The time limit in seconds for training a single machine learning model to find the optimum tuning. Model fitting is terminated if the algorithm runs over the limit. Set the limit high enough that the algorithm can fit the training data and find optimum hyperparameters | Yes | 1000 | >0 | 0 | |
metric | String | The metric used to measure and evaluate the performance of the machine learning algorithm. Different metrics give different performance results for the trained model | Yes | "precision" | 'recall', 'precision', 'accuracy', 'f1Macro', 'f1Sample' | | |
evaluation | Boolean | If True, the output contains validation results based on the selected metric as well as training results. If False, the output contains only training results and the training score | Yes | true | true or false | | |
saveModel | Boolean | If True, the trained model is saved in the database | Yes | true | true or false | | |
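For reference, a sketch of a request carrying all mandatory fields; the Authorization header and the clientId value are illustrative assumptions, not documented here.

```python
# Sketch of a request with all mandatory fields. The Authorization header
# and the clientId value are assumptions for illustration; substitute your own.
import requests

payload = {
    "clientId": "your-client-id",  # hypothetical placeholder
    "timeLeftForThisTask": 30,     # seconds, >= 30
    "perRunTimeLimit": 1000,       # seconds, > 0
    "metric": "precision",         # recall, precision, accuracy, f1Macro, or f1Sample
    "evaluation": True,            # include validation results in the output
    "saveModel": True,             # persist the model (the pickle file can be large)
}

resp = requests.post(
    "https://api-lib.bambu.life/api/autoMl/v2/mlp",
    json=payload,
    headers={"Authorization": "Bearer <token>"},  # assumed auth scheme
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```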
Response Body
Name | Datatype | Description | Sample Value | No. of decimal places | Notes |
---|---|---|---|---|---|
classificationReport | Dictionary | A report containing performance measures of the algorithm. The dictionary contains 2 main keys: avgs and labels. avgs maps each average name (macro avg, micro avg, samples avg, weighted avg) to the metrics precision, recall, f1-score and support. labels maps each goalType value to the same metrics | "classification_report": { "avgs": { "macro avg": { "f1-score": 0.35233, "precision": 0.45698, "recall": 0.33252, "support": 756 }, "micro avg": { "f1-score": 0.68188, "precision": 0.73994, "recall": 0.63228, "support": 756 }, "samples avg": { "f1-score": 0.65133, "precision": 0.71667, "recall": 0.66, "support": 756 }, "weighted avg": { "f1-score": 0.61201, "precision": 0.6255, "recall": 0.63228, "support": 756 } }, "labels": { "business": { "f1-score": 0, "precision": 0, "recall": 0, "support": 16 }, ... "wedding": { "f1-score": 0.52727, "precision": 0.78378, "recall": 0.39726, "support": 73 } } } | number: 5 dp | |
f1-score | Number | F1 Score is the weighted average of Precision and Recall, so it takes both false positives and false negatives into account. F1 is usually more useful than accuracy, especially with an uneven class distribution. Accuracy works best if false positives and false negatives have similar cost; if their costs are very different, it is better to look at both Precision and Recall. F1 Score = 2 * (Recall * Precision) / (Recall + Precision) | | 5 | |
precision | Number | Precision is the ratio of correctly predicted positive observations to the total predicted positive observations. High precision relates to a low false positive rate. Precision = TP / (TP + FP) | | 5 | |
recall | Number | Recall is the ratio of correctly predicted positive observations to all positive observations in the actual class. Recall = TP / (TP + FN) | | 5 | |
support | Integer | Support is the number of occurrences of each class in y_true | | 0 | |
evaluationScore | Number | The validation score, computed on the validation dataset | 0.45698 | 5 | |
multilabelConfusionMatrix | Array of dictionaries | A confusion matrix computed for each class (for this scenario, each goalType) in the dataset. Each entry has the keys name (the goalType value), falseNegative, falsePositive, trueNegative and truePositive | "multilabelConfusionMatrix": [ { "falseNegative": 41, "falsePositive": 31, "name": "retirement", "trueNegative": 205, "truePositive": 23 }, { "falseNegative": 29, "falsePositive": 149, "name": "growWealth", "trueNegative": 63, "truePositive": 59 }, ... ] | 0 | |
trainingScore | Number | The training score, computed on the training dataset | 0.41273 | 5 | |
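The counts in each multilabelConfusionMatrix entry are sufficient to recompute the per-label precision, recall, and F1 score from the formulas above. A small sketch, using the retirement entry from the sample response:

```python
# Recompute per-label metrics from a multilabelConfusionMatrix entry,
# using the formulas documented above (counts from the sample response).
entry = {"name": "retirement", "truePositive": 23, "falsePositive": 31,
         "falseNegative": 41, "trueNegative": 205}

tp, fp, fn = entry["truePositive"], entry["falsePositive"], entry["falseNegative"]
precision = tp / (tp + fp)                            # TP / (TP + FP)
recall = tp / (tp + fn)                               # TP / (TP + FN)
f1 = 2 * (recall * precision) / (recall + precision)  # 2(R * P) / (R + P)

print(f"{entry['name']}: precision={precision:.5f}, "
      f"recall={recall:.5f}, f1-score={f1:.5f}")
```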