A decision tree is a supervised machine-learning technique for classification and regression problems. It is a tree-like model in which internal nodes represent tests on an attribute, branches represent the outcomes of those tests, and leaf nodes represent class labels or continuous values. The algorithm partitions the data recursively, at each step choosing the attribute and split that give the highest information gain (the largest reduction in entropy) or the lowest impurity. A fully grown tree is prone to overfitting, which can be reduced by pruning or by requiring a minimum number of instances per leaf node.
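
To make the splitting criteria concrete, here is a minimal sketch in base R of entropy, Gini impurity, and information gain for a single candidate split. The function names and the toy `glucose`/`outcome` vectors are made up for illustration and are not taken from the PimaIndiansDiabetes data. For reference, rpart() uses the Gini index by default for classification and can be switched with parms = list(split = "information").

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                 # drop empty classes so log2(0) never appears
  -sum(p * log2(p))
}

# Gini impurity of a vector of class labels.
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Information gain from splitting `labels` by a logical condition `left`.
information_gain <- function(labels, left) {
  n <- length(labels)
  n_left <- sum(left)
  n_right <- n - n_left
  children <- 0
  if (n_left > 0)  children <- children + (n_left / n)  * entropy(labels[left])
  if (n_right > 0) children <- children + (n_right / n) * entropy(labels[!left])
  entropy(labels) - children
}

# Toy data (hypothetical, for illustration only)
glucose <- c(85, 90, 100, 140, 150, 160)
outcome <- factor(c("neg", "neg", "neg", "pos", "pos", "pos"))

entropy(outcome)                          # 1 bit: the classes are evenly mixed
gini(outcome)                             # 0.5
information_gain(outcome, glucose < 120)  # 1: this split separates the classes perfectly
information_gain(outcome, glucose < 88)   # about 0.19: a much weaker split
```

A tree-growing algorithm evaluates many candidate splits like these at every node and keeps the best one, then repeats the process on each resulting subset.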

For the Python version, see the 'Decision Tree in Python' post on AKSTATS.

Let us look at an example R program that builds a Decision Tree classifier on the 'PimaIndiansDiabetes' dataset from the "mlbench" package. We'll also include accuracy measures and a decision tree diagram.

Before running the program, make sure the required packages are installed; use the install.packages() function if necessary.
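
One possible way to check which packages are missing before installing anything is sketched below. The helper name `missing_packages` is hypothetical, and the install call is left commented out so nothing is installed by accident.

```r
# Packages used in this post
required <- c("mlbench", "rpart", "rpart.plot", "caret")

# Return the subset of `pkgs` that is not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install them from CRAN:
  # install.packages(to_install)
}
```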

# Load required libraries
library(mlbench)
library(rpart)
library(rpart.plot)
library(caret)
We then load the "Pima Indians Diabetes" dataset using the data() function.
# Load the Pima Indians Diabetes dataset
data(PimaIndiansDiabetes)
Next, we split the data into training and testing sets using the createDataPartition() function from the "caret" package.
# Split data into training and testing sets
set.seed(123)
trainIndex = createDataPartition(PimaIndiansDiabetes$diabetes, p = 0.7, list = FALSE)
trainData = PimaIndiansDiabetes[trainIndex, ]
testData = PimaIndiansDiabetes[-trainIndex, ]
trainData
testData
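
When the outcome is a factor, createDataPartition() samples within each class so that the training and testing sets keep roughly the same class balance as the full data. The base-R sketch below illustrates that idea on a made-up outcome vector; `stratified_index` is a hypothetical helper, not caret's actual implementation.

```r
# Stratified split: sample the same fraction of rows from each class.
stratified_index <- function(y, p = 0.7) {
  rows_by_class <- split(seq_along(y), y)
  picked <- lapply(rows_by_class, function(rows) {
    rows[sample.int(length(rows), size = round(p * length(rows)))]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(123)
# Hypothetical imbalanced outcome: 60 "neg" and 40 "pos"
y <- factor(rep(c("neg", "pos"), times = c(60, 40)))

train_rows <- stratified_index(y, p = 0.7)

prop.table(table(y))               # class balance in the full data
prop.table(table(y[train_rows]))   # same balance in the training rows
prop.table(table(y[-train_rows]))  # and in the held-out rows
```

A plain random sample could, by chance, leave one class under-represented in the test set, which makes accuracy figures harder to trust on small or imbalanced data.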

We train the Decision Tree classifier using the rpart() function from the "rpart" package. The method parameter is set to "class" for classification. 
Note: Use the R help command (for example, ?rpart) to learn about the components of the fitted model.

# Train Decision Tree classifier
model = rpart(diabetes ~ ., data = trainData, method = "class")
To see more detail about the fitted model, such as the class probabilities, the loss, and the variable importance, call summary(model). The same summary() call works on most fitted model objects in R, not only on this one.
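
Beyond summary(), individual pieces of the fitted object can be read directly, and the tree can be pruned to limit overfitting. The sketch below uses the small `kyphosis` dataset that ships with rpart, only so that it runs on its own; the `minbucket` and `cp` values are illustrative assumptions, not tuned settings for the diabetes model.

```r
library(rpart)

# kyphosis is a small example dataset shipped with rpart
# (used here only so the sketch is self-contained).
set.seed(123)
fit <- rpart(
  Kyphosis ~ Age + Number + Start,
  data = kyphosis,
  method = "class",
  control = rpart.control(minbucket = 5, cp = 0.001)  # illustrative values
)

fit$variable.importance   # relative importance of each predictor
fit$cptable               # cross-validated error (xerror) per complexity value

# Prune back to the complexity value with the lowest cross-validated error.
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
pruned <- prune(fit, cp = best_cp)

nrow(fit$frame)      # number of nodes before pruning
nrow(pruned$frame)   # number of nodes after pruning (never more than before)
```

Here `minbucket` sets the minimum number of observations allowed in a leaf, and `cp` is the complexity parameter that controls how much a split must improve the fit to be kept.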

# Plot decision tree
rpart.plot(model)
# Make predictions on the test set
predictions = predict(model, testData, type = "class")
# Confusion matrix
confusionMatrix(predictions, testData$diabetes)
📊 The Decision Tree classifier achieved an accuracy of 76% on the test set of the PimaIndiansDiabetes dataset! 🎯 The confusion matrix shows how the predictions break down by class, so you can see where the model is right and where it goes wrong.
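
To show how accuracy and related measures are derived from a confusion matrix, here is a small base-R sketch. The `actual` and `predicted` labels are hypothetical and chosen only to make the arithmetic easy to follow; they are not the model's real predictions.

```r
# Hypothetical labels, used only to illustrate the arithmetic
actual    <- factor(c("neg", "neg", "neg", "neg", "neg", "pos", "pos", "pos", "pos", "neg"),
                    levels = c("neg", "pos"))
predicted <- factor(c("neg", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos", "neg"),
                    levels = c("neg", "pos"))

cm <- table(Prediction = predicted, Reference = actual)
cm

accuracy    <- sum(diag(cm)) / sum(cm)               # share of all cases predicted correctly
sensitivity <- cm["pos", "pos"] / sum(cm[, "pos"])   # share of actual "pos" cases found
specificity <- cm["neg", "neg"] / sum(cm[, "neg"])   # share of actual "neg" cases found

c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity)
```

Note that caret's confusionMatrix() treats the first factor level as the positive class by default, which is "neg" for this dataset; passing positive = "pos" reports sensitivity and specificity with respect to the diabetic class instead.
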
See the "Accuracy Measures" post for more on how to interpret the result.