machine learning · 2024
A deep learning project using convolutional neural networks (CNNs) to classify images of flowers from the Oxford 102 Category Flower Dataset, achieving 75.12% test accuracy.

This project develops and evaluates a deep convolutional neural network (CNN) for automated classification of flower species from images using the Oxford 102 Category Flower Dataset [3]. The network was implemented in PyTorch 2.2.2, trained on HPC resources (Viking cluster with H100 GPU), and achieved a final accuracy of 75.12% on the test set after extensive hyperparameter optimization.
Image classification is a fundamental task in computer vision, with applications ranging from medical image analysis [1] to botanical research [3]. Convolutional Neural Networks (CNNs) have been the dominant approach to image classification since their breakthrough in 2012 [2].
The task of flower species classification presents unique challenges compared to simpler classification problems such as license plate recognition or manufacturing defect detection. The primary difficulties include:
Automated flower classification has significant practical applications:
Dataset: Oxford 102 Category Flower Dataset
The Oxford 102 Category Flower Dataset contains images of flowers commonly occurring in the United Kingdom, with each class consisting of between 40 and 258 images. The dataset uses predefined splits: 1,020 training images, 1,020 validation images, and 6,149 test images.
Each image was preprocessed by resizing to 224 × 224 pixels to match standard CNN input dimensions.
To address the limited training data and class imbalance issues [13], extensive data augmentation was implemented:

    from torchvision import transforms

    # Training-time augmentation pipeline; the final crop yields 224 x 224 inputs.
    training_transform = transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.RandomRotation(30),
        transforms.RandomHorizontalFlip(),
        transforms.RandomVerticalFlip(),
        transforms.ColorJitter(brightness=0.2, contrast=0.1, saturation=0.1, hue=0.1),
        transforms.RandomAffine(degrees=20, translate=(0.1, 0.1), scale=(0.8, 1.2)),
        transforms.RandomResizedCrop(224),
        transforms.ToTensor(),
        # ImageNet channel means and standard deviations
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

This augmentation strategy effectively doubled the training dataset size from 1,020 to 2,040 images, improving model generalization [12].
The CNN architecture draws inspiration from VGG [8] and ResNet [9] designs, incorporating modern deep learning principles [6]:

Convolutional Stack (9 layers):
Fully Connected Classifier:
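The model optimization employs cross-entropy loss [10], which for a single training example takes the standard form:

\[ \mathcal{L} = -\sum_{c=1}^{C} y_c \log \hat{y}_c \]

where \(C = 102\) is the number of classes, \(y_c\) is 1 if the example belongs to class \(c\) and 0 otherwise, and \(\hat{y}_c\) is the predicted (softmax) probability for class \(c\).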
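To make the cross-entropy loss concrete, here is a minimal pure-Python sketch. It is an illustration rather than the project's training code; the helper names `softmax` and `cross_entropy` and the toy scores are assumptions introduced for this example.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    probs = softmax(logits)
    return -math.log(probs[target])

# Hypothetical scores for a 3-class toy problem (the real model has 102 outputs).
example_logits = [2.0, 0.5, -1.0]
loss_correct = cross_entropy(example_logits, target=0)  # highest score is the true class
loss_wrong = cross_entropy(example_logits, target=2)    # lowest score is the true class

# A model that spreads probability uniformly over 102 classes has loss ln(102).
uniform_loss = cross_entropy([0.0] * 102, target=5)
```

The uniform case gives a useful reference point: a loss near ln(102) ≈ 4.62 at the start of training indicates predictions no better than chance. In PyTorch, `torch.nn.CrossEntropyLoss` applies the log-softmax internally, so it expects raw logits rather than probabilities.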
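The Adam optimizer [11] was selected for parameter updates. In its standard form, the update at step \(t\) is:

\[ m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \]

\[ \hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}, \qquad \theta_t = \theta_{t-1} - \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} \]

where \(g_t\) is the gradient of the loss with respect to the parameters \(\theta\), \(m_t\) and \(v_t\) are running estimates of the first and second moments of the gradient, \(\beta_1\) and \(\beta_2\) are their decay rates, \(\alpha\) is the learning rate, and \(\epsilon\) is a small constant for numerical stability.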
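The Adam update for a single scalar parameter can be sketched as follows. The function name `adam_step` is hypothetical, and the values of `beta1`, `beta2`, and `eps` are the commonly used defaults, assumed here because the write-up does not state them; only the learning rate of 0.0001 comes from the hyperparameter table.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
theta_after_one, m, v = adam_step(theta, grad=2.0 * theta, m=m, v=v, t=1)

# Continue for more steps to confirm the parameter keeps moving towards 0.
theta_run = theta_after_one
for step in range(2, 101):
    theta_run, m, v = adam_step(theta_run, grad=2.0 * theta_run, m=m, v=v, t=step)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude. This is one reason Adam tends to be less sensitive to gradient scale than plain SGD.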
Hyperparameter Configuration:
| Parameter | Tested Values | Selected Value |
|---|---|---|
| Learning Rate | 0.001, 0.0001, 0.00001 | 0.0001 |
| Batch Size | 8, 16, 32, 64 | 8 (train), 64 (val/test) |
| Epochs | 1 to 1200 | 1000 |
| Image Size | Various | 224 × 224 |
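One way to organise the search space in the table is to enumerate candidate configurations explicitly. The sketch below is an assumption about how such a search could be laid out; the write-up does not say whether a full grid was run, and the dictionary keys are hypothetical names.

```python
from itertools import product

# Values tested according to the hyperparameter table.
learning_rates = [1e-3, 1e-4, 1e-5]
train_batch_sizes = [8, 16, 32, 64]

# Every learning-rate / batch-size combination (a full grid is an assumption).
candidate_configs = [
    {"lr": lr, "batch_size": bs}
    for lr, bs in product(learning_rates, train_batch_sizes)
]

# The configuration reported as selected.
selected_config = {
    "lr": 1e-4,
    "batch_size": 8,         # training batches
    "eval_batch_size": 64,   # validation and test batches
    "epochs": 1000,
    "image_size": (224, 224),
}
```

A full grid over these two parameters has 12 combinations, which matters when each run lasts up to 1000 epochs. Varying one parameter at a time is a cheaper alternative.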
Learning Rate Scheduling:
The learning rate was decayed in steps, with a step size of 500 epochs.
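A step schedule of this kind can be sketched in a few lines. The step size of 500 epochs comes from the text, while the decay factor `gamma=0.1` is purely an assumed placeholder because the original value is not given. The function name `step_decay_lr` is also hypothetical.

```python
def step_decay_lr(initial_lr, epoch, step_size=500, gamma=0.1):
    """Learning rate at a 0-based epoch: multiplied by gamma every step_size epochs.

    gamma=0.1 is an assumed placeholder, not a value taken from the project.
    """
    return initial_lr * gamma ** (epoch // step_size)

initial_lr = 1e-4  # selected learning rate from the hyperparameter table
lr_first_epoch = step_decay_lr(initial_lr, epoch=0)
lr_before_decay = step_decay_lr(initial_lr, epoch=499)
lr_after_decay = step_decay_lr(initial_lr, epoch=500)
```

With 1000 training epochs and a step size of 500, the rate is reduced exactly once, at the halfway point. In PyTorch, the same behaviour is available through `torch.optim.lr_scheduler.StepLR(optimizer, step_size=500, gamma=...)`.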
The training process involved:
Hardware & Software:
The final trained model achieved a test accuracy of 75.12% across all 102 flower categories, a significant improvement over the initial baseline accuracy of 4%.

A detailed examination of the confusion matrix reveals that the model's performance varies significantly across different flower species, largely due to visual similarities and dataset imbalances.
Notably, flowers with similar appearances—such as certain lilies, orchids, and daisies—were often confused with one another, resulting in lower per-class accuracy. This trend highlights the limitations of the model when distinguishing between species with overlapping visual features, particularly in the presence of complex backgrounds or occlusions. The confusion matrix further indicates that classes with fewer training samples are more susceptible to misclassification, underscoring the importance of balanced datasets and targeted data augmentation for underrepresented categories.
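The per-class figures discussed above can be derived directly from a confusion matrix. The following sketch uses a small made-up 3-class matrix rather than the project's actual results, and the helper names are hypothetical.

```python
def per_class_accuracy(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    accuracies = []
    for i, row in enumerate(confusion):
        total = sum(row)
        # Classes with no samples get NaN instead of a division by zero.
        accuracies.append(row[i] / total if total else float("nan"))
    return accuracies

def most_confused_pairs(confusion, top_k=3):
    """Return (true_class, predicted_class, count) for the largest off-diagonal cells."""
    pairs = [
        (count, i, j)
        for i, row in enumerate(confusion)
        for j, count in enumerate(row)
        if i != j and count > 0
    ]
    pairs.sort(reverse=True)
    return [(i, j, count) for count, i, j in pairs[:top_k]]

# Toy matrix for illustration only; rows are true classes, columns are predictions.
toy_confusion = [
    [8, 2, 0],
    [3, 5, 2],
    [0, 0, 10],
]
class_accuracy = per_class_accuracy(toy_confusion)
worst_pairs = most_confused_pairs(toy_confusion, top_k=2)
```

Sorting the off-diagonal cells in this way highlights which pairs of species are most often mistaken for each other, which helps decide where targeted augmentation or extra data would matter most.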
Overall, while the model demonstrates strong performance on visually distinctive and well-represented classes, it struggles with species that are either underrepresented or visually similar to others. Future work could address these challenges by incorporating advanced augmentation techniques, leveraging transfer learning from larger botanical datasets, or integrating attention mechanisms to help the model focus on subtle discriminative features.
This project successfully demonstrated the effectiveness of deep CNNs for automated flower species classification, achieving 75.12% accuracy on the challenging Oxford 102 dataset. The performance is particularly noteworthy given the small dataset size and high inter-class similarity.
This research was conducted using the Viking High-Performance Computing cluster provided by the University of York. I also thank the IT Services and Research IT team for computational support and infrastructure maintenance.