The number of Internet of Things (IoT) devices has increased considerably in the past few years, which has resulted in an exponential growth in cyber attacks on IoT infrastructure. As part of a defense-in-depth approach to network security, intrusion detection systems (IDS) have acquired a key role, as they attempt to detect malicious activities promptly and efficiently. This thesis investigates the use of ensemble learning and federated learning as methods for developing IDS in IoT environments. It offers three main contributions, which were evaluated on two open-source datasets, namely ToN IoT and CICIDS2017. The first contribution is a novel method based on a combination of ensemble models. The method uses ensemble stacking and boosting to detect anomalies in IoT traffic. Three machine learning models, namely kNN, Decision Tree and Logistic Regression, are used as the base learners for the stacking model, and an XGBoost model is used as the meta learner. Results show that the proposed model achieves high accuracy, precision, recall and F1-score on both datasets in binary and multi-class classification. The second contribution is another novel IDS approach, based on a stacking ensemble of deep learning (DL) models. This approach is named Deep Integrated Stacking for the IoT (DIS-IoT), as it combines four different DL models into a fully connected DL layer, creating a standalone ensemble stacking model. Results demonstrate that DIS-IoT achieves a high level of accuracy with a very low false positive rate (FPR) on both datasets, improving on standard standalone DL methods. Results from this set of experiments were also compared against results available in the literature that were obtained with similar approaches on the ToN IoT dataset. DIS-IoT performs comparably to these approaches in binary classification and outperforms them in multi-class classification.
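
The stacking arrangement of the first contribution (kNN, Decision Tree and Logistic Regression as base learners, with a boosted model as meta learner) can be illustrated with a minimal sketch. This is not the thesis implementation: the helper name `build_stacking_ids`, the hyperparameters and the synthetic data are assumptions made for illustration, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only one dependency. A scikit-learn-compatible XGBoost classifier could be passed as `final_estimator` instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_stacking_ids(random_state=0):
    """Hypothetical helper: stack three base learners under a boosted meta learner."""
    base_learners = [
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("dt", DecisionTreeClassifier(random_state=random_state)),
        ("lr", LogisticRegression(max_iter=1000)),
    ]
    # Stand-in for the XGBoost meta learner named in the abstract (assumption).
    meta_learner = GradientBoostingClassifier(random_state=random_state)
    # cv=5: the meta learner is trained on out-of-fold base-learner predictions.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=meta_learner,
        cv=5,
    )


# Synthetic binary data in place of ToN IoT / CICIDS2017 (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacking_ids()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Training the meta learner on cross-validated predictions rather than on in-sample predictions is a common way to keep it from simply trusting whichever base learner overfits the training set most.
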
The third contribution uses Federated Learning (FL) as a distributed alternative to a centralized intrusion detection model. The FL model is composed of four clients and one server. Data analysis was performed on the client side, with each client using its own portion of the dataset. No data was shared between participants, which preserves data privacy. The experiments demonstrated that a collaborative federated system using horizontal data partitioning and the FedAvg aggregation algorithm can achieve performance comparable to that of a centralized model, making it a viable option for an IoT IDS. Moreover, several other federated aggregation algorithms, namely FedAvgM, FedAdam and FedAdagrad, were evaluated in order to verify their efficacy in this setting. The experiments demonstrated that FedAvg and FedAvgM were the most efficient options in the given scenario. However, further research in alternative, larger settings is required to evaluate FedAdam and FedAdagrad more accurately.
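
The server-side aggregation step of FedAvg can be sketched as a sample-weighted average of the parameters reported by the clients. The function name `fedavg`, the two-parameter vectors and the client sample counts below are hypothetical values chosen for illustration; they are not taken from the thesis experiments.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must report the same number of parameters")
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical round: four clients, each reporting two model parameters
# trained on its own horizontal partition of the data.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
client_sizes = [100, 100, 200, 400]

global_weights = fedavg(client_weights, client_sizes)
print(global_weights)  # [5.25, 6.25]
```

Only parameter vectors and sample counts reach the server in this sketch; the traffic records themselves stay on each client, which is the property the abstract relies on for data privacy.
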
| Date of Award | 2023 |
|---|---|
| Original language | English |
| Awarding Institution | Glasgow Caledonian University |
| Supervisor | Huaglory Tianfield (Supervisor) & Vassilis Charissis (Supervisor) |
Stacking Ensemble and Federated Learning for IoT Intrusion Detection
Lazzarini, R. (Author). 2023
Student thesis: Doctoral Thesis › Doctor of Philosophy (PhD)