Abstract:
Federated learning (FL) is a privacy-preserving machine learning approach that enables models to be trained across multiple decentralized edge devices without exchanging raw data. However, local models trained only on local data often fail to generalize well to unseen samples. Moreover, for an end-to-end ML system at scale, it is not feasible to retrain from scratch whenever new data arrives. It is therefore essential to employ continual learning to update models on the fly instead of retraining them from scratch. Continual Federated Learning improves the efficiency, privacy, and scalability of federated learning systems by learning new tasks while preventing catastrophic forgetting of previous ones. The primary challenge of Continual Federated Learning is global catastrophic forgetting, where the accuracy of the global model declines on old tasks as it is trained on new ones. In this work, we propose a novel strategy, Bayesian Gradient Descent in Continual Federated Learning (CFL-BGD), to overcome catastrophic forgetting. We derive new local optimization problems based on Bayesian continual learning and FL principles. We conduct extensive experiments on Permuted MNIST and Split MNIST without task boundaries, demonstrating the effectiveness of our method in handling non-IID data distributions with varying levels of heterogeneity and in mitigating global catastrophic forgetting. Unlike continual learning methods such as EWC, which perform their core consolidation steps at task boundaries, our approach requires no knowledge of task boundaries, making it more versatile and practical. The results show that our method significantly improves the performance and robustness of the global model across tasks, highlighting the potential of our strategy in real-world federated learning applications.
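For orientation only, the sequential Bayesian update underlying Bayesian continual learning (and hence BGD) can be sketched as below; this is the generic principle rather than the exact local objective derived in this work, and the symbols $\theta$, $D_t$, and $D_{1:t}$ are illustrative.
\[
p(\theta \mid D_{1:t}) \;\propto\; p(D_t \mid \theta)\, p(\theta \mid D_{1:t-1}),
\]
where the posterior over model parameters $\theta$ after the data observed so far serves as the prior when the next batch $D_t$ arrives; because the update depends only on incoming data, no knowledge of task boundaries is required.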