Abstract:
An immense amount of data is generated daily by modern technologies in autonomous vehicles, IoT, smart grids, etc. Unfortunately, this data, generated at the edge, often cannot be used to train machine learning models in the traditional way: the data must first be transferred to and stored on a central server, which raises privacy concerns and incurs high computational costs. To overcome the limitations of this centralized approach, a decentralized technique called Federated Learning has gained popularity. Federated Learning allows multiple clients in a network to collaboratively learn a global machine learning model: the model is passed to the edge devices, each of which trains it locally, and privacy is maintained because the data never leaves the device. The availability of annotated data is one of the challenges of supervised federated learning. Moreover, a single global model often struggles to perform well for all clients in the network when the data across clients is heterogeneous. In this thesis, a novel Personalized unsupervised Federated AutoEncoder, pFedAE, is proposed, with the main motivation of learning both local and global latent space representations for all the clients in the network. The optimisation framework of the autoencoder is divided into two parts: a global optimisation and a per-client local optimisation. We adopt two evaluation strategies to assess the latent space representation at both the global and the local level. We demonstrate that pFedAE outperforms the other baselines under both evaluation strategies. Most importantly, pFedAE leads to faster convergence, scales to different numbers of clients, remains effective under varying data distributions across clients, and is robust to different numbers of local epochs. Using t-SNE projections and angle histogram plots, the latent spaces learned by pFedAE and the baselines are also compared in the later part of the thesis.
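
The abstract does not specify how the global and per-client local optimisation are combined. The following is only a minimal illustrative sketch of one common way to realise such a split in a personalized federated autoencoder, assuming a shared (globally averaged) encoder and decoders that stay local to each client; the specific architecture, averaging rule, and hyperparameters are assumptions for illustration, not the thesis's actual pFedAE algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

D, H = 8, 3                      # input and latent dimensions (assumed)
NUM_CLIENTS = 4
LR, LOCAL_EPOCHS, ROUNDS = 0.05, 5, 20

# Synthetic, heterogeneous client data: each client gets its own mean shift.
clients = [rng.normal(loc=c, scale=1.0, size=(64, D)) for c in range(NUM_CLIENTS)]

# Assumed split: one shared (global) encoder, one personalized decoder per client.
W_enc = rng.normal(scale=0.1, size=(D, H))
W_decs = [rng.normal(scale=0.1, size=(H, D)) for _ in range(NUM_CLIENTS)]

def local_update(X, W_e, W_d):
    """A few epochs of gradient descent on the reconstruction MSE for one client."""
    n = X.shape[0]
    for _ in range(LOCAL_EPOCHS):
        Z = X @ W_e                       # latent codes
        err = Z @ W_d - X                 # reconstruction error, shape (n, D)
        grad_d = Z.T @ err / n            # gradient w.r.t. the local decoder
        grad_e = X.T @ (err @ W_d.T) / n  # gradient w.r.t. the shared encoder
        W_d -= LR * grad_d
        W_e -= LR * grad_e
    return W_e, W_d

for _ in range(ROUNDS):
    enc_updates = []
    for c, X in enumerate(clients):
        W_e_c, W_decs[c] = local_update(X, W_enc.copy(), W_decs[c])
        enc_updates.append(W_e_c)
    # Global step: FedAvg-style averaging of the shared encoder only;
    # the personalized decoders never leave their clients.
    W_enc = np.mean(enc_updates, axis=0)

losses = [np.mean(((X @ W_enc) @ W_decs[c] - X) ** 2) for c, X in enumerate(clients)]
print("per-client reconstruction MSE:", np.round(losses, 3))
```

In this sketch, the averaged encoder plays the role of the global latent representation, while the decoders kept on each client provide the per-client (local) part; any resemblance to the actual pFedAE update rule is an assumption.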