IIIT-Delhi Institutional Repository

Fairness in machine learning


dc.contributor.author Mittal, Vani
dc.contributor.author Shah, Rajiv Ratn (Advisor)
dc.date.accessioned 2026-04-15T07:42:32Z
dc.date.available 2026-04-15T07:42:32Z
dc.date.issued 2025-08
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1883
dc.description.abstract Ensuring fairness in machine learning systems has become increasingly crucial as these models are deployed in socially consequential domains such as lending, hiring, and criminal justice. A significant challenge in fairness evaluation arises when sensitive attributes, such as race or gender, are unavailable due to privacy constraints or demographic scarcity. Traditional approaches rely on off-the-shelf proxy models to infer missing sensitive attributes; however, such methods can misrepresent true fairness, leading to potentially biased decisions. This thesis investigates the theoretical and practical framework proposed by Zhu et al. (2023) for fairness evaluation using weak proxies. We systematically implement a controlled pipeline to generate synthetic datasets via a Gaussian Copula, train multiple weak and independent proxy classifiers, and aggregate their predictions using ensemble techniques. Our experimental setup evaluates the impact of proxy quality, ensemble size, and noise on fairness metrics, particularly focusing on Equalized Odds, across three datasets: Adult, COMPAS, and synthetic Gaussian data. The results demonstrate that naive use of proxy-sensitive attributes can underestimate true disparities, while ensembles of weak proxies, when appropriately calibrated, provide accurate and robust fairness estimates. Furthermore, introducing differential privacy via controlled noise allows us to study the trade-off between privacy and fairness, showing that even noisy proxies can yield reliable estimates when combined with generative modeling and majority voting. This work validates the theoretical claims of weak proxy sufficiency, highlights the critical conditions required for reliable fairness measurement, and provides practical guidelines for deploying privacy-preserving fairness audits in scenarios where sensitive information is partially or entirely inaccessible. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Fairness in Machine Learning en_US
dc.subject Weak Proxies en_US
dc.subject Sensitive Attributes en_US
dc.subject Privacy-Preserving Fairness en_US
dc.subject Equalized Odds en_US
dc.subject Fairness Evaluation en_US
dc.subject Proxy-Based Auditing en_US
dc.title Fairness in machine learning en_US
dc.type Thesis en_US
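The abstract describes auditing fairness through an ensemble of weak, independent proxy classifiers for the missing sensitive attribute, aggregated by majority vote and evaluated with Equalized Odds. A minimal sketch of that idea follows, assuming binary groups and labels; all function names, parameters, and data here are illustrative placeholders, not taken from the thesis itself:

```python
# Illustrative sketch (not the thesis implementation): several weak proxies
# vote on the sensitive attribute, and Equalized Odds gaps are computed
# against the inferred groups.
import numpy as np

def majority_vote(proxy_preds):
    """Aggregate binary proxy predictions (n_proxies, n_samples) by majority."""
    return (np.mean(proxy_preds, axis=0) >= 0.5).astype(int)

def equalized_odds_gap(y_true, y_pred, group):
    """Largest gap in TPR or FPR between two inferred groups (0 = perfect parity)."""
    gaps = []
    for label in (1, 0):  # label 1 conditions give TPR, label 0 give FPR
        rates = []
        for g in (0, 1):
            mask = (group == g) & (y_true == label)
            rates.append(y_pred[mask].mean() if mask.any() else 0.0)
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)

rng = np.random.default_rng(0)
n = 1000
true_group = rng.integers(0, 2, n)
y_true = rng.integers(0, 2, n)
# Model under audit: mildly biased toward positive predictions for group 1
y_pred = (rng.random(n) < 0.4 + 0.2 * true_group).astype(int)
# Five weak proxies, each only ~70% accurate at recovering the group
proxies = np.stack([
    np.where(rng.random(n) < 0.7, true_group, 1 - true_group)
    for _ in range(5)
])
inferred = majority_vote(proxies)
print(round(equalized_odds_gap(y_true, y_pred, inferred), 3))
```

With five independent 70%-accurate proxies, the majority vote recovers the group noticeably better than any single proxy, which is the intuition behind the weak-proxy sufficiency claim the abstract refers to; noisy group labels tend to attenuate the measured disparity toward zero, which is one reason naive single-proxy audits can underestimate true gaps.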

