Explainable Fairness Evaluation for Text-to-Image Diffusion Models in Underrepresented Cultural Contexts

Authors

  • Haocheng Hao, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA.
  • Nitin J. Banerjee, School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA.
  • Arun Mehra, Department of Computer Science, University of New Hampshire, Durham, NH, USA.

Keywords

explainable artificial intelligence, fairness evaluation, text-to-image diffusion models, cultural bias, underrepresented communities, sociotechnical systems, model governance

Abstract

Text-to-image diffusion models have achieved remarkable progress in generating high-fidelity visuals from natural language prompts, yet they frequently perpetuate and amplify cultural biases that marginalize underrepresented communities. Traditional fairness evaluation methods, which rely on aggregate statistical metrics, often fail to capture the nuanced ways in which cultural context interacts with model representations, and they provide little insight into why certain groups are disadvantaged. This paper proposes a comprehensive framework for explainable fairness evaluation specifically designed for text-to-image diffusion models operating in underrepresented cultural contexts. The framework integrates explainable artificial intelligence techniques, including feature attribution, concept-based explanations, and counterfactual reasoning, to produce interpretable diagnostics that link model outputs to specific cultural signals in training data and model internals. We examine the structural trade-offs inherent in designing such an evaluation infrastructure, including the tension between explanatory completeness and computational tractability, the challenge of grounding fairness metrics in culturally situated knowledge, and the governance implications of using these tools across global deployment pipelines. Through a detailed system-level analysis, we consider how the architecture of diffusion models, from latent space representations to cross-attention mechanisms, shapes the propagation of cultural biases. A case illustration drawn from recent empirical findings on cultural gaps in text-to-image generation [17] highlights the need for evaluation methods that move beyond superficial demographic parity toward a richer understanding of sociotechnical alignment. The paper further discusses policy implications, the sustainability of fairness evaluation under data and resource constraints, and the role of community participation in defining fairness standards. We conclude by outlining future directions for building robust, transparent, and culturally aware fairness evaluation systems that can guide the responsible deployment of generative AI in diverse global settings.

References

1. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10684–10695).

2. Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610–623).

3. Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (pp. 77–91).

4. Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 618–626).

5. Saharia, C., Chan, W., Saxena, S., Li, L., Whang, J., Denton, E., ... & Norouzi, M. (2022). Photorealistic text-to-image diffusion models with deep language understanding. In Advances in Neural Information Processing Systems, 35, 36479–36494.

6. Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2017). Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 2979–2989).

7. Bolukbasi, T., Chang, K. W., Zou, J., Saligrama, V., & Kalai, A. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems, 29.

8. Danks, D., & London, A. J. (2017). Algorithmic bias in autonomous systems. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (pp. 4691–4697).

9. Raji, I. D., Gebru, T., Mitchell, M., Buolamwini, J., Lee, J., & Denton, E. (2020). Saving face: Investigating the ethical concerns of facial recognition auditing. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 145–151).

10. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, 30.

11. Kim, B., Wattenberg, M., Gilmer, J., Cai, C., Wexler, J., Viegas, F., ... & Sayres, R. (2018). Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). In International Conference on Machine Learning (pp. 2668–2677).

12. Goyal, Y., Wu, Z., Ernst, J., Batra, D., Parikh, D., & Lee, S. (2019). Counterfactual visual explanations. In International Conference on Machine Learning (pp. 2376–2384).

13. Hendricks, L. A., Akata, Z., Rohrbach, M., Donahue, J., Schiele, B., & Darrell, T. (2018). Generating visual explanations with a bag of concepts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 7382–7391).

14. Mittelstadt, B. D., Allo, P., Taddeo, M., Wachter, S., & Floridi, L. (2016). The ethics of algorithms: Mapping the debate. Big Data & Society, 3(2).

15. Holstein, K., Wortman Vaughan, J., Daumé III, H., Dudík, M., & Wallach, H. (2019). Improving fairness in machine learning systems: What do industry practitioners need? In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–16).

16. Bhardwaj, R., Majumder, N., & Poria, S. (2021). Cultural considerations in AI: A survey. arXiv preprint arXiv:2106.02691.

17. Shi, C., Li, S., Guo, S., Xie, S., Wu, W., Dou, J., ... & Chua, T. S. (2025). Where culture fades: Revealing the cultural gap in text-to-image generation. arXiv preprint arXiv:2511.17282.

18. Birhane, A. (2021). Algorithmic injustice: A relational ethics approach. Patterns, 2(2), 100205.

19. Beery, S., Horn, G. V., & Perona, P. (2020). Recognition in terra incognita. In Proceedings of the European Conference on Computer Vision (pp. 456–473).

20. Hovy, D., & Spruit, S. L. (2016). The social impact of artificial intelligence: A review. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (pp. 1–7).

21. Selbst, A. D., Boyd, D., Friedler, S. A., Venkatasubramanian, S., & Vertesi, J. (2019). Fairness and abstraction in sociotechnical systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 59–68).

22. Kaur, H., Nori, H., Jenkins, S., Caruana, R., Wallach, H., & Wortman Vaughan, J. (2020). Interpreting interpretability: Understanding data scientists’ use of interpretability tools. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–14).

23. Raji, I. D., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (pp. 429–435).

24. Suresh, H., & Guttag, J. (2021). A framework for understanding unintended consequences of machine learning. Communications of the ACM, 64(5), 68–76.

25. Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215.

Published

2026-05-22

How to Cite

Haocheng Hao, Nitin J. Banerjee, & Arun Mehra. (2026). Explainable Fairness Evaluation for Text-to-Image Diffusion Models in Underrepresented Cultural Contexts. Computational Intelligence Systems, 4(1). Retrieved from https://scivexus.org/index.php/CIS/article/view/309