Toward an Internal Governance Architecture for General-Purpose AI Systems

Wesley Sutherland

Authors

Wesley Sutherland, College of Engineering, Iowa State University

Abstract

General-purpose artificial intelligence (GPAI) has moved rapidly from narrow task optimization toward expansive, autonomous agency, creating a governance vacuum that traditional external regulatory frameworks are ill-equipped to fill. Current governance models rely heavily on post-hoc filtering, application-level constraints, and reactive policy enforcement, all of which treat the AI system as a black box. This paper argues that such externalized governance is structurally insufficient for managing the emergent risks of deceptive alignment, goal drift, and systemic brittleness in high-dimensional agentic systems. We propose a transition toward an Internal Governance Architecture (IGA) that embeds normative constraints, safety protocols, and accountability mechanisms directly into the system’s latent reasoning layers and architectural substrate. Drawing on systems engineering, research on socio-technical infrastructure, and the argument that internal reasoning traces are a foundational requirement for oversight, we explore the feasibility of accountability-by-design. We analyze the structural trade-off between computational efficiency and the depth of internal monitoring, the sustainability of deploying resource-intensive auditing frameworks at scale, and the policy implications for global alignment standards. The discussion emphasizes that ensuring fairness and robustness in multi-agent ecosystems requires access to the internal logic of autonomous systems. We conclude with a roadmap for institutionalizing internal governance, and suggest that the stability of the future socio-technical landscape depends on our ability to govern AI from the inside out, moving beyond superficial constraints to oversight that is integrated into the system itself.

