The last decade has been marked by an ever-increasing need to better assess the impact of new AI technologies. This is particularly true this year, with recent developments in text-to-image models and large language models (e.g., OpenAI’s ChatGPT, Anthropic’s Claude) that can be considered general-purpose AI systems: systems that can be used for a broad set of use cases and have not been built with a specific purpose or user in mind. However, existing approaches for assessing the socio-technical implications of AI systems have limitations, especially in light of new legislation regulating AI and the need for legally enforceable norms. In response to this need, new frameworks are emerging around the idea that AI should not only be designed following codes of ethics, but also be “auditable.”
AI audits aspire to play a role similar to that of audits in the governance of the financial and other sectors. However, attempts to audit general-purpose AI raise significant issues that have not been addressed satisfactorily. First, in what sense should an audit of a general-purpose system differ from existing AI impact assessments? Can existing frameworks still be used or adapted? Second, AI auditing is an emerging set of practices and has yet to become a well-established field with clear and actionable norms. Third, how can we meaningfully audit a system with a vast domain of application and an almost infinite set of use cases?
This project will address these challenges through two streams of research activities and knowledge mobilization initiatives. Each stream will span one project year.
In Stream 1, we will assess current auditing practices for AI systems, determine current best practices and challenges, and identify the new challenges raised by general-purpose systems.