Treasury Just Told Every Bank in America to Govern AI. The Infrastructure to Do It Doesn’t Exist.
The U.S. Department of the Treasury released the Financial Services AI Risk Management Framework and a companion AI Lexicon last week, the first 2 of 6 planned operational resources targeting banks, insurers, and fintechs. The release landed in the same news cycle as the Pentagon handing OpenAI a defense contract and banning every federal agency from using Anthropic. The spectacle absorbed the oxygen. The substance went to Treasury.
The framework adapts NIST’s broader AI Risk Management Framework into sector-specific guidance for financial institutions, accompanied by an AI Lexicon designed to standardize the vocabulary that regulators, compliance officers, technology providers, and bank examiners use when they discuss what AI systems are doing inside the institutions they oversee. The framework includes a questionnaire for assessing institutional AI adoption maturity and a matrix of 230 control objectives organized by lifecycle stage, so that a community bank in its first year of deploying AI and a global systemically important institution running hundreds of models can both locate themselves within the same governance architecture and understand what the federal government expects of them.
This is the most significant federal action on AI governance in financial services since the original NIST AI RMF shipped in January 2023, and its significance lies less in any individual control objective than in what the document reveals about the structural condition of the industry it addresses.
The Diagnosis Treasury Just Made Public
Treasury’s rationale is stated with unusual directness. Financial institutions increasingly rely on AI to support decision-making, customer engagement, and operational functions, and inconsistent terminology and uneven risk management practices have created challenges for effective governance and oversight. That sentence, buried in a press release, is a federal agency acknowledging that the financial sector’s AI governance problem is fundamentally epistemological: the barrier to governing these systems is the absence of shared meaning across the organizations deploying them. No amount of model capability addresses a condition in which the people responsible for oversight cannot agree on what the terms in their own policies refer to.
The AI Lexicon exists because Treasury determined that financial institutions, their regulators, and their technology vendors have been using the same words to describe different things, which means that every governance framework, every audit protocol, and every risk assessment built on those words inherits an ambiguity that compounds with every model deployed, every vendor onboarded, and every regulatory examination conducted under the assumption that shared vocabulary implies shared understanding.
The 230 control objectives exist because Treasury determined that the NIST AI RMF, while conceptually sound, operates at a level of abstraction that leaves a compliance officer at a regional bank without actionable guidance on what to actually do when the institution begins deploying AI into lending decisions, fraud detection, customer onboarding, or regulatory reporting. The FS AI RMF attempts to close that gap by translating principles into controls, which is exactly the kind of work that institutions need and exactly the kind of work that exposes a deeper structural problem the moment you ask how those controls get enforced.
The Gap Between Defining Controls and Enforcing Them
Every major financial institution today operates in a multi-model environment. They run systems built on OpenAI, Anthropic, Google, Meta, and an expanding array of open-weight alternatives, each with its own architecture, its own safety taxonomy, its own internal representation of what knowledge means and how confidently the system holds it. The bank sits in the middle of that architecture, responsible under Treasury’s new framework for 230 control objectives spanning the entire AI lifecycle, and possessing no infrastructure capable of verifying compliance across providers at the point of inference.
This condition is worth dwelling on because it reveals a category of problem that governance frameworks are structurally unable to resolve on their own. A framework can define what “explainability” requires. It cannot verify that the term means the same thing when applied to a transformer-based language model from one provider and a gradient-boosted decision tree from another and a proprietary ensemble method from a third, all operating simultaneously inside the same institution, all subject to the same regulatory expectations, all producing outputs that a bank examiner will eventually ask to see documented in a common audit format.
The mathematics for this kind of cross-model verification exists. Procrustes alignment, Centered Kernel Alignment, and fidelity gating are techniques used in transfer learning and representation-similarity research for measuring whether two representations of the same concept actually represent the same thing. They have been validated in academic settings and published in peer-reviewed literature. They have not been operationalized into production infrastructure that a regulated financial institution can deploy between its AI providers and its compliance architecture to produce the kind of continuous, mathematically grounded audit trail that 230 control objectives implicitly require.
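To make two of those techniques concrete, here is a minimal sketch in Python with NumPy. It assumes each model's representation of the same set of inputs has already been extracted into a matrix with one row per input, and that both matrices have the same width so an orthogonal map between them is defined. The function names and the synthetic data are illustrative assumptions, not part of Treasury's framework or any vendor's product.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear Centered Kernel Alignment between two (n_samples, dim) matrices."""
    x = x - x.mean(axis=0)  # center each feature column
    y = y - y.mean(axis=0)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return float(cross / (norm_x * norm_y))


def procrustes_residual(x: np.ndarray, y: np.ndarray) -> float:
    """Relative error left after the best orthogonal map from x onto y."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Orthogonal Procrustes: with x.T @ y = U S V^T, the minimizer is R = U V^T.
    u, _, vt = np.linalg.svd(x.T @ y)
    rotation = u @ vt
    return float(np.linalg.norm(x @ rotation - y, "fro") / np.linalg.norm(y, "fro"))


# Synthetic stand-ins for embeddings from three hypothetical providers.
rng = np.random.default_rng(0)
model_a = rng.normal(size=(200, 16))
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # a random orthogonal matrix
model_b = model_a @ q                            # same geometry, rotated axes
model_c = rng.normal(size=(200, 16))             # unrelated representation

print("CKA a vs b:", linear_cka(model_a, model_b))
print("CKA a vs c:", linear_cka(model_a, model_c))
print("Procrustes residual a vs b:", procrustes_residual(model_a, model_b))
print("Procrustes residual a vs c:", procrustes_residual(model_a, model_c))
```

A rotated copy of the same representation scores near 1 on CKA and near 0 on Procrustes residual, while an unrelated representation scores low on the first and high on the second. Real provider embeddings typically differ in width and would need a projection or a kernel variant first, which this sketch does not attempt.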
The industry that has emerged to address AI governance in financial services has oriented itself primarily around observability: monitoring dashboards that watch what models do after the fact and alert compliance teams when outputs drift beyond predefined thresholds. That approach has value, and it addresses a real need, but it operates in fundamentally the same mode as the quarterly model review it was designed to augment: retrospective, periodic, and incapable of preventing the governance failure it detects. What 230 control objectives across the full AI lifecycle actually demand is something categorically different: infrastructure that operates at the point of inference, verifies meaning preservation in real time, and produces audit artifacts as a byproduct of operation rather than as a product of after-the-fact reconstruction.
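The difference between watching after the fact and gating at the point of inference can be sketched in a few lines of Python. This is a hypothetical illustration only: the `gate` function, the `AuditRecord` fields, the control ID format, and the 0.9 threshold are all assumptions made for the example, and the fidelity score is taken as given (it could come from a measure such as CKA computed on a probe set).

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class AuditRecord:
    """One audit artifact, written whether or not the output is released."""
    control_id: str      # hypothetical identifier for a control objective
    provider: str
    fidelity: float
    threshold: float
    released: bool
    output_sha256: str   # hash of the output rather than the raw text
    checked_at: str


def gate(output_text: str, fidelity: float, *, control_id: str,
         provider: str, threshold: float = 0.9) -> Tuple[Optional[str], AuditRecord]:
    """Release the output only if fidelity meets the threshold; always log."""
    released = fidelity >= threshold
    record = AuditRecord(
        control_id=control_id,
        provider=provider,
        fidelity=fidelity,
        threshold=threshold,
        released=released,
        output_sha256=hashlib.sha256(output_text.encode("utf-8")).hexdigest(),
        checked_at=datetime.now(timezone.utc).isoformat(),
    )
    return (output_text if released else None), record


# The check runs before the output reaches a downstream system.
result, record = gate("approve", 0.95, control_id="CTRL-001", provider="provider-a")
blocked, blocked_record = gate("approve", 0.50, control_id="CTRL-001", provider="provider-b")
print(json.dumps(asdict(record), indent=2))
```

The design point is that the audit record is created on every call, so the trail exists as a byproduct of operation instead of being reconstructed during a later review.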
The Next 4 Resources Will Compound This
Treasury has announced that the AI Lexicon and FS AI RMF are the first 2 of 6 planned deliverables, with the remaining 4 addressing identity, fraud, explainability, and data practices. Each of those domains intensifies the multi-model governance problem in ways that are worth anticipating.
Identity verification across AI systems requires a mechanism for confirming that the entity a model identifies as “the same customer” in one context is the same entity another model identifies in a different context, across different architectures, different training distributions, and different representational schemas. Fraud detection that spans providers requires a shared basis for evaluating when anomalous behavior detected by one model corresponds to anomalous behavior detected by another, which is a question about representational alignment that monitoring dashboards are architecturally incapable of answering. Explainability guidance will need to reckon with the fact that what constitutes an explanation varies by model architecture, and that a bank examiner asking “why did the model make this decision” will expect an answer that holds across providers, which requires infrastructure capable of translating between explanatory frameworks. Data practice standards will need to address provenance across the entire inference chain, tracking how information transforms as it moves through multiple models, which requires the kind of continuous fidelity measurement that only a runtime layer positioned between providers and the enterprise can supply.
Each resource Treasury releases will make the case for this infrastructure more explicit, and each will arrive into an industry that has not yet built it.
What This Means for the Institutions That Have to Comply
The institutions that build or adopt cross-model governance infrastructure in the next 12 to 18 months will be the ones capable of demonstrating compliance when the full weight of Treasury’s 6-resource program comes into effect. The institutions that treat each resource as a standalone policy exercise, updating their governance documents and scheduling additional quarterly reviews, will find themselves in the position that every regulated industry eventually discovers when operational requirements outpace procedural responses: technically compliant on paper and structurally incapable of demonstrating it under examination.
Treasury did something genuinely important with this release. It moved AI governance in financial services from principles to operations, from aspirational statements about responsible AI to 230 specific control objectives that compliance teams can map against their actual deployments. The Acting Deputy Secretary said it directly: implementing the AI Action Plan requires practical resources that institutions can use, and establishing a common language for AI and a tailored framework for managing AI risks in financial services helps protect consumers while supporting responsible innovation.
That is the right ambition. The question that remains is whether the infrastructure necessary to fulfill it will be built by the institutions that need it, acquired from the companies building it, or simply assumed to exist by the regulators defining the requirements, which is how governance gaps become systemic risks, and how systemic risks become the kind of failures that retrospective analyses describe as having been obvious in hindsight.
Source: U.S. Department of the Treasury — Treasury Releases Two New Resources to Guide AI Use in the Financial Sector, February 20, 2026.
ARX is building the stateful runtime layer for enterprise AI — governance, institutional memory, and cognitive portability across providers, models, and regulatory jurisdictions. Learn more at arxqm.com.