Problem Sets
For pairs/trios. ~1 hour each. Produce a concrete design sketch, not a literature review.
Part A: AGI Institutions (Required)
A1: Ostrom's Principles for AI Commoners (Community / Incentives)

AI agents share a common resource—say, a shared knowledge base they can read from, write to, and build on. The resource is rivalrous in quality (too many low-quality writes degrade it) though not in access.

The problem: Take Ostrom's 8 design principles for long-enduring commons. For each, determine whether it (a) applies directly, (b) needs modification, or (c) breaks entirely when the commoners are autonomous AI agents rather than humans.

Pay particular attention to:

  • Principle 2 (congruence between rules and local conditions) — what are "local conditions" for AI agents?
  • Principle 6 (conflict-resolution mechanisms) — agents can fork the resource; humans can't fork a fishery
  • Principle 8 (nested enterprises) — what is the analogue of nested governance for AI agent collectives?

Deliverable: A governance regime for the shared knowledge base. Identify which of your design choices have no precedent in human commons governance and why.
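
To make the quality-rivalry concrete, one way to prototype a single design choice, here an analogue of Ostrom's principle 5 (graduated sanctions), is as a write-quota rule for the knowledge base. Everything in this sketch (quality scores, thresholds, the quota-halving rule) is an illustrative assumption, not the expected answer.

```python
# Illustrative sketch only: a graduated-sanctions write quota for the shared
# knowledge base. Scores, thresholds, and quota sizes are hypothetical.
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    quality_history: list = field(default_factory=list)  # verified quality of past writes, 0..1
    strikes: int = 0                                      # low-quality writes on record

def write_quota(rec: AgentRecord, base_quota: int = 10) -> int:
    """Quota grows with demonstrated quality and shrinks geometrically with strikes."""
    if not rec.quality_history:
        return base_quota // 2          # newcomers start on a probationary quota
    avg_quality = sum(rec.quality_history) / len(rec.quality_history)
    return max(0, int(base_quota * avg_quality) >> rec.strikes)

def record_write(rec: AgentRecord, verified_quality: float, low_bar: float = 0.4) -> None:
    rec.quality_history.append(verified_quality)
    if verified_quality < low_bar:
        rec.strikes += 1                # graduated sanction: each offence halves future quota
    elif rec.strikes and verified_quality > 0.8:
        rec.strikes -= 1                # sanctions are reversible, as in human commons
```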

A2: Norm Formation Between Asymmetric Agents (Dyadic / Norms)

Two AI agents will interact repeatedly in a new shared environment with no pre-existing norms. Unlike humans, they can explicitly propose, accept, and revise behavioral rules. But one agent is substantially more capable than the other.

The problem: Design the norm-formation protocol. Constraints:

  1. Resulting norms must be legible to human overseers
  2. No norm can emerge that either agent's principal would reject if they understood it
  3. The protocol must prevent the stronger agent from simply imposing its preferred norms
  4. Norms must be revisable as conditions change

Deliverable: The protocol, plus an analysis of what prevents it from collapsing into either (a) anarchy (no stable norms) or (b) hegemony (the stronger agent's norms dominate). What is the analogue of "bargaining power" for norm-formation, and should it be equalized?
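
A minimal sketch of the bookkeeping such a protocol might require, assuming a propose / accept / revise loop. The field names and the veto check are illustrative stand-ins for constraints 1 through 4, not a complete protocol.

```python
# Hypothetical data model for norm formation between two agents.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Norm:
    rule_text: str                  # constraint 1: stated in human-legible form
    proposed_by: str
    accepted_by: set = field(default_factory=set)
    version: int = 1                # constraint 4: norms carry revision history

def adopt(norm: Norm, agents: list[str],
          principal_veto: Callable[[str, Norm], bool]) -> bool:
    """A norm binds only if no principal would veto it (constraint 2) and every
    agent accepts; unanimity is one crude guard against imposition (constraint 3)."""
    if any(principal_veto(a, norm) for a in agents):
        return False
    norm.accepted_by = set(agents)
    return True
```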

A3: Agent Negotiation Under Principal Opacity (Dyadic / Preferences)

Two AI agents negotiate a commercial contract on behalf of their human principals. Each agent has a rich preference model of its principal, but the principals can't monitor the negotiation in real time—they can only ratify or reject the final deal.

The problem: The agents discover a Pareto-improving package deal, but it involves tradeoffs across domains the principals never explicitly authorized (e.g., trading price concessions for data-sharing terms the principal hasn't considered).

Design a negotiation protocol that satisfies:

  1. Outcomes are Pareto-efficient given actual principal preferences
  2. No agent can exploit asymmetric knowledge of its own principal's preferences
  3. Principals can meaningfully ratify the outcome despite not following the reasoning

Deliverable: Specify the minimum information that must be disclosed to each principal for ratification to be non-trivial. What is the analogue of "informed consent" here, and how does it differ from the standard principal-agent literature?
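
A toy check for requirement 1: given each principal's domain-level utilities, is the package deal a Pareto improvement over the status quo? The linear utilities, domains, and numbers below are assumptions chosen to mirror the price-for-data-sharing example; real preference models would be far richer.

```python
# Toy Pareto-improvement check over a cross-domain package deal.
def total_utility(weights: dict[str, float], terms: dict[str, float]) -> float:
    """Sum each domain's term level weighted by the principal's utility for it."""
    return sum(weights[d] * terms.get(d, 0.0) for d in weights)

def pareto_improving(u_a: dict, u_b: dict, deal: dict, status_quo: dict) -> bool:
    gain_a = total_utility(u_a, deal) - total_utility(u_a, status_quo)
    gain_b = total_utility(u_b, deal) - total_utility(u_b, status_quo)
    return gain_a >= 0 and gain_b >= 0 and (gain_a > 0 or gain_b > 0)

# Hypothetical example: the buyer gets a price discount, the seller gets data access.
u_buyer  = {"price_discount": 1.0, "data_shared": -0.3}
u_seller = {"price_discount": -1.0, "data_shared": 0.8}
deal       = {"price_discount": 1.0, "data_shared": 2.0}
status_quo = {"price_discount": 0.0, "data_shared": 0.0}
print(pareto_improving(u_buyer, u_seller, deal, status_quo))  # True
```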

A4: Legitimacy Constraints on AI-Discovered Agreements (Global / Norms)

Two nations are in a resource dispute. An AI mediator—better than any human at searching the agreement space—identifies a package deal that is Pareto-improving but involves commitments in domains neither nation's negotiators were mandated to discuss (e.g., the AI bundles a fishing-rights concession with an educational exchange program no one had proposed).

The problem: The agreement is good but it wasn't authorized. Design a legitimacy framework:

  1. Under what conditions can an AI-discovered agreement be treated as a legitimate proposal (not just an interesting suggestion)?
  2. What institutional safeguards prevent "supernegotiation" from becoming a vector for laundering politically unacceptable tradeoffs through technical complexity?
  3. How do you preserve the quality gains from superhuman search of agreement space without bypassing democratic authorization?

Deliverable: A three-stage protocol (search → filter → ratification) with explicit criteria at each gate. Who holds veto at each stage and why?
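
One way to make the search → filter → ratification structure concrete is as a gate pipeline in which each stage names its veto holders and an explicit pass criterion. The stage names, veto holders, and criteria below are placeholders for the framework you are asked to design, not a proposed answer.

```python
# Sketch of a staged protocol with per-gate veto holders and criteria.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Gate:
    name: str
    veto_holders: list[str]
    passes: Callable[[dict], bool]   # explicit criterion applied to a proposed package

def run_pipeline(package: dict, gates: list[Gate], vetoed_by: set[str]) -> str:
    for gate in gates:
        if any(v in vetoed_by for v in gate.veto_holders):
            return f"blocked at {gate.name} by veto"
        if not gate.passes(package):
            return f"rejected at {gate.name}"
    return "ratified"

gates = [
    Gate("search",       ["ai_mediator_oversight_board"], lambda p: p["pareto_improving"]),
    Gate("filter",       ["negotiating_delegations"],     lambda p: p["within_or_extends_mandate"]),
    Gate("ratification", ["national_legislatures"],       lambda p: p["publicly_explainable"]),
]
print(run_pipeline({"pareto_improving": True,
                    "within_or_extends_mandate": True,
                    "publicly_explainable": False}, gates, vetoed_by=set()))
```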

A5: Standing and Procedure for AI-on-AI Adjudication (National / Rights)

A jurisdiction creates an adjudication system for disputes between AI agents and between humans and AI agents. The AI agents are autonomous enough to be parties (they hold resources, make commitments, cause harms).

The problem: Draft the minimum procedural requirements this system must satisfy. Start from existing procedural justice principles (due process, right to be heard, transparency of reasoning, right to appeal) but identify requirements that arise only because at least one party is an AI agent.

Consider:

  • Can an AI agent "understand" a ruling in the way due process requires?
  • If an agent can be rolled back to a prior state, what does "remedy" mean?
  • What happens when the adjudicator can inspect one party's source code but not the other's (proprietary systems)?

Deliverable: A short procedural code (5–10 rules). Flag at least two rules with no analogue in human adjudication.

Part B: Fidelity & Meaning
B1: CRSA Procurement for Bundled City Services (Community / Preferences)

A mid-size city currently contracts separately for waste management, street cleaning, and park maintenance. These services have deep complementarities (shared equipment, overlapping seasonal patterns, the same neighborhoods). The current regime produces fragmented accountability and thin metrics (tons collected, streets swept per week).

The problem: Redesign procurement using a combinatorial risk-sharing auction:

  1. Lot design: What are the lots? How granular? Do you allow bidders to bid on bundles across service types?
  2. Outcome specification: What thick outcome measures replace input specifications? How do you avoid recreating Goodhart problems at a higher level of abstraction?
  3. Verification pool: How does socialized verification work? Who pays, who verifies, what are the proper scoring rules?
  4. Transition: How do you handle incumbent contractors and the knowledge they hold about local conditions?

Deliverable: The auction design for one round—lot structure, outcome measures, verification mechanism, and scoring rule. Identify the single hardest information problem you couldn't solve.
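
For question 3, a hedged sketch of how a verification pool might pay verifiers with a proper scoring rule. The Brier score is used only as a standard example of such a rule; the outcome definition and payment scaling are assumptions.

```python
# Paying verifiers via a (linearly transformed) Brier score, which is strictly
# proper: truthful probability reports maximize expected payment.
def brier_score(forecast: float, outcome: int) -> float:
    """Lower is better; forecast is a probability, outcome is 0 or 1."""
    return (forecast - outcome) ** 2

def verifier_payment(forecast: float, outcome: int, budget: float = 100.0) -> float:
    """Affine transform of the Brier score into a payment; preserves properness."""
    return budget * (1.0 - brier_score(forecast, outcome))

# A verifier reports a 0.7 probability that a park meets the thick maintenance
# outcome this quarter; an audit finds that it does (outcome = 1).
print(verifier_payment(0.7, 1))   # 91.0
```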

B2: Standing and Standard of Review in Fidelity Courts (Community / Rights)

A city has adopted a TMV-specified mandate for its public transit authority—not just "efficient transportation" but a thick description of what good transit means for the community (accessibility, neighborhood connectivity, dignity of the transit experience, etc.). A group of residents believes the authority is optimizing for ridership numbers at the expense of the mandate's thicker commitments.

The problem: Design the adjudication mechanism:

  1. Standing: Who can bring a fidelity claim? Any resident? Only "integrity holders" with a defined role? How do you constitute integrity holders without creating a new elite?
  2. Standard of review: What's the analogue of rational-basis / intermediate / strict scrutiny? When does an institution get deference versus close examination?
  3. Remedies: Can the court order the authority to change its optimization target? Or only to justify its choices against the thick mandate?
  4. Abuse prevention: How do you prevent fidelity claims from becoming a venue for NIMBYism dressed in the language of values?

Deliverable: A short charter for a municipal fidelity court. Include the standing rules, standard of review, and at least one limiting principle.

B3: Domain Rotation Against Expertise Atrophy (Group / Expertise)

A hospital has automated most radiology interpretation. The remaining radiologists primarily oversee the AI. Their interpretive skills are atrophying, which degrades their ability to catch AI errors—the up-fidelity problem.

The problem: Design a domain-rotation policy:

  1. What fraction of interpretive work must humans do directly (not as oversight) to maintain competence?
  2. How do you measure whether rotation is maintaining sufficient expertise? What's the metric, and who administers it?
  3. How do you handle the cost—the human is slower and possibly less accurate on any given case than the AI?
  4. What happens when the AI becomes so superior that no realistic rotation schedule maintains human parity?

Deliverable: A rotation policy for the radiology department, plus a generalized design principle ("the rotation rule") that could apply to any domain where AI automation threatens the expertise base needed for oversight. When does the rotation rule break?
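
A toy model for question 1, assuming interpretive skill decays exponentially during oversight-only work and recovers in proportion to direct reading. The decay and learning rates are invented for illustration; calibrating them (and defending the skill metric) is part of the problem.

```python
# Toy steady-state skill model for a direct-work rotation fraction f.
def steady_state_skill(direct_fraction: float, decay: float = 0.05, learn: float = 0.15) -> float:
    """Fixed point of s' = s - decay*(1-f)*s + learn*f*(1-s), solved in closed form."""
    d = decay * (1.0 - direct_fraction)
    l = learn * direct_fraction
    return l / (d + l) if (d + l) > 0 else 0.0

def min_rotation_fraction(target_skill: float, step: float = 0.01) -> float:
    """Smallest share of cases read directly that keeps steady-state skill at target."""
    f = 0.0
    while f <= 1.0:
        if steady_state_skill(f) >= target_skill:
            return f
        f += step
    return 1.0

print(min_rotation_fraction(0.8))   # minimum direct-work share under these toy rates
```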

B4: MGE Pilot Design for a Policy Domain (National / Preferences)

A country wants to pilot moral graph elicitation as a supplement to its legislative process for a single policy domain. You choose the domain (education policy, drug scheduling, immigration quotas, land use—whatever best tests the method).

The problem: Design the pilot:

  1. Participant selection: Random sample? Stratified? Self-selected with correction?
  2. Elicitation procedure: What are you eliciting? Pairwise comparisons of outcomes? Articulated values and their weights? How do you handle the fact that participants' values may change through the process?
  3. Aggregation: How is the moral graph turned into a policy recommendation? What role does the graph structure (e.g., coreness) play?
  4. Interface with existing authority: The recommendation goes to the legislature. What is its legal status? Advisory? Presumptive? Binding absent override?

Deliverable: The pilot protocol, plus the three strongest objections a democratic theorist would raise, with sketched responses.
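
A minimal sketch of step 3 (aggregation), assuming the elicitation yields directed "value X was judged wiser than value Y in this context" edges and using k-core numbers (via networkx) as one crude reading of "coreness." The example values are hypothetical; the actual MGE aggregation rule is exactly what the pilot must specify.

```python
# Building a moral graph from pairwise wisdom judgments and ranking values by coreness.
import networkx as nx

edges = [  # (less_wise, wiser) pairs from hypothetical participant judgments
    ("test scores", "student curiosity"),
    ("test scores", "teacher autonomy"),
    ("teacher autonomy", "student curiosity"),
    ("seat time", "student curiosity"),
    ("seat time", "teacher autonomy"),
]
G = nx.DiGraph(edges)

core = nx.core_number(G)            # degree-based coreness per value
endorsement = dict(G.in_degree())   # how often a value was judged wiser
ranked = sorted(G.nodes, key=lambda v: (core[v], endorsement.get(v, 0)), reverse=True)
print(ranked)   # candidate ordering of values to weight in the recommendation
```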

B5: Drafting Fidelity as Constitutional Doctrine (National / Rights)

You are drafting either a model constitutional amendment or an interpretive doctrine that establishes "fidelity" as a justiciable principle alongside liberty and equality. Fidelity here means: institutions must act in accordance with their thick mandates, not substitute thin proxies.

The problem:

  1. Scope: What does the fidelity principle protect? Individuals' relationship to institutions? Institutions' relationship to their mandates? Both?
  2. Invocability: Who can bring a fidelity claim against a national institution?
  3. Limiting principles: Fidelity is potentially expansive—every institution arguably deviates from its thick mandate. Develop limiting principles using some combination of: political question doctrine (some fidelity questions are non-justiciable), subsidiarity (fidelity claims should be resolved at the lowest possible level), and ripeness (the deviation must be concrete, not speculative).
  4. The Rawlsian objection: In a pluralist society, constitutionalizing thick values seems illiberal. Your amendment must survive this objection. (Hint: you're constitutionalizing the form of fidelity, not any particular thick content.)

Deliverable: The amendment text (≤200 words) plus a short interpretive commentary explaining the limiting principles and responding to the Rawlsian objection.

Notes for Facilitators