🪴 Berwin Gan


    Tag: self-play

    1 item with this tag.

    • Jun 28, 2025

      STP: Self-Play LLM Theorem Provers with Iterative Conjecturing and Proving

      • self-play
      • agents
