🪴 Berwin Gan

    Folder: Notes-🗒️/Machine-Learning-🤖/GPU-Mode

    3 items under this folder.

    • Aug 29, 2025

      Modern Position Encodings in Transformers (RoPE/Yarn and PaTH)

    • Aug 26, 2025

      FlexOlmo: Open Language Models for Flexible Data Use

      • MoE
      • transformer
      • ScaleML

    • Jul 13, 2025

      Luminal - Search-Based Deep Learning Compilers

      • compiler
      • search
      • kernels