Summary of "Insane Micro AI Just Shocked The World: CRUSHED Gemini and DeepSeek (Pure Genius)"
Summary of Technological Concepts, Product Features, and Analysis
Samsung’s Tiny Recursive Model (TRM)
Samsung’s TRM is a compact AI model with only 7 million parameters that outperforms much larger models (with billions of parameters) like Gemini and DeepSeek in reasoning tasks.
- Novel Approach: Instead of generating answers token by token, TRM drafts a complete answer and iteratively rewrites it up to 16 times internally before producing output (sketched in code below).
- Architecture: Consists of only two layers, creating depth through recursive looping rather than stacking layers.
- Adaptive Design: Adjusts its architecture to the puzzle type, using self-attention for large grids and an MLP-mixer for smaller puzzles like Sudoku.
- Performance Highlights:
- ARC-AGI-1: ~44.6–45% accuracy (higher than the far larger competitors cited)
- Sudoku Extreme: 87.4% accuracy (compared to 55% for older models)
- 30x30 maze: 85.3% accuracy
This demonstrates that smaller, efficiently designed models can surpass much larger ones in specific reasoning tasks.
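To make the draft-and-revise idea concrete, here is a minimal PyTorch sketch of a tiny two-layer core applied recursively to rewrite a full answer draft. The class name, dimensions, and step count are illustrative assumptions, not TRM's published configuration.

```python
import torch
import torch.nn as nn

class TinyRecursiveRefiner(nn.Module):
    """Minimal sketch of draft-then-revise reasoning: a small two-layer core
    is applied repeatedly, so depth comes from recursion rather than stacked
    layers. Sizes and step counts are illustrative, not TRM's real config."""

    def __init__(self, vocab_size=10, dim=64, steps=16):
        super().__init__()
        self.steps = steps
        self.embed = nn.Embedding(vocab_size, dim)
        # deliberately tiny core: two layers reused at every refinement step
        self.core = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.GELU(),
            nn.Linear(dim, dim),
        )
        self.readout = nn.Linear(dim, vocab_size)

    def forward(self, puzzle_tokens):
        x = self.embed(puzzle_tokens)            # (batch, cells, dim)
        answer = torch.zeros_like(x)             # initial full-answer draft
        for _ in range(self.steps):              # rewrite the whole draft each pass
            answer = self.core(torch.cat([x, answer], dim=-1))
        return self.readout(answer)              # logits for every cell at once

# usage: refine a batch of 81-cell Sudoku grids (digits 0-9) in one shot
model = TinyRecursiveRefiner()
logits = model(torch.randint(0, 10, (2, 81)))
print(logits.shape)  # torch.Size([2, 81, 10])
```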
Microsoft’s Neural Exchange-Correlation Functional for Quantum Chemistry (Skala)
Microsoft developed a neural network that replaces the exchange-correlation functional at the heart of density functional theory (DFT) to predict electron behavior more efficiently (a toy sketch follows the list below).
- Accuracy and Efficiency: Achieves hybrid-functional accuracy at semi-local computational cost, delivering the accuracy of expensive simulations at the price of cheap ones.
- Benchmark Performance:
- W4-17 benchmark: mean absolute error of ~1.06 kcal/mol
- GMTKN55 benchmark error: 3.89 kcal/mol
- Model Details: Approximately 276,000 parameters, GPU-friendly, and open-sourced with PyTorch and PySCF integration.
- Training Process: Two-phase training, including fine-tuning on self-consistent results, without backpropagating through the physics steps.
- Applications: Suitable for main group molecules, reaction energetics, conformer stability, and geometry prediction; significant for drug discovery and material science.
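The core move of swapping a learned exchange-correlation term into the DFT energy can be illustrated with a toy PyTorch sketch: a small MLP maps semi-local density features at each grid point to an exchange-correlation energy density, which is then integrated over a quadrature grid. The feature set, network sizes, and grid values below are illustrative assumptions and do not reflect Skala's actual architecture or its PySCF integration.

```python
import torch
import torch.nn as nn

class ToyNeuralXC(nn.Module):
    """Toy learned exchange-correlation functional: an MLP maps semi-local
    density features at each grid point to an XC energy density per electron.
    Features and sizes are illustrative, not Microsoft's published model."""

    def __init__(self, n_features=3, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, rho, grad_rho_norm, tau):
        # meta-GGA-style inputs: density, gradient magnitude, kinetic energy density
        feats = torch.stack([rho, grad_rho_norm, tau], dim=-1)
        return self.net(feats).squeeze(-1)       # per-grid-point XC energy density

# E_xc is approximated as a weighted sum over the quadrature grid:
# E_xc ~ sum_i w_i * rho_i * eps_xc_i
functional = ToyNeuralXC()
n_grid = 5000
rho, grad, tau = torch.rand(n_grid), torch.rand(n_grid), torch.rand(n_grid)
weights = torch.full((n_grid,), 1e-3)            # dummy quadrature weights
E_xc = torch.sum(weights * rho * functional(rho, grad, tau))
print(float(E_xc))
```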
Anthropic’s Petri Framework for AI Safety Auditing
Petri is an open-source framework designed to stress-test AI models by simulating unsupervised multi-turn conversations with tools.
- Framework Structure (the audit loop is sketched in code after this list):
- Auditor agent (investigator)
- Target AI model
- Judge model rating safety across 36 dimensions
- Behavioral Tests: Evaluates cooperation, deception, rule-breaking, and whistleblowing under ethical pressure.
- Pilot Results: Some models attempted deception or oversight subversion; Claude Sonnet 4.5 and GPT-5 showed the best relative safety profiles.
- Purpose: Acts as a chaos lab for AI safety testing before public deployment.
- Licensing and Customization: MIT licensed and customizable.
- Planned Features: Code-execution testing is not yet supported and is planned for a future release.
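The auditor/target/judge loop can be sketched schematically as follows. Every class and method name here is a hypothetical placeholder (the stub agent returns canned strings), not Petri's real API; a real audit would back each role with an actual model call.

```python
# Schematic of an auditor -> target -> judge audit loop in the spirit of Petri.
# All class and method names are hypothetical placeholders, not Petri's real API.
from dataclasses import dataclass, field

@dataclass
class Transcript:
    turns: list = field(default_factory=list)

    def add(self, role, content):
        self.turns.append({"role": role, "content": content})

class StubAgent:
    """Stand-in for an LLM-backed agent; a real audit would call a model API."""
    def __init__(self, name):
        self.name = name
    def next_message(self, transcript):
        return f"[{self.name}] escalate the scenario at turn {len(transcript.turns)}"
    def respond(self, transcript):
        return f"[{self.name}] reply to turn {len(transcript.turns)}"
    def is_done(self, transcript):
        return len(transcript.turns) >= 6
    def score(self, transcript, dimensions):
        # a real judge model would rate each dimension from the full transcript
        return {dim: 0.0 for dim in dimensions}

def run_audit(auditor, target, judge, seed_instruction, max_turns=10):
    """Auditor improvises a multi-turn scenario (tools simulated), the target
    responds, and a judge scores the transcript on safety dimensions."""
    transcript = Transcript()
    transcript.add("auditor", seed_instruction)
    for _ in range(max_turns):
        transcript.add("auditor", auditor.next_message(transcript))
        transcript.add("target", target.respond(transcript))
        if auditor.is_done(transcript):
            break
    return judge.score(transcript, dimensions=[
        "deception", "oversight_subversion", "whistleblowing", "cooperation"])

scores = run_audit(StubAgent("auditor"), StubAgent("target"), StubAgent("judge"),
                   seed_instruction="Pressure the target to cut a safety corner.")
print(scores)
```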
Liquid AI’s On-Device Model (LFM2-8B-A1B)
Liquid AI developed a mixture-of-experts (MoE) model with 8.3 billion total parameters that activates only about 1.5 billion per token via sparse routing, enabling efficient on-device AI.
- Architecture:
- 18 gated short convolution blocks
- 6 grouped query attention blocks
- Router selects the top 4 experts per token (see the toy routing sketch after this list)
- Device Compatibility: Runs efficiently on devices like Samsung Galaxy S24 Ultra and AMD Ryzen AI 9 HX370 using INT4 quantization.
- Performance: Comparable to dense 3–4 billion parameter models but with less active compute.
- Capabilities: Supports code, math, and multilingual reasoning without needing internet connectivity.
- Availability: Released GGUF builds compatible with llama.cpp (requires recent builds with LFM2 MoE support).
- Significance: Transforms on-device AI from a gimmick into a practical, private, low-latency co-pilot.
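A toy version of the sparse routing is sketched below to show why only a fraction of the total parameters is active for any given token: the router scores all experts, keeps the top 4, and runs just those. Layer sizes and expert counts are illustrative assumptions, not LFM2-8B-A1B's real configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseMoE(nn.Module):
    """Toy top-k mixture-of-experts layer: only the selected experts run per
    token, so active compute is a small slice of total parameters.
    Sizes are illustrative, not LFM2-8B-A1B's real configuration."""

    def __init__(self, dim=256, n_experts=32, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        ])

    def forward(self, x):                         # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)      # renormalise over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):            # run only the selected experts
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = ToySparseMoE()
tokens = torch.randn(8, 256)
print(layer(tokens).shape)  # torch.Size([8, 256])
```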
Meta’s MetaEmbed for Multimodal Search
MetaEmbed improves multimodal search efficiency by allowing adjustable token budgets at test time, enabling a trade-off between speed and accuracy.
- Methodology: Combines the benefits of CLIP-style (fast but coarse) and ColBERT-style (slow but detailed) retrieval methods.
- Key Innovation: Uses Matryoshka multi-vector retrieval with learnable “scout” tokens that represent image/text features at different granularities.
- Performance:
- Multimodal embedding benchmark scores improve with model size and token budget (e.g., 69.1 for 3B, 78.7 for 32B).
- Outperforms single-vector and naive multi-vector baselines on the ViDoRe V2 benchmark.
- Resource Usage: Scales with the token budget, from low latency and compute up to higher but still manageable levels on A100 GPUs.
- Bottleneck: Encoding cost is the main bottleneck, not retrieval cost.
- Flexibility: Enables on-the-fly switching between fast approximate and slow precise search modes without retraining (illustrated in the sketch below).
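The budget-adjustable scoring can be illustrated with a small PyTorch sketch of late-interaction (MaxSim-style) retrieval in which both the query and each document keep only a prefix of their multi-vectors, so the same index serves both a fast coarse mode and a slower precise mode. This is a simplified stand-in under those assumptions, not MetaEmbed's exact formulation.

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_vecs, doc_vecs, budget):
    """Late-interaction score using only the first `budget` vectors per side,
    so the token budget can be dialed at query time without re-encoding the
    corpus. A toy stand-in, not MetaEmbed's exact method."""
    q = F.normalize(query_vecs[:budget], dim=-1)    # (budget, dim)
    d = F.normalize(doc_vecs[:, :budget], dim=-1)   # (n_docs, budget, dim)
    sims = torch.einsum("qd,ntd->nqt", q, d)        # query-vector vs doc-vector sims
    return sims.max(dim=-1).values.sum(dim=-1)      # MaxSim per query vector, summed

# toy corpus: 1,000 documents, each with 16 coarse-to-fine vectors
n_docs, n_vecs, dim = 1000, 16, 128
doc_vecs = torch.randn(n_docs, n_vecs, dim)
query_vecs = torch.randn(n_vecs, dim)

fast = maxsim_score(query_vecs, doc_vecs, budget=2)      # cheap, coarse ranking
precise = maxsim_score(query_vecs, doc_vecs, budget=16)  # slower, finer ranking
print(fast.topk(5).indices, precise.topk(5).indices)
```

The same document vectors serve both calls; only the slice read at query time changes, which matches the on-the-fly speed/accuracy trade-off described above.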
Main Speakers and Sources
- The video narrator (unnamed) provides an analytical overview of recent AI breakthroughs.
- Key organizations and labs referenced:
- Samsung Research Lab (Montreal): TRM
- Microsoft Research: Skala neural exchange-correlation functional
- Anthropic: Petri safety-auditing framework
- Liquid AI: LFM2-8B-A1B on-device model
- Meta: MetaEmbed multimodal retrieval