What I Learned Building My Own Operating System

January 12, 2025 | By Luis Sanchez

Notes from building LuisOS: a functional OS with kernel threads, user programs, and a file system.

Why I Built It

LuisOS started as a way to understand computers beyond APIs. I wanted to see every layer, from bootloader to scheduler to file system, and make it work under real hardware constraints. I used the Pintos teaching framework as a base, but the meaningful parts (threads, file system, user programs) were built and tuned by hand.

Berkeley CS162 + AI-Native OS Mentorship

I took UC Berkeley's CS162 with Ion Stoica and Matei Zaharia (co-founders of Databricks), and later worked under their mentorship on AI-optimized operating systems. Their work (Spark, Sky Lab's AI-driven schedulers, and ADRS research) shaped how I think about kernels that learn. Instead of bolting ML onto Linux or Windows after the fact, the new wave builds AI in from the start: predictive schedulers, adaptive security, and agentic orchestration that cut latency for on-device inference on NPUs. Systems like Steve's proactive maintenance agents or AthenaOS's Rust-based swarms show how self-optimizing kernels can learn usage patterns and expose rich signals for future ML models. The Databricks view: what Spark did for distributed data, AI-native kernels can do at the endpoint, with faster inference, better scheduling, and unified data/AI stacks for edge and enterprise clusters.

Architecture Decisions

I split the kernel into three planes: control (scheduling, policy), data (IPC, I/O paths), and learning (telemetry, reward signals). The learning plane logs scheduler decisions, cache hits/misses, IRQ storms, and page faults, then runs tiny inference on-device to bias policies. Early versions used simple heuristics; later versions swapped in a distilled model for predicting contention and prefetching pages before context switches. The hardest part was keeping the learning loop cheap enough to avoid ruining tail latency.
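
To make that concrete, here is a minimal sketch of the learning-plane telemetry path: a single-producer ring buffer feeding a per-thread contention score that the control plane can read as a bias. The event kinds, ring size, and score table are illustrative assumptions, not the actual LuisOS structures.

```c
#include <stdbool.h>
#include <stdint.h>

enum event_kind { EV_SCHED, EV_CACHE_MISS, EV_IRQ, EV_PAGE_FAULT };

struct telemetry_event {
    enum event_kind kind;
    uint32_t tid;        /* thread that triggered the event */
    uint64_t timestamp;  /* ticks at time of logging */
};

#define RING_CAP 1024
static struct telemetry_event ring[RING_CAP];
static volatile uint32_t head, tail;

/* Producer side: called from hot paths, so it is O(1) and drops
 * events rather than blocking when the ring is full. */
static bool log_event(struct telemetry_event ev)
{
    uint32_t next = (head + 1) % RING_CAP;
    if (next == tail)
        return false;   /* full: dropping beats stalling the hot path */
    ring[head] = ev;
    head = next;
    return true;
}

/* Learning plane: drain the ring and fold events into a per-thread
 * contention score that the scheduler can read as a priority bias. */
static int contention_score[256];

static void drain_and_update(void)
{
    while (tail != head) {
        struct telemetry_event *ev = &ring[tail];
        if (ev->kind == EV_CACHE_MISS || ev->kind == EV_PAGE_FAULT)
            contention_score[ev->tid % 256]++;
        tail = (tail + 1) % RING_CAP;
    }
}

int main(void)
{
    log_event((struct telemetry_event) { EV_PAGE_FAULT, 7, 100 });
    drain_and_update();
    return contention_score[7];   /* 1 after the fault above */
}
```

Dropping events on overflow was a deliberate choice: losing a telemetry sample is free, while blocking a hot path is exactly the tail-latency damage the learning loop must avoid.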

Scheduler Experiments

I benchmarked a baseline MLFQ against three variants:

  • Heuristic-aware: penalize lock-heavy threads; reward cache-friendly bursts.
  • Deadline-aware: soft EDF for audio/vision tasks with slack stealing.
  • Learned bias: a tiny model nudging priorities based on past contention windows.

The learned bias won on mixed workloads with NPUs + CPU contention: ~11% fewer tail spikes at p99 compared to MLFQ, with negligible CPU tax. On pure CPU workloads it regressed slightly, so I gated it behind a workload detector.
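
A rough sketch of that gating logic is below. The detector thresholds, the clamp value, and the stand-in for the distilled model are illustrative placeholders, not the tuned LuisOS versions.

```c
#include <stdbool.h>

struct thread { int tid; int base_priority; int effective_priority; };

/* Hypothetical detector: call the workload "mixed" when the NPU has
 * queued work AND the CPU run queue is deep; thresholds are guesses. */
static bool workload_is_mixed(int npu_queue_depth, int runq_len)
{
    return npu_queue_depth > 0 && runq_len > 4;
}

/* Stand-in for the distilled model: map the recent contention window
 * to a small nudge, clamped so ML can never starve a thread. */
static int predict_bias(int contention_window)
{
    int bias = contention_window / 8;   /* crude proxy for inference */
    return bias > 3 ? 3 : bias;
}

static void apply_priority(struct thread *t, int contention_window,
                           int npu_queue_depth, int runq_len)
{
    t->effective_priority = t->base_priority;
    if (workload_is_mixed(npu_queue_depth, runq_len))
        t->effective_priority += predict_bias(contention_window);
    /* Pure CPU workloads fail the detector and fall back to plain
     * MLFQ priorities, avoiding the regression seen in benchmarks. */
}

int main(void)
{
    struct thread t = { .tid = 1, .base_priority = 31 };
    apply_priority(&t, 40, 2, 8);     /* mixed workload: bias applies */
    return t.effective_priority;      /* 31 + clamp(40/8) = 34 */
}
```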

Kernel Threads & Scheduling

Preemption sounds easy; it's not. A few takeaways:

  • Context switching: saving FPU/SSE state correctly matters; missing it means ghost crashes.
  • Priority inversion: introduced priority donation to keep high-priority tasks from starving behind locks (sketched after this list).
  • Timer granularity: coarse timers make the system feel sluggish; too fine grinds the CPU.
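
Here is a minimal priority-donation sketch in the spirit of Pintos. The field names and the transitive-donation walk are simplified, and recomputing from remaining donors on release is omitted for brevity.

```c
#include <stddef.h>

struct lock;

struct thread {
    int base_priority;
    int priority;             /* effective priority after donations */
    struct lock *waiting_on;  /* lock this thread is blocked on */
};

struct lock {
    struct thread *holder;
};

/* When a high-priority thread blocks on a lock, push its priority
 * down the chain of holders so the owner runs and releases soon.
 * Donation must be transitive: A waits on B, B waits on C. */
static void donate_priority(struct thread *donor)
{
    struct lock *l = donor->waiting_on;
    while (l != NULL && l->holder != NULL) {
        if (l->holder->priority < donor->priority)
            l->holder->priority = donor->priority;
        l = l->holder->waiting_on;   /* follow the chain */
    }
}

/* On release, the holder falls back to its base priority; a full
 * version would recompute from any remaining donors. */
static void revoke_donation(struct thread *holder)
{
    holder->priority = holder->base_priority;
}

int main(void)
{
    struct thread low  = { .base_priority = 1,  .priority = 1 };
    struct lock l      = { .holder = &low };
    struct thread high = { .base_priority = 60, .priority = 60,
                           .waiting_on = &l };
    donate_priority(&high);   /* low now runs at priority 60 */
    revoke_donation(&low);    /* back to 1 after the release */
    return low.priority;
}
```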

User Programs & Syscalls

The user/kernel boundary is where most bugs hid. I focused on:

  • Validating pointers defensively: never trust user space (see the sketch after this list).
  • Clear error codes for syscalls; silent failures make debugging impossible.
  • Copy-on-write: experiments improved performance but complicated page fault handling.
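
A sketch of the defensive pointer check, assuming Pintos-style conventions (a PHYS_BASE split and a pagedir_get_page probe). The stubs only exist to make the sketch self-contained; the real kernel supplies them.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define PHYS_BASE ((uintptr_t) 0xC0000000)   /* kernel space starts here */
#define PGSIZE 4096

/* Stubs standing in for the real page-table probe (Pintos calls this
 * pagedir_get_page); here every user page pretends to be mapped. */
static void *current_pagedir(void) { return NULL; }
static void *pagedir_get_page(void *pd, const void *uaddr)
{
    (void) pd; (void) uaddr;
    return (void *) 1;   /* non-NULL means "mapped" in this sketch */
}

/* A user buffer is valid only if every page it touches lies below
 * PHYS_BASE and is actually mapped; check before any dereference. */
static bool validate_user_buffer(const void *uaddr, size_t size)
{
    uintptr_t start = (uintptr_t) uaddr;
    uintptr_t end = start + size;
    if (size == 0)
        return true;
    if (end > PHYS_BASE || end < start)   /* bounds + wraparound */
        return false;
    for (uintptr_t p = start; p < end; p += PGSIZE)
        if (pagedir_get_page(current_pagedir(), (void *) p) == NULL)
            return false;
    /* the loop strides by page; also probe the final byte's page */
    return pagedir_get_page(current_pagedir(), (void *) (end - 1)) != NULL;
}

int main(void)
{
    return validate_user_buffer((void *) 0x804a000, 128) ? 0 : 1;
}
```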

Security and Isolation

I added a lightweight capability model for syscalls exposed to untrusted user programs. Capabilities could be revoked, time-bounded, or rate-limited. Simple ptrace-style hooks plus guarded syscalls caught most user/kernel abuse during fuzzing. Memory tagging would be next, but under Pintos-era constraints I settled for guard pages and aggressive zeroing.
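
As an illustration, a capability record and its check might look like the sketch below. The field layout, token refill, and bitmask encoding are assumptions, not the actual LuisOS definitions.

```c
#include <stdbool.h>
#include <stdint.h>

struct capability {
    uint32_t syscall_mask;   /* one bit per permitted syscall */
    uint64_t expires_at;     /* tick after which the cap is dead */
    uint32_t tokens;         /* rate limit, refilled by a timer */
    bool revoked;
};

/* Gate every syscall from an untrusted program through this check;
 * the point is to fail closed on every condition. */
static bool cap_permits(struct capability *cap, int syscall_nr,
                        uint64_t now)
{
    if (cap->revoked || now > cap->expires_at)
        return false;                       /* revoked or expired */
    if (!(cap->syscall_mask & (1u << syscall_nr)))
        return false;                       /* syscall not granted */
    if (cap->tokens == 0)
        return false;                       /* rate limit exhausted */
    cap->tokens--;
    return true;
}

int main(void)
{
    struct capability cap = { .syscall_mask = 1u << 4,
                              .expires_at = 1000, .tokens = 2 };
    return cap_permits(&cap, 4, 500) ? 0 : 1;   /* permitted: 0 */
}
```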

Virtual Memory & Files

Paging and the file system forced discipline:

  • Page replacement: a simple clock algorithm beat more complex heuristics under load (sketched after this list).
  • Write-back vs. write-through: hybrid caching reduced I/O while keeping corruption risk low.
  • Atomicity: journaling-lite for metadata so crashes don't nuke the directory tree.
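
For reference, the clock (second-chance) algorithm is small enough to sketch whole; the frame-table layout here is illustrative. Its appeal is O(1) state per frame and no sorted structures to maintain.

```c
#include <stdbool.h>
#include <stddef.h>

#define NFRAMES 256

struct frame {
    bool referenced;   /* set when the page was recently accessed */
    bool in_use;
};

static struct frame frames[NFRAMES];
static size_t clock_hand;

/* Sweep the hand around the frame table: each referenced frame gets
 * a second chance (its bit is cleared); the first unreferenced frame
 * we meet is the victim. Worst case is one full rotation. */
static size_t clock_evict(void)
{
    for (;;) {
        struct frame *f = &frames[clock_hand];
        size_t victim = clock_hand;
        clock_hand = (clock_hand + 1) % NFRAMES;
        if (!f->in_use)
            return victim;           /* free frame, take it directly */
        if (f->referenced)
            f->referenced = false;   /* second chance */
        else
            return victim;           /* cold frame: evict */
    }
}

int main(void)
{
    for (size_t i = 0; i < NFRAMES; i++)
        frames[i] = (struct frame) { .referenced = true, .in_use = true };
    return (int) clock_evict();      /* frame 0, after one full sweep */
}
```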

Device Model and NPUs

LuisOS treats NPUs as first-class schedulable devices. I built a tiny command queue with back-pressure so GPU/NPU kernels wouldn’t starve CPU threads. For transformer inference, batching small requests reduced end-to-end latency more than chasing kernel micro-optimizations. The OS’s job was orchestrating the right batching window without blocking interactive tasks.
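
A minimal sketch of that queue, assuming a fixed depth and a tick-based batching window; both tunables are illustrative, not the values LuisOS shipped with.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_DEPTH 64
#define BATCH_WINDOW_TICKS 2   /* wait briefly to coalesce requests */

struct npu_cmd { uint32_t req_id; uint32_t len; };

static struct npu_cmd queue[QUEUE_DEPTH];
static uint32_t q_head, q_tail;

/* Submission fails instead of blocking, so CPU threads back off and
 * are never stalled behind a saturated accelerator. */
static bool npu_submit(struct npu_cmd cmd)
{
    uint32_t next = (q_head + 1) % QUEUE_DEPTH;
    if (next == q_tail)
        return false;        /* back-pressure: caller retries later */
    queue[q_head] = cmd;
    q_head = next;
    return true;
}

/* Flush when the batching window closes or the queue is half full,
 * whichever comes first: small batches amortize launch overhead
 * without making interactive work wait behind a long batch. */
static bool should_flush(uint64_t now, uint64_t window_start)
{
    uint32_t depth = (q_head + QUEUE_DEPTH - q_tail) % QUEUE_DEPTH;
    return depth >= QUEUE_DEPTH / 2 ||
           (depth > 0 && now - window_start >= BATCH_WINDOW_TICKS);
}

int main(void)
{
    for (uint32_t i = 0; i < 40; i++)
        npu_submit((struct npu_cmd) { .req_id = i, .len = 64 });
    return should_flush(10, 9) ? 0 : 1;   /* depth 40 >= 32: flush */
}
```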

Testing & Debugging

The biggest unlock was building a brutal test harness:

  • Deterministic repros: seedable workloads that hammer threads, syscalls, and I/O in parallel (see the sketch after this list).
  • Panic breadcrumbs: short, consistent logs beat verbose dumps when you're low on time.
  • Fault injection: intentionally corrupting frames exposed assumptions I didn't know I was making.
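
The core of the harness fits in a sketch: one seed drives a tiny PRNG, and that PRNG drives both the workload and the fault schedule, so any failure replays bit-for-bit. The xorshift constants and the ~3% fault rate are illustrative choices, not the harness's actual parameters.

```c
#include <stdint.h>
#include <stdio.h>

/* xorshift64: tiny deterministic PRNG. Avoid libc rand() in a repro
 * harness; its state is global and varies across platforms. */
static uint64_t rng_state;

static uint64_t rng_next(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

/* Fault injection: turn a success into a failure on a seeded
 * schedule, so error paths run deterministically every time. */
static int maybe_fail(int real_result)
{
    return (rng_next() % 100 < 3) ? -1 : real_result;   /* ~3% faults */
}

int main(void)
{
    rng_state = 0xC162C162ULL;   /* the seed IS the repro ticket */
    for (int i = 0; i < 10; i++)
        printf("op %d -> %d\n", i, maybe_fail(0));
    return 0;
}
```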

AI-Native Patterns I Want Next

  • Reward-model schedulers: RLHF-style signals on latency, jitter, and energy to tune policies.
  • Semantic I/O: tagging data flows (camera, mic, LIDAR) so the kernel can prioritize based on task graphs.
  • Adaptive security: anomaly detection on syscall patterns to auto-throttle suspicious processes.

What It Means for Builders

Building LuisOS changed how I approach products: measure everything, design for interrupts, and expect cross-layer effects. Modern apps are really distributed systems across CPU, GPU/NPU, storage, and network; the kernel mindset—tight loops, explicit tradeoffs—translates directly to shipping reliable AI products.

What I'd Do Differently

If I had a second pass:

  • Abstract the scheduler earlier; retrofitting different policies was painful.
  • Add richer tracing hooks from day one; printf debugging in kernel land is misery (see the sketch below).
  • Invest in better tooling: symbolized stack traces and automated bisecting on kernel changes.
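
The kind of tracing hook I mean is small: a fixed-size breadcrumb ring that a panic handler can dump. This is a sketch with illustrative names and sizes, not what LuisOS actually has.

```c
#include <stdint.h>
#include <stdio.h>

#define TRACE_SLOTS 64
#define TRACE_MSG 48

static char trace_ring[TRACE_SLOTS][TRACE_MSG];
static uint32_t trace_idx;

/* Cheap enough to leave enabled in hot paths; the ring overwrites
 * itself, so the panic handler always has the last 64 breadcrumbs. */
#define TRACE(msg)                                                   \
    do {                                                             \
        snprintf(trace_ring[trace_idx % TRACE_SLOTS], TRACE_MSG,     \
                 "%s", (msg));                                       \
        trace_idx++;                                                 \
    } while (0)

/* Dump oldest-first so the log reads in event order. */
static void dump_trace_on_panic(void)
{
    for (uint32_t i = 0; i < TRACE_SLOTS; i++) {
        const char *slot = trace_ring[(trace_idx + i) % TRACE_SLOTS];
        if (slot[0] != '\0')
            puts(slot);
    }
}

int main(void)
{
    TRACE("sched: picked tid 7");
    TRACE("vm: page fault at 0x804a000");
    dump_trace_on_panic();
    return 0;
}
```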

Takeaways

Building an OS is the fastest way to respect how much is hidden by modern runtimes. The work forces clarity: every allocation, every lock, every interrupt has a cost. That intuition now informs how I design higher-level systems—fewer assumptions, more instrumentation, and tight loops between design, measurement, and iteration.