System Architecture
8 packages. Strict unidirectional data flow. Zero circular dependencies. Built as a Turborepo + pnpm monorepo with TypeScript end-to-end.
Data Flow Pipeline
Data flows in one direction: Source → Parse → Graph → Query. No shortcuts, no circular dependencies.
Tree-sitter extracts functions, classes, imports, routes, DB ops, env vars, and events from TS, JS, Python, Rust, SQL, Markdown.
Nodes & edges ingested into Neo4j via batch UNWIND upserts. QueryCache with 30s TTL. Exponential retry backoff.
Chokidar monitors file changes with 500ms debounce. Incremental re-parse — only changed files are re-indexed.
AI agents query via 21 MCP tools (+ 9 resources, 6 prompts). Developers use 38 CLI commands. Viz dashboard for interactive exploration.
8 Packages
Each package has a single responsibility and strict boundaries. The dependency graph is unidirectional: core → parser/graph → watcher/mcp-server → cli.
Types, config (Zod), errors, logger (Pino)
Tree-sitter multi-lang extraction
Neo4j driver, queries, cache, retry
Chokidar file watcher with debounce
21 AI tools, 9 resources, 6 prompts, role-scoped access, sampling
React + Three.js/Cytoscape.js dashboard
PR impact analysis webhook, auto-comments blast radius
38 commands, standalone bundle
Module Responsibilities
| Module | Depends On | Exposes |
|---|---|---|
| @nomik/core | Nothing (leaf) | Types, Config, Logger |
| @nomik/parser | core | parseFile(), parseProject() |
| @nomik/graph | core | GraphService, createGraphService |
| @nomik/watcher | core, parser, graph | createWatcher() |
| @nomik/mcp-server | core, graph | 21 MCP tools + 9 resources + 6 prompts |
| @nomik/viz | core (types only) | Web dashboard |
| @nomik/github-bot | core, graph, parser | PR impact webhook |
| @nomik-ai/cli | All packages | 38 CLI commands |
Design Principles
Strict Boundaries
No circular dependencies. core is the leaf dependency. Each package has a single, well-defined responsibility.
Pipeline Architecture
Parsing and ingestion are separate steps with backpressure. File changes flow through watcher → parser → graph.
Observability
Structured logging via Pino everywhere. Every operation is traceable with timestamps and context.
Typed Errors
NomikError with code, severity, and recoverable flag. No untyped exceptions — every failure path is explicit.
Multi-Project Isolation
projectId injected at every layer — nodes, edges, queries, mutations. One Neo4j instance serves multiple projects.
Performance First
Batch UNWIND upserts, QueryCache with 30s TTL, exponential retry backoff. 100K-line projects scan in under 60s.
Technology Stack
| Component | Technology | Why |
|---|---|---|
| Parser | Tree-sitter (official TS bindings) | Incremental, multi-language, battle-tested |
| Graph DB | Neo4j Community (Bolt) | Cypher queries, APOC traversals, free |
| MCP SDK | @modelcontextprotocol/sdk | Reference implementation by Anthropic |
| CLI | Commander.js | Standard Node.js CLI framework |
| Visualization | React + Cytoscape.js + Three.js | 2D/3D interactive graph exploration |
| Build | Turborepo + tsup + pnpm | Fast builds, workspace caching, disk-efficient |