Unlock the Full Potential
of a Complex SQL System
with AI-Assisted Intelligence
An AI-powered strategy to autonomously map, document, and navigate a 500+ stored procedure codebase — secured by NVIDIA NemoClaw, operated daily by Claude Code.
Complex systems deserve better tooling
A large SQL codebase built over years carries deep business logic and hard-won institutional knowledge. The opportunity is to surface that knowledge — making it accessible to AI tools and new team members alike, without losing any of what the team has built.
Systems evolve faster than documentation
This is true of every production system at scale. The stored procedure inventory grew organically to serve real business needs — the challenge now is giving AI tools the context map they need to work effectively with it.
Deep knowledge is an asset — and a risk
The team holds significant expertise about how this system works. That expertise is valuable. This strategy captures it in a structured, durable format so it's available to everyone — not just the people who have been here longest.
AI tools need structured context to be reliable
Without a system map, any AI tool will explore blindly — burning through its working memory before it can reason about dependencies. The .md strategy gives Claude Code the context it needs to operate correctly on complex systems.
Onboarding complexity grows with the system
The more interconnected the stored procedures, the longer it takes for a new team member to build a reliable mental model. Structured context files dramatically reduce that ramp-up time for anyone joining the project.
A three-layer intelligent stack
The solution combines an autonomous AI agent with a context management strategy and enterprise security — each layer doing exactly what it's best at, and working with the existing codebase rather than around it.
NVIDIA NemoClaw
The autonomous agent. Runs an OpenClaw agent inside a secure OpenShell sandbox. Reads 500+ SQL files in batches, analyzes stored procedures, and generates the documentation files — with zero risk of data exfiltration.
- One-command install and deploy
- Kernel-level security enforcement
- Runs once to generate the knowledge base
- Policy-based — auditable via YAML
.md Context Strategy
The knowledge base. A hierarchy of Markdown files that maps the entire SQL codebase into structured, navigable documentation. Not human docs — AI-optimized context designed to be loaded efficiently.
- CLAUDE.md: global rules, always loaded
- context-index.md: navigation routing
- context-sql-[domain].md: per-module inventory
- Lives in Git — versioned like code
Claude Code or Junie
The daily driver. Developers interact with the .md files as source of truth through whichever AI coding agent the team already uses. Both Claude Code and JetBrains Junie are fully compatible with this strategy — and we already hold Junie licenses.
- Reads context files at session start
- Loads only the relevant module per task
- Operates at 20–30% context — not 80%+
- Updates .md files after each task
The team already has Junie licenses —
and the strategy works natively
The .md context strategy is not tied to any specific AI coding tool. It is based on an open standard that both Claude Code and JetBrains Junie read natively — which means there is no additional tooling cost for the daily development layer.
Native context file. Auto-loaded at session start. Designed specifically for Claude Code workflows.
// CLAUDE.md loaded automatically
Equivalent open standard. Auto-loaded by Junie at session start. Recognized by multiple agents simultaneously.
// same strategy, same result
Junie navigates and reads files autonomously — the same way Claude Code does. It can load context-index.md, identify which domain file is relevant, and load it without manual intervention.
Junie supports Anthropic Claude, OpenAI, Google, and others via BYOK. The team can use the same Claude model through Junie — maintaining quality parity with Claude Code.
The team already holds Junie licenses. Using Junie as the daily driver means the development layer of this strategy has no incremental licensing cost — only NemoClaw and the LLM API are new.
The team's Mac setup is
already sufficient
The entire stack runs on Apple Silicon Macs — no additional hardware required. With 64 GB of unified memory, the team can run local LLM inference directly on their machines, keeping sensitive data entirely on-premise if needed.
One-command install. Metal GPU acceleration automatic. Exposes an OpenAI-compatible API on localhost:11434 — exactly what NemoClaw needs as inference backend.
brew install ollama
# Run Nemotron (18 GB, 98 tok/s)
ollama run nemotron3-nano-30b
Apple's own ML framework — built from scratch for unified memory architecture. Noticeably faster tokens/sec than Ollama on the same model. Developer-oriented, no GUI.
pip install mlx-lm
# Run Nemotron via mlx-community
mlx_lm.generate \
  --model mlx-community/Nemotron-3-Nano-30B-4bit
Desktop app with model browser, Metal acceleration built in, and a local server compatible with OpenAI API. Good for exploring models before committing to one for the NemoClaw workflow.
lmstudio.ai
# Search and download
Nemotron-3-Nano-30B
Why AI tools need structured context
to work with large codebases
Before the .md strategy makes sense, it helps to understand one fundamental constraint of every AI model: the context window. This is why the strategy is necessary — and why without it, any AI tool will struggle on a system of this size, regardless of how good the tool is.
As more SQL files are loaded into a session, response quality degrades sharply.

At 80%+ context usage:
- ✗ Forgets decisions made 20 messages ago
- ✗ Contradicts earlier analysis
- ✗ Misses SP dependencies
- ✗ Re-asks for files already read
- ✗ Proposes changes conflicting with system rules

At 20–30% context usage:
- ✓ Tracks cross-SP dependencies accurately
- ✓ Consistent reasoning throughout
- ✓ Respects rules set at session start
- ✓ Catches dangerous operations
- ✓ Reliable from first to last message
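A back-of-envelope calculation shows why. All figures below are assumptions for illustration (500 SPs, roughly 300 lines each, roughly 10 tokens per line), not measurements of the real codebase:

```shell
# Rough context math: illustrative assumptions only.
sp_count=500        # stored procedures in the codebase
lines_per_sp=300    # assumed average SP length
tokens_per_line=10  # assumed rough tokenizer rate

full_load=$((sp_count * lines_per_sp * tokens_per_line))
echo "Loading every SP verbatim: ~$full_load tokens"

# One domain file summarizing ~60 SPs at ~8 lines per entry:
domain_file=$((60 * 8 * tokens_per_line))
echo "Loading one domain context file instead: ~$domain_file tokens"
```

Even with generous error bars, the verbatim load lands in the millions of tokens, past any current context window, while a summarized domain file fits comfortably inside the 20–30% budget.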
A curated knowledge layer
that grows with the team
The goal isn't to give the AI more context — it's to give it the right context, on demand. The .md file hierarchy is a structured navigation system that maps the entire SQL codebase into something an AI agent can traverse purposefully rather than blindly.
The rules that can never be ignored. DB engine, naming conventions, schemas, and SPs that require approval before touching. Auto-loaded every session — the agent reads this before your first message. Keep it under 50 lines: every line loads every time.
The navigation map. One job: tell the agent which domain file to load for each type of task. Also holds the master WARNING list of SPs that require special care across the entire system. Without this file, the agent would have to guess or explore.
The real inventory — one per business domain. For each stored procedure: name, one-line purpose, tables it reads, tables it writes, other SPs it calls, execution order if sequenced, and warnings for destructive operations. This is what turns a folder of files into a navigable system.
Generated first by NemoClaw before any domain analysis. Clusters all 500+ SQL files into business domains by analyzing file names, prefixes and folder structure. It becomes the blueprint that guides the rest of the generation process.
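A hypothetical fragment of what the generated domain-map.md might contain (domain names, prefixes, and file counts here are illustrative, not taken from the actual codebase):

```markdown
## billing (112 files)
Prefix: usp_bill_*. Invoice generation, cancellation, notifications.

## payments (87 files)
Prefix: usp_pay_*. Charges, refunds, batch transfers.

## uncategorized (6 files)
No prefix match. Review manually before assigning a domain.
```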
The generation strategy scales with the size of the codebase. The number of SQL files determines how many context files you'll need and how to organize the generation sessions.
- →Single analysis session covers everything
- →One context-sql.md for the whole codebase
- →CLAUDE.md + index + one domain file
- →One focused session per business domain
- →One context-sql-[domain].md per domain
- →context-index.md as navigation hub
- →Script pre-catalogs files before agent runs
- →NemoClaw clusters into domains automatically
- →Hierarchical index for large domain groups
Create CLAUDE.md manually
This is the only file written by hand. Keep it under 50 lines: global conventions, DB engine, naming rules, and any SPs that must never be modified without approval.
## System
- DB Engine: SQL Server 2019
- Schema: dbo (default), finance, audit
- Naming: SPs prefixed usp_, tables UPPER_SNAKE_CASE

## Critical Rules
- Never modify usp_close_period without DBA approval
- All financial SPs must log to AUDIT_LOG table
Run a single analysis session
Point the agent at the SQL folder. It reads all files, identifies what each SP does, and generates the full inventory in one pass.
For each stored procedure, generate an entry in /docs/context/context-sql.md with:
- SP name and one-line description
- Tables it reads (SELECT/FROM/JOIN)
- Tables it writes (INSERT/UPDATE/DELETE)
- Other SPs it calls (EXEC)
- Any warnings (destructive ops, no rollback)
Group entries by apparent business domain.
Generate the navigation index
Ask the agent to build the routing layer from what it just cataloged.
For each domain: "To work on [domain] → read context-sql.md §[section]".
Add a WARNING section for any SPs flagged as destructive.
Result — your file structure
CLAUDE.md ← always loaded, ~30 lines
context-index.md ← navigation hub
context-sql.md ← full SP inventory
Map file prefixes to business domains
Before any agent session, identify domains by looking at filename patterns. This determines how many context files you'll generate.
usp_bill_* → billing
usp_pay_* → payments
usp_rep_* → reporting
usp_inv_* → inventory
usp_cl_* → period-close (critical)
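A quick way to sanity-check the mapping is to count files per prefix before deciding the split. A minimal sketch, assuming the prefix convention above; the sample directory and file names are stand-ins for the real SQL folder:

```shell
# Count SQL files per usp_ prefix to estimate domain sizes.
# SQLDIR here is a throwaway sample; point it at the real folder instead.
SQLDIR=$(mktemp -d)
touch "$SQLDIR/usp_bill_generate.sql" "$SQLDIR/usp_bill_cancel.sql" "$SQLDIR/usp_pay_refund.sql"

# Extract the prefix segment and tally, largest domain first.
ls "$SQLDIR"/*.sql \
  | sed -E 's#.*/usp_([a-z]+)_.*#\1#' \
  | sort | uniq -c | sort -rn
```

Prefixes with large counts are candidates for the sub-domain split suggested for groups of 60+ files.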
One focused session per domain
Never mix domains in a single session — context bloat causes missed dependencies. One domain = one session = one context file.
Generate /docs/context/context-sql-billing.md with:
- Full SP inventory (name, purpose, tables R/W, calls made)
- Dependency map: which SPs call which
- Recommended execution order if applicable
- Cross-domain calls flagged explicitly
- Destructive operations flagged with [WARN]
Build the master index from all domain files
Generate context-index.md with:
- Domain routing: "For [task] → read [file]"
- Cross-domain dependency summary
- Master WARNING list (all [WARN] flags across all domains)
- Table ownership: which domain writes to each key table
Result — your file structure
CLAUDE.md ← always loaded
context-index.md ← master navigation
context-sql-billing.md
context-sql-payments.md
context-sql-reporting.md
context-sql-inventory.md
context-sql-period-close.md ← [WARN] flags
Pre-catalog all files with a script
For 400+ files, start with a script that generates a raw file manifest. This gives the agent a structured list to work from instead of discovering the filesystem blindly — which would consume context before any analysis begins.
find ./sql -name "*.sql" -type f \
| awk -F/ '{print NF-1, $0}' \
| sort > docs/context/raw-manifest.txt
# Output: every .sql file listed with its path
# Agent uses this as its working index
NemoClaw agent clusters files into domains
Let the agent analyze file names and folder paths to propose the domain grouping. Don't do this manually — it takes seconds and the agent spots patterns humans miss.
Read docs/context/raw-manifest.txt. Based on folder paths and naming prefixes, propose a domain grouping.
Output domain-map.md with:
- Domain name and file count
- Files that don't fit any domain (flag as "uncategorized")
- Suggested sub-domains for any group with 60+ files
Process each domain in a dedicated session
If the team has multiple Junie licenses, different team members can run different domains in parallel — cutting generation time significantly.
// Team member A — billing + payments
→ generates context-sql-billing.md
→ generates context-sql-payments.md
// Team member B — reporting + period-close (~80 files)
→ generates context-sql-reporting.md
→ generates context-sql-period-close.md
// Team member C — inventory + shared utilities
→ generates context-sql-inventory.md
→ generates context-sql-shared.md
Build hierarchical index
For very large systems, the index may need two levels — a master router and sub-indexes per major area — so the navigation layer itself stays lean.
context-index-financial.md ← billing, payments, period-close
context-index-operations.md ← inventory, logistics, shared
Review, commit to Git, and you're done
The team reviews the generated files — domain experts validate that the SP descriptions and warnings are accurate — then commits everything to the repository like any other code change.
CLAUDE.md
raw-manifest.txt ← script output
domain-map.md ← agent-generated clustering
context-index.md ← master router
context-index-financial.md
context-index-operations.md
modules/
context-sql-billing.md
context-sql-payments.md
context-sql-reporting.md
context-sql-period-close.md
context-sql-inventory.md
context-sql-shared.md
## System
- DB Engine: SQL Server 2019, instance PROD-SQL-01
- Default schema: dbo · Financial data: finance
- Naming: SPs prefixed usp_, tables UPPER_SNAKE_CASE
- Context files: /docs/context/
## Critical Rules
- [NEVER] Modify usp_cl_period_close without DBA sign-off
- [NEVER] DELETE on MOVEMENTS table directly — use usp_void_movement
- All writes to finance schema must use explicit transactions
- Log table: AUDIT_LOG — all financial SPs must insert here on completion
## Navigation
Read context-index.md to find which file to load for any task.
## Task Routing
| Task | Load this file |
|---|---|
| Invoice generation / billing | context-sql-billing.md |
| Payment processing | context-sql-payments.md |
| Financial reports / exports | context-sql-reporting.md |
| Month-end / period close | context-sql-period-close.md ⚠ |
| Inventory / stock movements | context-sql-inventory.md |
| Shared utilities / helpers | context-sql-shared.md |
## Cross-Domain Dependencies
- usp_bill_generate calls usp_pay_validate_credit (billing → payments)
- usp_cl_period_close calls usp_rep_snapshot_balances (close → reporting)
- All domains call usp_util_log_audit from context-sql-shared.md
## ⚠ WARNING — Review Required Before Touching
- usp_cl_period_close: irreversible fiscal period seal, no rollback
- usp_pay_batch_transfer: moves funds, triggers external bank API
- usp_inv_purge_old: bulk DELETE with no undo
// Generated by NemoClaw analysis agent · review before committing
### usp_bill_generate_invoice
Purpose: Creates a new invoice record from an approved order
Reads: ORDERS, ORDER_ITEMS, CUSTOMERS, TAX_CONFIG
Writes: INVOICES, INVOICE_LINES, AUDIT_LOG
Calls: usp_pay_validate_credit, usp_util_log_audit
Params: @order_id INT, @created_by NVARCHAR(50)
[Safe to modify with tests]
### usp_bill_cancel_invoice
Purpose: Voids an invoice and triggers credit note generation
Reads: INVOICES, PAYMENTS
Writes: INVOICES (status), CREDIT_NOTES, AUDIT_LOG
Calls: usp_pay_reverse_charge, usp_util_log_audit
[WARN] Cannot cancel if payment already settled — check PAYMENTS.status first
## Table Ownership (this domain writes to)
INVOICES, INVOICE_LINES, CREDIT_NOTES
## Execution Order (full billing cycle)
1. usp_bill_validate_order
2. usp_bill_generate_invoice
3. usp_bill_send_notification
4. usp_bill_update_order_status
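As a purely illustrative sketch of that order (the parameters of usp_bill_generate_invoice come from its Params line above; parameters for the other three procedures are assumed, and all values are invented):

```sql
-- Illustrative only: drives the billing cycle in its documented order.
EXEC usp_bill_validate_order      @order_id = 1042;
EXEC usp_bill_generate_invoice    @order_id = 1042, @created_by = N'jdoe';
EXEC usp_bill_send_notification   @order_id = 1042;
EXEC usp_bill_update_order_status @order_id = 1042;
```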
Once the files are generated and committed, this is what a normal development session looks like:
Tell me which context file to load, then load it.
Summarize what you know about this domain before we start.
Do
- After modifying any SP, update the relevant context-sql-[module].md as part of the task
- Treat context file updates as part of "done" — not optional cleanup
- Keep CLAUDE.md under 50 lines — it loads every session, every token counts
- Re-run NemoClaw on a domain when that domain has significant new SPs
Don't
- Put credentials, connection strings or server IPs in any context file
- Try to generate all domain files in one giant session — it degrades quality
- Mix domain analysis with actual code changes in the same session
- Treat context files as documentation for humans — they are behavior instructions for the AI
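The 50-line budget on CLAUDE.md is easy to enforce mechanically. A minimal pre-commit sketch, where the temp file and its sample contents stand in for the real docs/context/CLAUDE.md:

```shell
# Fail the commit if CLAUDE.md grows past its 50-line budget.
# (It loads every session, so every line costs tokens every time.)
CLAUDE_FILE=$(mktemp)   # stand-in; use docs/context/CLAUDE.md in the repo
printf '%s\n' "## System" "- DB Engine: SQL Server 2019" > "$CLAUDE_FILE"

lines=$(wc -l < "$CLAUDE_FILE")
if [ "$lines" -gt 50 ]; then
  echo "CLAUDE.md is $lines lines (budget: 50). Trim it."
  exit 1
fi
echo "CLAUDE.md OK ($lines lines)"
```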
The SQL files tell the agent what the code does. But this system was built on top of a specific tool that has its own logic, rules, and concepts. Without understanding how that tool works, the agent can read the code but can't reason about why it does what it does. That's where the second knowledge base comes in.
Generated from the 500+ SQL files. Tells the agent which stored procedures exist, what tables they read and write, what dependencies they have, and what order they must run in.
Reads: PERIODS, ACCOUNTS
Writes: LEDGER_ENTRIES, AUDIT_LOG
Calls: usp_validate_period
Generated from the tool's documentation library. Tells the agent what each table represents in the system, what business rules the code is enforcing, and what constraints can never be violated.
Central transaction register of the accounting module. Period must be open before any write.
LEDGER_ENTRIES is the central transaction table of the financial module.
Calls usp_validate_period first — required by system rules: period must be open before any ledger write.
[WARN] Writing during a closed period will corrupt the audit trail — the system has no rollback for this.
context-index.md ← unified navigation map
modules/
context-sql-billing.md
context-sql-payments.md
context-sql-reporting.md
context-sql-period-close.md
context-sql-inventory.md
context-sql-shared.md
modules/
system-financial.md ← financial module rules
system-inventory.md ← inventory module rules
system-reporting.md ← reporting module rules
system-glossary.md ← table/entity definitions
system-rules.md ← critical business constraints
Script generates manifest of all SQL files and PDF documents. NemoClaw proposes domain clustering for both simultaneously.
NemoClaw reads the tool's documentation by module. Extracts entity definitions, business rules, system constraints, and critical flows. Generates system-[domain].md files.
NemoClaw analyzes SQL files with system-[domain].md already loaded. Annotates each SP with real semantic meaning, not just technical facts. Generates context-sql-[domain].md files.
context-index.md routes both tracks. Team reviews, commits to Git. Knowledge base is live and complete from day one.
Agent reads SQL files without context. Produces technical annotations only — table names, call chains, execution order. A second pass adds system meaning afterward.
Writes: LEDGER_ENTRIES
Calls: usp_validate_period
// why? unknown at this stage
Agent reads SQL files already knowing the system. Produces technically complete AND semantically rich annotations in a single pass — no second pass needed.
Writes: LEDGER_ENTRIES ← central accounting register
Calls: usp_validate_period ← period must be open
[WARN] closed period write corrupts audit trail
NVIDIA NemoClaw —
the autonomous generation engine
Announced at GTC 2026, NemoClaw is OpenClaw with enterprise-grade security baked in. It runs an autonomous agent inside an OpenShell security sandbox — the agent that will analyze all 500+ SQL files and write the .md knowledge base for us.
"Every company needs an OpenClaw strategy" — Jensen Huang
NemoClaw is OpenClaw + OpenShell: a secure runtime that enforces policy-based guardrails at the kernel level. The agent runs inside an isolated sandbox with network deny-by-default, filesystem isolation, and all LLM calls proxied through the OpenShell gateway. API keys never touch the agent — they're injected at runtime.
Enterprise security
by design, not by policy
Security isn't a configuration option here — it's the fundamental architecture. OpenShell enforces isolation at the kernel level. Policies can't be bypassed by the agent, even if the agent is compromised.
Network isolation
Egress is denied by default. The agent cannot reach any external host unless explicitly listed in the policy YAML. For our use case: only the Claude API endpoint is allowed. Any other outbound attempt is blocked instantly and logged.
Filesystem isolation
The agent can only write to /sandbox and /tmp. Your SQL files are mounted read-only at /sql-input — the agent can read them but cannot modify, delete, or copy them anywhere outside the defined output path.
Process isolation
Landlock, seccomp, and network namespaces enforce at the kernel level. The agent cannot escalate privileges, fork unauthorized processes, or modify its own security constraints. This is the same model used in browser tab isolation.
Inference isolation
All LLM API calls route through the OpenShell gateway proxy. The agent never holds the API key — it's injected at runtime. This means even if the agent were compromised, it couldn't exfiltrate credentials.
Audit trail
Every LLM call, every network request (allowed or blocked), every file write is logged. The audit_log: true configuration captures what was sent to the model and what came back — full compliance traceability.
GitOps policy management
Security policies live in a YAML file that can be version-controlled and reviewed via pull request. Hot-reloadable — changes take effect without restarting the sandbox. This means policy changes have the same review process as code changes.
network:
  default: deny                 # deny ALL outbound by default
  allow:
    - host: api.anthropic.com   # only Claude API
      port: 443
      proto: https
filesystem:
  mounts:
    - host: /path/to/sql/files
      sandbox: /sql-input
      readonly: true            # agent can read, never modify
    - host: /path/to/md-output
      sandbox: /md-output
      readonly: false
inference:
  provider: anthropic
  model: claude-sonnet-4-6
  audit_log: true               # every LLM call logged
From zero to operational
in 6 steps
The full implementation path — from installing NemoClaw to having Claude Code use the generated knowledge base in daily development.
Install NemoClaw
One command installs the CLI, OpenShell runtime, Docker sandbox, and the onboarding wizard. Requires Ubuntu 22.04+, Node.js 20+, Docker. Sandbox image is ~2.4 GB.
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
# Run the interactive setup wizard
nemoclaw onboard
# Set inference provider key
export ANTHROPIC_API_KEY="sk-ant-..."
Configure security policy
Write the policy YAML and apply it to the sandbox. Network deny-by-default, SQL files mounted read-only, audit logging enabled.
nemoclaw policy apply sql-agent policy.yaml
# Verify sandbox is configured correctly
nemoclaw sql-agent status
Write agent identity (SOUL.md)
Define who the agent is and what it must/must not do. This is the agent's persistent identity — a system prompt that survives across runs.
You are a SQL codebase analyst. Read SPs, generate docs.
You do not execute SQL. You do not connect to databases.
## Rules
- Process files in batches of max 25 per domain
- Flag DELETE/DROP/TRUNCATE with [WARN]
- Never include credentials, IPs, or data values in output
- Write all output to /md-output/
Run the analysis agent
Connect to the sandbox and give the agent its task. For 500+ files, expect 2–4 hours. The agent runs autonomously — you can disconnect and reconnect.
# Paste this task to the agent:
Analyze all SQL files in /sql-input/.
Step 1: Group files by business domain → /md-output/domain-map.md
Step 2: For each domain, generate context-sql-[domain].md
Step 3: Generate context-index.md with routing + warnings
Step 4: Generate CLAUDE.md with global conventions
Process one domain fully before starting the next.
openshell gateway status # view security events
Extract and commit to Git
Copy the generated .md files from the mounted output directory to your project repository. Review, then commit like any other code.
# Review and commit
git add docs/context/
git commit -m "feat: add AI-generated SQL context files"
CLAUDE.md ← always loaded by Claude Code
context-index.md
domain-map.md
context-sql-billing.md
context-sql-payments.md
context-sql-reporting.md
context-sql-period-close.md
Daily use with Claude Code
From this point, NemoClaw is no longer needed day-to-day. Developers use Claude Code with the generated .md files as source of truth. The standard workflow:
Read context-index.md. Today's task: [describe task].
Tell me which context files to load, then load them.
Summarize what you know before we start.
Add any new SPs to the inventory.
Flag any new [WARN] conditions you identified.
What this costs vs.
what it returns
The investment is small. The return starts in week one and compounds every time a developer doesn't have to rediscover the system from scratch.
- Install NemoClaw
- Configure security policy
- Write agent identity
- Agent runs autonomously
- Generates all .md files
- Review + commit to Git
- 2 devs using Claude Code
- Track exploration time saved
- Refine context files
- Roll out to full team
- Knowledge base compounds
- Onboarding drastically faster
- +A structured, navigable map of the system — generated automatically
- +Context-building overhead cut from hours to minutes per task
- +New team member ramp-up time significantly reduced
- +SP dependencies visible before any change is made
- →NemoClaw just launched — early adoption builds institutional advantage
- →Pilot cost is minimal — proves value before full commitment
- →Knowledge base compounds in value the longer it's maintained
- →Team expertise is preserved in a durable, versioned format