Performing comprehensive code analysis requires careful orchestration of multiple phases, whether you're running locally on your machine, remotely on a server, or as part of a CI/CD pipeline. Each phase plays a critical role in ensuring code quality, security, and maintainability. Let's explore how a modern workflow manager handles this complexity across different execution environments.
The 5 Phases of Code Analysis
A robust code analysis workflow consists of 5 distinct phases, each with its own time allocation and responsibilities:
| Phase | Time Allocation | Purpose |
|---|---|---|
| Prepare | 2% | Environment setup and file detection |
| Agent (AI) | 25% | AI-powered code review |
| Code Analysis | 30% | Static analysis and security scanning |
| Upload Results | 2% | Publishing analysis artifacts |
| Cleanup Artifacts | 41% | Resource cleanup and finalization |

Phase 1: Prepare
The preparation phase sets up the entire analysis pipeline. Despite taking only 2% of workflow time, it determines which tools run, how work parallelizes, and what files get scanned.
| Step | Action | Purpose |
|---|---|---|
| Environment Init | Detect OS, shell, available tools | Adapt to execution context |
| Dependency Check | Verify analyzers, scanners, AI models | Ensure required tools available |
| Language Detection | Map file extensions to languages | Activate correct analyzers |
| File Discovery | Scan with include/exclude patterns | Build analysis file list |
| Smart Filtering | Remove binary, oversized, deleted files | Optimize analysis scope |
| Plan Construction | Calculate parallelism, estimate duration | Enable efficient execution |
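As a concrete illustration, the File Discovery and Smart Filtering steps above might look like the following shell sketch; the 1 MB size cap and the output file name are assumptions, not the tool's actual defaults.

```bash
# Hypothetical file-discovery sketch: enumerate tracked files, then drop
# deleted, oversized, and binary files before handing the list to analyzers.
MAX_SIZE=1048576  # assumed 1 MB cap; the real threshold may differ

git ls-files |  # include/exclude patterns elided for brevity
while IFS= read -r f; do
  [ -f "$f" ] || continue                                   # skip deleted files
  [ "$(wc -c < "$f")" -le "$MAX_SIZE" ] || continue         # skip oversized files
  file --mime "$f" | grep -q 'charset=binary' && continue   # skip binaries
  # Language detection would map extensions (.ts, .js, ...) to analyzers here.
  printf '%s\n' "$f"
done > analysis-file-list.txt
```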
Phase 2: Agent (AI)
The AI Agent phase performs autonomous code review using Large Language Models. Taking 25% of workflow time, it acts like a human reviewer that understands context, explores the codebase, and applies your team's specific standards.
| Step | Action | Output File |
|---|---|---|
| Setup & Checkout | Clone repository, configure environment | - |
| Download Tools | Fetch autofind CLI and ccrcli | - |
| Load Context | Read custom instructions, build metadata | repo-custom-instructions.json |
| Retrieve Changes | Get PR diff from API | diff-file.diff |
| Agentic Review | Claude Sonnet 4.5 explores codebase | results-agent.json |
| Security Audit | GPT-5 scans for vulnerabilities | results-security-detector.json |
| Tag & Upload | Label results by source, create artifacts | results-agent.zip |
Unlike static analyzers that follow rigid rules, the AI agent actively investigates. It reads your custom instructions file to understand project-specific patterns (e.g., "No comments," "Use SafeAwait," "Decorator architecture"). Then it creates a review plan, uses tools to explore related files, validates changes against your standards, and performs a dedicated security pass. The output is tagged JSON artifacts ready for review comments.
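The workflow doesn't dictate a schema for the custom instructions file, but a minimal hypothetical repo-custom-instructions.json capturing the rules above might look like this:

```bash
# Hypothetical schema; field names are illustrative, not the actual format.
cat > repo-custom-instructions.json <<'EOF'
{
  "instructions": [
    "No comments",
    "Use SafeAwait",
    "Decorator architecture"
  ]
}
EOF
```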
Running Locally with Open Source Models: For complete data privacy, use local models like Llama 3, Mistral/Mixtral, Qwen, or DeepSeek. Host them with Ollama, LM Studio, or LocalAI on your GPU, then configure your agent to point to localhost:11434 instead of external APIs. This ensures zero data leaves your machine.
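For example, with Ollama serving Llama 3, the agent's backend becomes a plain HTTP call to the local server (the prompt below is a placeholder):

```bash
# Pull a local model, then query Ollama's generate endpoint on localhost:11434.
ollama pull llama3
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Review this diff against our coding standards: ...",
  "stream": false
}'
```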
Phase 3: Code Analysis
The Code Analysis phase runs static analysis to find security vulnerabilities and logic errors. Taking 30% of workflow time, it uses CodeQL to scan the codebase with specific rules while filtering out noise.
| Step | Action | Output File |
|---|---|---|
| Setup & Checkout | Clone repository at PR ref | - |
| Initialize CodeQL | Download CLI, configure queries | - |
| Run Analysis | Execute specific query rules | results.sarif |
| Convert SARIF | Transform to custom JSON format | codeql-javascript-typescript-ccr.json |
| Upload Artifact | Store results for downstream use | results-codeql-javascript-typescript.zip |
CodeQL runs targeted checks like js/unreachable-statement (dead code), js/useless-comparison-test (always true/false), js/inconsistent-loop-direction (infinite loops), and js/use-of-returnless-function (void return usage). The configuration disables default queries and applies filters to exclude noisy warnings.
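A sketch of that setup with the CodeQL CLI, assuming a custom query suite is used to select only the targeted rules (the database path and suite contents are illustrative):

```bash
# Restrict analysis to the four targeted rules via a custom query suite.
cat > targeted.qls <<'EOF'
- qlpack: codeql/javascript-queries
- include:
    id:
      - js/unreachable-statement
      - js/useless-comparison-test
      - js/inconsistent-loop-direction
      - js/use-of-returnless-function
EOF

codeql database create codeql-db --language=javascript --source-root=.
codeql database analyze codeql-db targeted.qls \
  --format=sarif-latest --output=results.sarif
```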
Results flow through two formats: First, CodeQL outputs SARIF (Static Analysis Results Interchange Format), an industry-standard JSON format used by security platforms and code quality tools. Then a custom script converts it to CCR JSON, a simplified format that strips complexity and keeps only essential info (file, line number, message) for AI consumption. The final artifact contains both formats.
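Since SARIF's structure is standardized, the conversion step can be approximated with jq (the CCR field names are assumptions based on the description above):

```bash
# Flatten SARIF results into minimal {file, line, message} records.
jq '[.runs[].results[] | {
      file:    .locations[0].physicalLocation.artifactLocation.uri,
      line:    .locations[0].physicalLocation.region.startLine,
      message: .message.text
    }]' results.sarif > codeql-javascript-typescript-ccr.json
```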
Phase 4: Upload Results
The Upload Results phase aggregates findings from all analyzers and posts them to the PR. Taking only 2% of workflow time, it handles deduplication and delivers consolidated feedback.
| Step | Action | Output |
|---|---|---|
| Initialize Secrets | Parse payload, set secure variables | Configured environment |
| Download Tools | Fetch ccrcli for API communication | CLI ready |
| Prepare Payload | Clean input data for callback | payload.json |
| Download Artifacts | Retrieve agent and CodeQL results | All findings collected |
| Run Deduplication | Compare and merge duplicate findings | Unique comments only |
| Execute Callback | Send results to PR via ccrcli | Comments posted |
The phase downloads artifacts from both the AI Agent (results-agent) and CodeQL (results-codeql-*), then uses the autofind tool to detect and remove duplicate findings. If the AI already flagged an issue that CodeQL also found, only one comment appears. Finally, ccrcli sends the deduplicated results back to the Pull Request as review comments.
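The exact matching logic belongs to autofind, but a naive version of the merge-and-deduplicate step, assuming both files use the flattened record shape from Phase 3, could be:

```bash
# Naive dedup sketch: merge both finding lists, keep one entry per file:line.
jq -s '[.[0][], .[1][]] | unique_by("\(.file):\(.line)")' \
  results-agent.json codeql-javascript-typescript-ccr.json \
  > deduplicated-findings.json
```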
Phase 5: Cleanup Artifacts
The Cleanup phase removes temporary artifacts to free storage. Taking 41% of workflow time (mostly waiting), it uses the platform's API to delete all artifacts generated during the analysis.
| Step | Action | Artifacts Deleted |
|---|---|---|
| List Artifacts | Query platform API for run artifacts | - |
| Delete Loop | Iterate and delete each artifact | results-agent.zip, results-security-detector.zip, results-codeql-javascript-typescript.zip |
| Confirm Cleanup | Log completion status | All temporary files removed |
The cleanup script uses the platform's API to list all artifacts for the workflow run, pipes the IDs through xargs, and deletes each one via REST API calls. This prevents storage bloat from accumulating analysis results across hundreds of PRs. The phase runs last and takes the longest due to API rate limits and sequential deletion, but requires minimal compute resources.
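Assuming GitHub Actions as the platform, that list-pipe-delete pattern maps directly onto the gh CLI and the Actions REST API:

```bash
# List artifact IDs for the current run, then delete each one sequentially.
gh api "repos/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID/artifacts" \
  --jq '.artifacts[].id' |
xargs -I {} gh api -X DELETE "repos/$GITHUB_REPOSITORY/actions/artifacts/{}"
```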
Running Anywhere
The beauty of this architecture is its flexibility. The same workflow can execute:
Locally on your machine:

```bash
code-analyzer analyze --mode=local --path=./src
```

On a remote server:

```bash
code-analyzer analyze --mode=remote --endpoint=https://api.example.com
```

In your CI/CD pipeline:

```yaml
- name: Run Code Analysis
  run: code-analyzer analyze --mode=cicd --pr=${{ github.event.number }}
```
The preparation phase adapts to each context, discovering files, detecting languages, and configuring analyzers appropriately.
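One simple way to implement that adaptation is to branch on environment variables that CI systems conventionally set (a sketch; ANALYZER_ENDPOINT is a hypothetical configuration variable):

```bash
# Pick an execution mode based on the environment; CI=true is exported by
# most CI systems, including GitHub Actions.
if [ "$CI" = "true" ]; then
  MODE=cicd
elif [ -n "$ANALYZER_ENDPOINT" ]; then  # hypothetical remote-server config
  MODE=remote
else
  MODE=local
fi
code-analyzer analyze --mode="$MODE"
```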