
Git Analyzer

Analyzes git history for extraction insights: hot files, stable files, contributor patterns, module evolution, and coupling between files.

Important: Requires a full git clone (not --depth 1). Shallow clones don't have full history.

Usage

CLI

python git_analyzer.py <repo_path>
python git_analyzer.py <repo_path> --output history.json
python git_analyzer.py <repo_path> --days 180
python git_analyzer.py <repo_path> --json
python git_analyzer.py <repo_path> --output history.md

MCP

Note: git_analyzer has no MCP tool. Use the CLI.

Example Output

{
  "repository": "/path/to/repo",
  "analysis_days": 365,
  "since_date": "2024-03-26",
  "until_date": "2025-03-26",
  "summary": {
    "total_commits": 487,
    "files_changed": 156,
    "contributors": 12,
    "avg_commits_per_day": 1.3
  },
  "hot_files": [
    {
      "path": "src/core/engine.py",
      "changes": 47,
      "commits": 38,
      "last_changed": "2025-03-20",
      "stability": "volatile"
    },
    {
      "path": "src/api/handler.py",
      "changes": 34,
      "commits": 28,
      "last_changed": "2025-03-15",
      "stability": "volatile"
    },
    {
      "path": "README.md",
      "changes": 12,
      "commits": 11,
      "last_changed": "2025-03-10",
      "stability": "stable"
    },
    {
      "path": "src/models.py",
      "changes": 8,
      "commits": 7,
      "last_changed": "2025-02-28",
      "stability": "stable"
    },
    {
      "path": "package.json",
      "changes": 5,
      "commits": 4,
      "last_changed": "2025-03-01",
      "stability": "stable"
    }
  ],
  "stable_files": [
    {
      "path": "src/db/models.py",
      "changes": 1,
      "commits": 1,
      "last_changed": "2024-06-15",
      "stability": "stable"
    },
    {
      "path": "src/utils/helpers.py",
      "changes": 2,
      "commits": 2,
      "last_changed": "2024-08-20",
      "stability": "stable"
    }
  ],
  "contributors": [
    {
      "name": "Alice Developer",
      "email": "alice@example.com",
      "commits": 156,
      "first_commit": "2023-12-01",
      "last_commit": "2025-03-25"
    },
    {
      "name": "Bob Engineer",
      "email": "bob@example.com",
      "commits": 127,
      "first_commit": "2024-01-15",
      "last_commit": "2025-03-22"
    },
    {
      "name": "Charlie Team",
      "email": "charlie@example.com",
      "commits": 89,
      "first_commit": "2024-02-01",
      "last_commit": "2025-03-20"
    }
  ],
  "coupling": [
    {
      "files": ["src/core/engine.py", "src/api/handler.py"],
      "times_changed_together": 18,
      "combined_changes": 81,
      "coupling_strength": "high"
    },
    {
      "files": ["src/api/handler.py", "src/api/router.py"],
      "times_changed_together": 12,
      "combined_changes": 54,
      "coupling_strength": "medium"
    },
    {
      "files": ["src/db/models.py", "src/core/engine.py"],
      "times_changed_together": 4,
      "combined_changes": 18,
      "coupling_strength": "low"
    }
  ],
  "insights": {
    "most_volatile_file": {
      "path": "src/core/engine.py",
      "reason": "47 changes in 365 days, frequently modified"
    },
    "most_stable_file": {
      "path": "src/db/models.py",
      "reason": "1 change in 365 days, stable interface"
    },
    "highest_coupled_pair": {
      "files": ["src/core/engine.py", "src/api/handler.py"],
      "reason": "Changed together 18 times, suggests tight coupling"
    },
    "best_extraction_candidate": {
      "path": "src/db/models.py",
      "reason": "Stable (few changes), low coupling, safe to extract"
    }
  }
}
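A report in this shape is easy to post-process. The sketch below assumes the field names shown in the example output (`hot_files`, `path`, `changes`, `stability`). The helper `volatile_hot_files` is a hypothetical name for illustration, not part of git_analyzer.

```python
import json


def volatile_hot_files(report):
    """Return (path, changes) pairs for hot files labeled volatile, busiest first."""
    rows = [
        (f["path"], f["changes"])
        for f in report.get("hot_files", [])
        if f.get("stability") == "volatile"
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Trimmed sample in the shape of the example output.
# With a real report, load the file instead: json.load(open("history.json"))
sample = json.loads("""
{
  "hot_files": [
    {"path": "src/api/handler.py", "changes": 34, "stability": "volatile"},
    {"path": "README.md", "changes": 12, "stability": "stable"},
    {"path": "src/core/engine.py", "changes": 47, "stability": "volatile"}
  ]
}
""")

ranked = volatile_hot_files(sample)
for path, changes in ranked:
    print(f"{changes:>4}  {path}")
```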

Interpretation Guide

Volatile Files: Changed frequently. Often indicate active development areas. Good targets for refactoring but risky to extract without understanding dependencies.

Stable Files: Changed rarely. Good extraction candidates if they have low coupling. Safe to move or refactor.

Coupling Strength: Files that repeatedly change in the same commits are likely tightly coupled. High coupling means the files are either logically related (keep them together in one component) or entangled in a way that should be decoupled before extraction.
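To make the idea of co-change coupling concrete, here is a minimal sketch that counts how often each pair of files appears in the same commit. It illustrates the general technique and is not necessarily how git_analyzer computes `times_changed_together`. The function name and the input format (one list of paths per commit) are assumptions.

```python
from collections import Counter
from itertools import combinations


def count_co_changes(commits):
    """Count, for every pair of files, how many commits touched both.

    commits: iterable of iterables of file paths, one per commit (assumed format).
    """
    pairs = Counter()
    for files in commits:
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


commits = [
    ["src/core/engine.py", "src/api/handler.py"],
    ["src/core/engine.py", "src/api/handler.py", "README.md"],
    ["src/api/handler.py", "src/api/router.py"],
]
pairs = count_co_changes(commits)
for (a, b), n in pairs.most_common():
    print(n, a, b)
```

Note that a very large commit (a mass rename, for instance) inflates every pair it touches, so a real analysis may want to skip commits above some file-count limit.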

Insights: Automatically generated suggestions for:
- Most volatile file (watch for bugs)
- Most stable file (low maintenance risk)
- Highest coupled pair (consider for component boundary)
- Best extraction candidate (stable + low coupling)
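The "stable + low coupling" rule can also be applied by hand to a report. The sketch below assumes the `stable_files` and `coupling` fields from the example output and treats any file in a "medium" or "high" pair as risky. That threshold and the helper name `extraction_candidates` are illustrative assumptions, not the tool's documented logic.

```python
def extraction_candidates(report):
    """List stable files that do not appear in any medium/high coupling pair."""
    risky = set()
    for pair in report.get("coupling", []):
        if pair["coupling_strength"] in ("medium", "high"):
            risky.update(pair["files"])
    return [
        f["path"]
        for f in report.get("stable_files", [])
        if f["path"] not in risky
    ]


# Small hand-written report using the same field names as the example output.
report = {
    "stable_files": [
        {"path": "src/db/models.py", "changes": 1},
        {"path": "src/api/router.py", "changes": 2},
    ],
    "coupling": [
        {"files": ["src/api/handler.py", "src/api/router.py"],
         "coupling_strength": "medium"},
        {"files": ["src/db/models.py", "src/core/engine.py"],
         "coupling_strength": "low"},
    ],
}

candidates = extraction_candidates(report)
print(candidates)
```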

Options

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| repo_path | positional | required | Path to git repository |
| --output, -o | string | stdout | Output file path |
| --days, -d | int | 365 | Days of history to analyze |
| --json | flag | false | Output as JSON |
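For readers wrapping the tool in their own scripts, the options above map naturally onto an `argparse` definition. This is a sketch of a parser consistent with the table, not the actual source of git_analyzer.py. Using `None` to stand for "write to stdout" is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="git_analyzer.py")
    parser.add_argument("repo_path", help="Path to git repository")
    parser.add_argument("--output", "-o", default=None,
                        help="Output file path (None here means stdout)")
    parser.add_argument("--days", "-d", type=int, default=365,
                        help="Days of history to analyze")
    parser.add_argument("--json", action="store_true",
                        help="Output as JSON")
    return parser


args = build_parser().parse_args(["/path/to/repo", "-d", "180", "--json"])
print(args.repo_path, args.days, args.json, args.output)
```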

Requirements

This tool uses git log directly. Ensure:
- You have git installed
- The directory is a valid git repository
- The clone is full, not shallow (no --depth 1)
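These checks can be scripted before running the analyzer. The sketch below is a hypothetical `preflight` helper, not part of the tool. It relies on the fact that git records shallow clones in a `.git/shallow` file, and it assumes `.git` is a directory, which does not hold for linked worktrees or submodules. Inside a repository, `git rev-parse --is-shallow-repository` gives the same answer on recent git versions.

```python
import shutil
from pathlib import Path


def preflight(repo_path):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if shutil.which("git") is None:
        problems.append("git executable not found on PATH")
    git_dir = Path(repo_path) / ".git"
    if not git_dir.is_dir():
        # Assumption: a normal clone; worktrees/submodules use a .git *file*.
        problems.append(f"{repo_path} has no .git directory")
    elif (git_dir / "shallow").exists():
        problems.append("clone is shallow; try `git fetch --unshallow`")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("warning:", problem)
```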