Mega Tools: Web Directory Walker, Batch Downloader, and URL Filter

A suite of command-line tools for recursively walking web directories, filtering URLs, efficiently downloading files in bulk, and displaying progress.

Overview

This project provides four complementary tools:

  1. mega-walk: Recursively traverses HTTP directories to discover files and subdirectories
  2. mega-fetch: Downloads files from a list of URLs with support for resumable downloads, concurrent workers, and structured output
  3. mega-filter: Filters URLs based on configurable accept/reject patterns using regular expressions
  4. mega-progress: Displays a real-time progress bar for download operations when piped from mega-fetch

Installation

Prerequisites

  • Go 1.16 or higher

Building from source

# Clone the repository
git clone <repository-url>
cd <repository-directory>

# Build all tools
make

Alternatively, build manually:

go build -o mega-walk ./cmd/mega-walk
go build -o mega-fetch ./cmd/mega-fetch
go build -o mega-filter ./cmd/mega-filter
go build -o mega-progress ./cmd/mega-progress

Usage

mega-walk: Web Directory Walker

Recursively traverses web directories to discover files and subdirectories.

./mega-walk [-no-head] [-c CONCURRENCY] <base_url>

Options:

  • -no-head: Disable HEAD requests and omit content-length from output
  • -c CONCURRENCY: Maximum number of concurrent requests (default: 10)

Example:

# Basic usage with content-length output
./mega-walk http://ftp.openbsd.org/pub/OpenBSD/7.8/

# Without HEAD requests (faster)
./mega-walk -no-head http://example.com/dir/

# With higher concurrency
./mega-walk -c 20 http://example.com/dir/

Output Format:

  • With -no-head: One URL per line
  • Without -no-head: URL,content-length format

Features:

  • Recursively follows links within the same domain
  • Skips query parameters and fragments
  • Respects directory boundaries
  • Handles common web server directory listings
  • Concurrent request processing

mega-filter: URL Filter Tool

Filters URLs from a list based on configurable accept and reject patterns using regular expressions.

./mega-filter <filter.yaml|filter.json> <input.txt> [output.txt]

Arguments:

  • filter.yaml|filter.json: Configuration file with accept/reject patterns (YAML or JSON format)
  • input.txt: Input file containing URLs (one per line), use - for stdin
  • output.txt: Optional output file (if omitted, writes to stdout)

Filter File Format:

The filter configuration file can be in YAML or JSON format and contains two optional arrays of regular expression patterns:

YAML Example (filter-example.yaml):

accept:
  - "^.*\\.img\\.xz$"
reject: []

JSON Equivalent:

{
  "accept": [
    "^.*\\.img\\.xz$"
  ],
  "reject": []
}

Filtering Logic:

  1. If no accept patterns are specified, all URLs are initially accepted
  2. If accept patterns exist, a URL must match at least one pattern to be considered
  3. If a URL matches any reject pattern, it is excluded regardless of accept matches
  4. URLs are processed line by line from the input file

Example Usage:

# Filter URLs to only include .img.xz files
./mega-filter filter-example.yaml urls.txt filtered-urls.txt

# Filter from stdin and output to stdout
cat urls.txt | ./mega-filter filter-example.yaml - | head -20

mega-fetch: Batch File Downloader

Downloads files from a list of URLs, with resumable transfers, smart caching, and concurrent workers.

./mega-fetch [OPTIONS] <url_file> <output_dir>

Options:

  • -w WORKERS: Number of concurrent workers (default: 1)
  • -q: Quiet mode (suppress error output)

Use - as the url_file argument to read from stdin.

Example:

# Basic download with 1 worker
./mega-fetch urls.txt ./downloads/

# Download with 5 concurrent workers
./mega-fetch -w 5 urls.txt ./downloads/

# Download from stdin with quiet mode
cat urls.txt | ./mega-fetch -w 3 -q - ./downloads/

Output Format: Each processed file produces one line in the format local_path,bytes_downloaded,status, where status is one of: new, resumed, skipped, or error.

Special Total Line: Before processing begins, mega-fetch emits a special line, total,<count>, where <count> is the total number of files to process. mega-progress uses this line to display accurate progress.

Features:

  • Resumable downloads: Automatically resumes interrupted downloads
  • Smart caching: Skips files that already exist with matching size
  • Concurrent downloads: Multiple workers for faster downloads
  • Structured output: Produces machine-readable output for further processing
  • Graceful shutdown: Handles interrupt signals properly

mega-progress: Progress Display Tool

Displays a real-time progress bar for download operations when piped from mega-fetch.

./mega-progress

This tool reads from stdin and expects the output format from mega-fetch. It automatically displays:

  • Progress bar with percentage completion
  • File count (New, Resumed, Skipped, Total)
  • Total bytes transferred
  • Download speed
  • Color-coded output

Example Usage:

# Full pipeline with progress display
./mega-walk http://example.com/dir/ | \
  ./mega-fetch -w 5 - ./downloads/ | \
  ./mega-progress

# Or with filtering
./mega-walk http://example.com/dir/ | \
  ./mega-filter filter-example.yaml - | \
  ./mega-fetch -w 5 - ./downloads/ | \
  ./mega-progress

Features:

  • Real-time progress updates
  • Color-coded status display
  • Terminal-aware output (cleans up display lines)
  • Speed calculation and human-readable sizes
  • Graceful shutdown handling
  • Spinner display before total count is known

Advanced Usage

Pipeline Examples

Combine all tools for a complete workflow:

Basic download workflow:

# Walk a directory and download all files with progress
./mega-walk http://ftp.openbsd.org/pub/OpenBSD/7.8/ | \
  ./mega-fetch -w 10 - ./downloads/ | \
  ./mega-progress

Advanced workflow with filtering:

# Walk a directory, filter for specific files, then download with progress
./mega-walk http://example.com/media/ > all-urls.txt

# Filter to only include .img.xz files using the example filter
./mega-filter filter-example.yaml all-urls.txt media-urls.txt

# Download filtered media files with progress tracking
cat media-urls.txt | ./mega-fetch -w 8 - ./downloads/ | ./mega-progress

Complete one-liner:

./mega-walk http://example.com/data/ | \
  ./mega-filter filter-example.yaml - | \
  ./mega-fetch -w 5 - ./downloads/ | \
  ./mega-progress

Saving results to file while showing progress:

# Use tee to save results and show progress
./mega-walk http://example.com/ | \
  ./mega-fetch -w 5 - ./downloads/ | \
  tee download-results.txt | \
  ./mega-progress

Makefile Targets

The included Makefile provides convenient shortcuts:

# Build all tools (including mega-progress)
make

# Clean build artifacts
make clean

# Run the test workflow
make test

The make test command will:

  1. Build all tools
  2. Test mega-walk with OpenBSD's FTP server
  3. Test mega-fetch with the discovered URLs
  4. Test mega-filter with a sample URL list and the provided filter-example.yaml

How It Works

mega-walk Internals

  1. Starts at the base URL and fetches the HTML content
  2. Parses HTML to extract all <a href="..."> links
  3. Filters links to stay within the same domain and base path
  4. Recursively processes directories using concurrent goroutines
  5. Outputs discovered URLs (with optional content-length) to stdout

mega-fetch Internals

  1. Reads URLs from a file or stdin
  2. Outputs total,<count> line with total file count
  3. For each URL:
    • Checks if file already exists locally
    • Performs HEAD request to check file size
    • Resumes download if partial file exists and server supports it
    • Downloads fresh copy if needed
    • Outputs local_path,bytes,status for each processed file
  4. Uses worker goroutines for concurrent downloads

mega-progress Internals

  1. Reads mega-fetch's structured output from stdin
  2. Parses the total,<count> line to know total file count
  3. Updates counters for new, resumed, and skipped files
  4. Calculates transfer speed and progress percentage
  5. Displays real-time progress bar with ANSI escape codes
  6. Shows spinner animation before total count is known
  7. Cleans up terminal output on completion

File Structure

.
├── cmd/
│   ├── mega-walk/
│   │   └── main.go          # Directory walking tool
│   ├── mega-fetch/
│   │   └── main.go          # Batch downloading tool
│   ├── mega-filter/
│   │   └── main.go          # URL filtering tool
│   └── mega-progress/
│       └── main.go          # Progress display tool
├── konsoru/
│   ├── bar/                 # Progress bar library
│   ├── color/               # Color utilities
│   ├── cursor/              # Cursor control
│   ├── style/               # Terminal styles
│   └── util/                # Utility functions
├── signal/
│   └── signal.go            # Graceful shutdown handling
├── vendor/                  # External dependencies
├── Makefile                 # Build and test automation
├── filter-example.yaml      # Example filter configuration
└── README.md               # This file

Dependencies

  • golang.org/x/net/html: HTML parsing for mega-walk
  • golang.org/x/term: Terminal handling for progress display
  • gopkg.in/yaml.v3: YAML parsing for mega-filter

All dependencies are vendored in the vendor/ directory.

Limitations

  • mega-walk: Only works with web servers that provide directory listings in HTML format
  • mega-fetch: Resume functionality depends on server support for Range headers
  • mega-filter: Requires basic understanding of regular expressions for pattern creation
  • All tools are designed for public HTTP/HTTPS servers, not authenticated sites

Troubleshooting

Common Issues

  1. "Error: HEAD request failed"

    • Some servers block HEAD requests
    • Use -no-head flag with mega-walk to skip HEAD requests
    • mega-fetch will fall back to GET requests automatically
  2. Progress bar not displaying

    • Ensure terminal supports ANSI escape codes
    • Pipe mega-fetch output through mega-progress
  3. Downloads stopping prematurely

    • Check network connectivity
    • Reduce number of workers with -w flag
    • Some servers may rate-limit concurrent connections
  4. No output from mega-progress

    • Ensure mega-fetch is producing structured output
    • Check that mega-fetch is not using quiet mode (-q) if you want to see errors

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

License

[Specify your license here]

Acknowledgments

  • Built with Go's excellent standard library
  • Inspired by traditional Unix tools like wget and curl
  • Uses the konsoru library for terminal UI components