
CBZ Generator

Overview

CBZ Generator is a Python tool that creates realistic comic book archive (CBZ) files with proper ComicInfo.xml metadata for testing comic book management software. It generates complete comic series with authentic publisher/character relationships, realistic publication schedules, and proper file organization. On modern hardware it can generate 100,000+ comics in an hour.

Features

  • Realistic Comic Metadata: Generates proper ComicInfo.xml with series names, issue numbers, publication dates, publishers, writers, and characters
  • Authentic Publishing Schedule: Respects monthly publication cycles for regular series and annual schedules for special issues
  • Multiple Format Support: Main Series, Limited Series, One-Shots, TPBs, Annuals, Director's Cuts, and more
  • Continuation Support: Can continue existing series by scanning for the highest issue number
  • Publisher Ecosystem: Creates realistic relationships between publishers, writers, and character rosters
  • Organized File Structure: Automatically organizes files by Publisher/Character/Format/Series hierarchy
  • Configurable Data: JSON-based configuration for publishers, writers, characters, and series names
  • Year Cap Protection: Built-in safeguards prevent generation beyond 2025

Installation

Requirements

  • Python 3.7+
  • Pillow library for image generation

Install Dependencies

pip install pillow

Clone Repository

git clone https://gitea.baerentsen.space/FrederikBaerentsen/CBZGenerator
cd CBZGenerator

Usage

Basic Usage

# Generate 10 comic series
python generate_cbz.py 10

# Generate 50 series with custom output directory
python generate_cbz.py 50 --out /path/to/comics

# Use custom data file
python generate_cbz.py 25 --data my_comic_data.json

# Set random seed for reproducible results
python generate_cbz.py 15 --seed 12345

Command Line Options

Option    Description                           Default
count     Number of comic series to generate    Required
--out     Output base directory                 output_cbz
--data    Path to comic data JSON file          ./comicdata.json
--seed    Random seed for reproducibility       None

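As a rough illustration of how the options above could be declared, here is a minimal argparse sketch. The function names build_parser and make_rng and the help strings are hypothetical, and the real generate_cbz.py may wire its arguments differently.

```python
import argparse
import random


def build_parser() -> argparse.ArgumentParser:
    # Option names and defaults follow the table above; everything else is assumed.
    parser = argparse.ArgumentParser(description="Generate test CBZ files")
    parser.add_argument("count", type=int, help="number of comic series to generate")
    parser.add_argument("--out", default="output_cbz", help="output base directory")
    parser.add_argument("--data", default="./comicdata.json", help="path to comic data JSON file")
    parser.add_argument("--seed", type=int, default=None, help="random seed for reproducibility")
    return parser


def make_rng(seed):
    # A dedicated Random instance makes runs with the same seed repeatable.
    return random.Random(seed)


# Equivalent to: python generate_cbz.py 15 --seed 12345
args = build_parser().parse_args(["15", "--seed", "12345"])
rng = make_rng(args.seed)
```

Passing the same --seed twice should then produce the same sequence of random choices, which is what makes a generated library reproducible.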
Configuration

The comicdata.json file defines the comic universe:

Publishers Structure

Each publisher includes:

  • name: Publisher name (e.g., "Ironclad Comics")
  • writers: Array of writer names associated with this publisher
  • characters: Array of character names published by this publisher
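The sketch below shows one way such a publisher entry could be loaded and checked. The top-level key "publishers" and the function load_comic_data are assumptions made for illustration; only the three field names come from the list above.

```python
import json

# Field names taken from the list above.
REQUIRED_PUBLISHER_FIELDS = ("name", "writers", "characters")


def load_comic_data(text: str) -> dict:
    """Parse comic data JSON and check each publisher entry."""
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    # "publishers" as the top-level key is an assumption, not confirmed by the tool.
    for index, publisher in enumerate(data.get("publishers", [])):
        missing = [f for f in REQUIRED_PUBLISHER_FIELDS if f not in publisher]
        if missing:
            raise ValueError(f"publisher #{index} is missing fields: {missing}")
    return data


sample = """
{
  "publishers": [
    {
      "name": "Ironclad Comics",
      "writers": ["Marcus Hale"],
      "characters": ["Steel Sentinel"]
    }
  ]
}
"""
data = load_comic_data(sample)
```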

Works Array

Defines base series names that can be combined with various patterns like:

  • {work} → "Odyssey"
  • The {work} → "The Chronicles"
  • {work}: Genesis → "Inferno: Genesis"
  • {work} & {work2} → "Saga & Legends"
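A minimal sketch of how these patterns might be expanded with str.format. The PATTERNS list and the make_title helper are hypothetical; the tool's actual pattern set and selection logic may differ.

```python
import random

# The four patterns listed above; {work2} stands for a second base name.
PATTERNS = ["{work}", "The {work}", "{work}: Genesis", "{work} & {work2}"]


def make_title(works, rng):
    """Pick a pattern and fill it with one or two distinct base names."""
    pattern = rng.choice(PATTERNS)
    work, work2 = rng.sample(works, 2)  # two different entries from the works array
    # Unused keyword arguments are ignored by str.format.
    return pattern.format(work=work, work2=work2)


rng = random.Random(12345)
title = make_title(["Odyssey", "Chronicles", "Inferno", "Saga", "Legends"], rng)
```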

File Organization

Generated CBZ files are organized in a hierarchical structure:

output_cbz/
├── Ironclad-Comics/
│   ├── Steel-Sentinel/
│   │   ├── Main-Series/
│   │   │   └── (1985)-Steel-Sentinel-Legacy/
│   │   │       ├── Steel Sentinel Legacy #001 [March, 1985].cbz
│   │   │       ├── Steel Sentinel Legacy #002 [April, 1985].cbz
│   │   │       └── ...
│   │   └── Annual/
│   │       └── (1987)-Steel-Sentinel-Annual/
│   │           └── Steel Sentinel Annual #001 [December, 1987].cbz
│   └── Crimson-Flame/
│       └── Limited-Series/
│           └── (1990)-Crimson-Flame-Rising/
│               └── ...
└── Nebula-Press/
    └── ...
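The naming rules in the tree above can be sketched as follows. The helpers slug and issue_path are hypothetical, and the rules (spaces become hyphens in directory names, three-digit issue numbers, month and year in brackets) are inferred from the example tree rather than taken from the tool's source.

```python
import calendar
from pathlib import PurePosixPath


def slug(name: str) -> str:
    # Directory names in the tree use hyphens instead of spaces.
    return name.replace(" ", "-")


def issue_path(base, publisher, character, fmt, series, start_year, issue, year, month):
    """Build Publisher/Character/Format/(Year)-Series/filename for one issue."""
    series_dir = f"({start_year})-{slug(series)}"
    # File names keep their spaces and zero-pad the issue number to three digits.
    filename = f"{series} #{issue:03d} [{calendar.month_name[month]}, {year}].cbz"
    return PurePosixPath(base, slug(publisher), slug(character), slug(fmt), series_dir, filename)


path = issue_path(
    "output_cbz", "Ironclad Comics", "Steel Sentinel", "Main Series",
    "Steel Sentinel Legacy", 1985, 1, 1985, 3,
)
```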

CBZ File Contents

Each generated CBZ file contains:

  • ComicInfo.xml: Complete metadata including title, series, issue number, publication date, publisher, writer, characters, format, and page count
  • Image Pages: JPEG pages (P00001.jpg through P0000X.jpg) with simple text content
  • Proper Compression: ZIP compression for realistic file sizes
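As an illustration of this layout, the sketch below assembles a CBZ in memory using only the standard library. The helper names are hypothetical, and the page bytes are placeholders: the real tool renders JPEG pages with Pillow, which is omitted here to keep the example self-contained. The element names follow the commonly used ComicInfo.xml schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_comicinfo(fields: dict) -> bytes:
    """Serialize metadata fields as a flat ComicInfo.xml document."""
    root = ET.Element("ComicInfo")
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = str(value)
    body = ET.tostring(root, encoding="unicode")
    return ('<?xml version="1.0" encoding="utf-8"?>\n' + body).encode("utf-8")


def build_cbz(fields: dict, pages: list) -> bytes:
    """Write ComicInfo.xml plus numbered pages into a deflate-compressed ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("ComicInfo.xml", build_comicinfo(fields))
        for number, page_bytes in enumerate(pages, start=1):
            archive.writestr(f"P{number:05d}.jpg", page_bytes)
    return buffer.getvalue()


fields = {
    "Title": "Steel Sentinel Legacy #042",
    "Series": "Steel Sentinel Legacy",
    "Number": 42,
    "Year": 1988,
    "Month": 6,
    "Publisher": "Ironclad Comics",
    "Writer": "Marcus Hale",
    "Characters": "Steel Sentinel",
    "Format": "Main Series",
    "PageCount": 2,
}
# Placeholder bytes stand in for real JPEG data.
cbz_bytes = build_cbz(fields, [b"page one", b"page two"])
```

Writing cbz_bytes to a file with a .cbz extension gives an archive whose structure matches the description above, although a comic reader would need real image data to display the pages.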

Series Types & Issue Counts

Format            Typical Issue Count
Main Series       1-500 issues
Limited Series    1-15 issues
One-Shot          1 issue
TPB               1-10 volumes
Annual            1-5 issues
Director's Cut    1-5 issues

Publication Date Logic

  • Monthly Series: Issues published monthly starting from a random date, ensuring the series doesn't extend beyond 2025
  • Annuals: Published yearly, in the same month for every issue
  • Date Validation: All publication dates are capped at December 2025
  • Continuation: When continuing an existing series, the original publication schedule is preserved
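The date rules above can be sketched with plain (year, month) pairs. The function names and the MAX_YEAR constant are hypothetical, and this is one possible reading of the rules rather than the tool's actual implementation.

```python
MAX_YEAR = 2025  # year cap described above


def monthly_schedule(start_year, start_month, issue_count):
    """Return one (year, month) pair per issue, stopping after December MAX_YEAR."""
    schedule = []
    year, month = start_year, start_month
    for _ in range(issue_count):
        if year > MAX_YEAR:
            break  # cap: never emit a date past December MAX_YEAR
        schedule.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return schedule


def annual_schedule(start_year, month, issue_count):
    """One issue per year, always in the same month, within the cap."""
    return [(start_year + i, month) for i in range(issue_count) if start_year + i <= MAX_YEAR]


def latest_start_year(issue_count, start_month=1):
    """Latest start year for which a monthly run still ends by December MAX_YEAR."""
    months_after_january = (start_month - 1) + (issue_count - 1)
    return MAX_YEAR - months_after_january // 12


run = monthly_schedule(1985, 3, 3)
```

Choosing a random start year no later than latest_start_year(issue_count, start_month) is one way to ensure that a full run fits before the cap, so the truncation in monthly_schedule only acts as a safety net.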

Error Handling

The tool includes robust error handling for:

  • Invalid JSON configuration files
  • Missing required fields in publisher data
  • File system permission issues
  • Image generation failures
  • Date calculation edge cases

Example Output

A typical generated CBZ might contain:

  • Filename: Steel Sentinel Legacy #042 [June, 1988].cbz
  • Publisher: Ironclad Comics
  • Writer: Marcus Hale
  • Character: Steel Sentinel
  • Format: Main Series
  • Page Count: 5-10 pages
  • Volume Year: 1985 (year of issue #1)

Use Cases

  • Stress Testing: Test comic management software with large libraries
  • Performance Testing: Evaluate application performance with realistic datasets
  • Development: Create test data for comic book applications
  • Benchmarking: Compare comic reader/organizer performance across different libraries