CBZ Generator
Overview
CBZ Generator is a Python tool that creates realistic comic book archive (CBZ) files with proper ComicInfo.xml metadata for testing comic book management software. It generates complete comic series with authentic publisher/character relationships, realistic publication schedules, and proper file organization. It can generate 100,000+ comics in an hour on modern hardware.
Features
- Realistic Comic Metadata: Generates proper ComicInfo.xml with series names, issue numbers, publication dates, publishers, writers, and characters
- Authentic Publishing Schedule: Respects monthly publication cycles for regular series and annual schedules for special issues
- Multiple Format Support: Main Series, Limited Series, One-Shots, TPBs, Annuals, Director's Cuts, and more
- Continuation Support: Can continue existing series by scanning for the highest issue number
- Publisher Ecosystem: Creates realistic relationships between publishers, writers, and character rosters
- Organized File Structure: Automatically organizes files by Publisher/Character/Format/Series hierarchy
- Configurable Data: JSON-based configuration for publishers, writers, characters, and series names
- Year Cap Protection: Built-in safeguards prevent generation beyond 2025
Installation
Requirements
- Python 3.7+
- Pillow library for image generation
Install Dependencies
pip install pillow
Clone Repository
git clone https://gitea.baerentsen.space/FrederikBaerentsen/CBZGenerator
cd CBZGenerator
Usage
Basic Usage
# Generate 10 comic series
python generate_cbz.py 10
# Generate 50 series with custom output directory
python generate_cbz.py 50 --out /path/to/comics
# Use custom data file
python generate_cbz.py 25 --data my_comic_data.json
# Set random seed for reproducible results
python generate_cbz.py 15 --seed 12345
Command Line Options
| Option | Description | Default |
|---|---|---|
| count | Number of comic series to generate | Required |
| --out | Output base directory | output_cbz |
| --data | Path to comic data JSON file | ./comicdata.json |
| --seed | Random seed for reproducibility | None |
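For readers who want to script around the tool, the options in the table can be mirrored with a small argparse parser. This is only a sketch: `build_parser` is a hypothetical name, and the way the real script wires its seed into the random generator is an assumption.

```python
import argparse
import random


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the tool's own code)."""
    parser = argparse.ArgumentParser(description="Generate test CBZ files")
    parser.add_argument("count", type=int, help="Number of comic series to generate")
    parser.add_argument("--out", default="output_cbz", help="Output base directory")
    parser.add_argument("--data", default="./comicdata.json", help="Path to comic data JSON file")
    parser.add_argument("--seed", type=int, default=None, help="Random seed for reproducibility")
    return parser


# Equivalent to: python generate_cbz.py 15 --seed 12345
args = build_parser().parse_args(["15", "--seed", "12345"])

# Assumption: a dedicated Random instance is seeded once; seed=None uses OS entropy.
rng = random.Random(args.seed)
```

With a fixed seed, every draw from `rng` repeats across runs, which is what makes a generated library reproducible.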
Configuration
The comicdata.json file defines the comic universe:
Publishers Structure
Each publisher includes:
- name: Publisher name (e.g., "Ironclad Comics")
- writers: Array of writer names associated with this publisher
- characters: Array of character names published by this publisher
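As an illustration, a publisher entry with the three fields above might look like the following. The top-level key `publishers` is an assumption (check the shipped comicdata.json for the real key), and the sample values are taken from the examples elsewhere in this README.

```python
import json

# Assumed layout: a top-level "publishers" array of objects with the three documented fields.
SAMPLE = """
{
  "publishers": [
    {
      "name": "Ironclad Comics",
      "writers": ["Marcus Hale"],
      "characters": ["Steel Sentinel", "Crimson Flame"]
    }
  ]
}
"""

data = json.loads(SAMPLE)
for publisher in data["publishers"]:
    # Each publisher carries its own writer and character roster.
    print(publisher["name"], publisher["writers"], publisher["characters"])
```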
Works Array
Defines base series names that can be combined with various patterns like:
- {work} → "Odyssey"
- The {work} → "The Chronicles"
- {work}: Genesis → "Inferno: Genesis"
- {work} & {work2} → "Saga & Legends"
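One plausible way to expand these patterns is Python's `str.format` with named fields. The sketch below assumes the placeholders map directly onto format fields; `PATTERNS` and `make_series_name` are hypothetical names, not the tool's API.

```python
import random

# The four patterns listed above, written as str.format templates.
PATTERNS = ["{work}", "The {work}", "{work}: Genesis", "{work} & {work2}"]


def make_series_name(works, rng):
    """Pick a pattern and fill it with two distinct base names from the works array."""
    pattern = rng.choice(PATTERNS)
    work, work2 = rng.sample(works, 2)
    # str.format ignores keyword arguments the template does not use.
    return pattern.format(work=work, work2=work2)


name = make_series_name(["Odyssey", "Chronicles", "Inferno", "Saga", "Legends"], random.Random(1))
print(name)
```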
File Organization
Generated CBZ files are organized in a hierarchical structure:
output_cbz/
├── Ironclad-Comics/
│   ├── Steel-Sentinel/
│   │   ├── Main-Series/
│   │   │   └── (1985)-Steel-Sentinel-Legacy/
│   │   │       ├── Steel Sentinel Legacy #001 [March, 1985].cbz
│   │   │       ├── Steel Sentinel Legacy #002 [April, 1985].cbz
│   │   │       └── ...
│   │   └── Annual/
│   │       └── (1987)-Steel-Sentinel-Annual/
│   │           └── Steel Sentinel Annual #001 [December, 1987].cbz
│   └── Crimson-Flame/
│       └── Limited-Series/
│           └── (1990)-Crimson-Flame-Rising/
│               └── ...
└── Nebula-Press/
    └── ...
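The layout above can be reproduced with a few lines of pathlib. This is a sketch inferred from the example tree only: `slug` and `issue_path` are hypothetical helpers, and the rule that directory names replace spaces with hyphens while filenames keep them is read off the example, not confirmed from the tool's source.

```python
import calendar
from pathlib import Path


def slug(text: str) -> str:
    """Directory names in the example tree use hyphens instead of spaces."""
    return "-".join(text.split())


def issue_path(base, publisher, character, fmt, series, start_year, number, year, month):
    """Build Publisher/Character/Format/(Year)-Series/<issue file> under base."""
    series_dir = f"({start_year})-{slug(series)}"
    # calendar.month_name is English unless the process locale has been changed.
    filename = f"{series} #{number:03d} [{calendar.month_name[month]}, {year}].cbz"
    return Path(base) / slug(publisher) / slug(character) / slug(fmt) / series_dir / filename


path = issue_path("output_cbz", "Ironclad Comics", "Steel Sentinel", "Main Series",
                  "Steel Sentinel Legacy", 1985, 1, 1985, 3)
print(path.as_posix())
# → output_cbz/Ironclad-Comics/Steel-Sentinel/Main-Series/(1985)-Steel-Sentinel-Legacy/Steel Sentinel Legacy #001 [March, 1985].cbz
```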
CBZ File Contents
Each generated CBZ file contains:
- ComicInfo.xml: Complete metadata including title, series, issue number, publication date, publisher, writer, characters, format, and page count
- Image Pages: JPEG pages (P00001.jpg through P0000X.jpg) with simple text content
- Proper Compression: ZIP compression for realistic file sizes
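A CBZ is a ZIP archive, so the contents listed above can be approximated with the standard library alone. In this sketch `build_comicinfo` and `write_cbz` are hypothetical names, the element names come from the public ComicInfo.xml schema, and the page payloads are placeholder bytes rather than real JPEGs (the tool itself renders pages with Pillow).

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_comicinfo(fields: dict) -> bytes:
    """Serialize a flat dict as <ComicInfo> child elements."""
    root = ET.Element("ComicInfo")
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = str(value)
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(root, encoding="utf-8")


def write_cbz(target, fields: dict, pages) -> None:
    """Write ComicInfo.xml plus pages named P00001.jpg, P00002.jpg, ... into a ZIP."""
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("ComicInfo.xml", build_comicinfo(fields))
        for index, page_bytes in enumerate(pages, start=1):
            archive.writestr(f"P{index:05d}.jpg", page_bytes)


# In-memory demo; pass a file path instead of a BytesIO to write to disk.
buffer = io.BytesIO()
write_cbz(
    buffer,
    {"Series": "Steel Sentinel Legacy", "Number": 42, "Publisher": "Ironclad Comics", "PageCount": 2},
    [b"placeholder page 1", b"placeholder page 2"],
)
```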
Series Types & Issue Counts
| Format | Typical Issue Count |
|---|---|
| Main Series | 1-500 issues |
| Limited Series | 1-15 issues |
| One-Shot | 1 issue |
| TPB | 1-10 volumes |
| Annual | 1-5 issues |
| Director's Cut | 1-5 issues |
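The ranges in the table translate naturally into a lookup of inclusive bounds. The sketch below draws uniformly from each range, which is an assumption: the tool may weight short runs differently. `ISSUE_RANGES` and `pick_issue_count` are hypothetical names.

```python
import random

# Inclusive (low, high) bounds copied from the table above.
ISSUE_RANGES = {
    "Main Series": (1, 500),
    "Limited Series": (1, 15),
    "One-Shot": (1, 1),
    "TPB": (1, 10),
    "Annual": (1, 5),
    "Director's Cut": (1, 5),
}


def pick_issue_count(fmt: str, rng: random.Random) -> int:
    """Draw an issue count for the given format (uniform draw is an assumption)."""
    low, high = ISSUE_RANGES[fmt]
    return rng.randint(low, high)


rng = random.Random(0)
counts = {fmt: pick_issue_count(fmt, rng) for fmt in ISSUE_RANGES}
print(counts)
```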
Publication Date Logic
- Monthly Series: Issues published monthly starting from a random date, ensuring the series doesn't extend beyond 2025
- Annuals: Published yearly with consistent month across all issues
- Date Validation: All publication dates are capped at December 2025
- Continuation: When continuing existing series, original publication schedule is preserved
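The date rules above can be sketched by treating each month as a single integer index, which makes the year cap a simple bound on the start date. The names below are hypothetical, and `earliest_year=1960` is an assumed lower bound that the README does not specify.

```python
import random

YEAR_CAP = 2025  # no issue may be dated after December of this year


def from_index(index: int):
    """Convert a month index (year * 12 + month - 1) back to (year, month)."""
    return index // 12, index % 12 + 1


def monthly_schedule(issue_count, rng, earliest_year=1960):
    """Pick a random start so that the last monthly issue still fits under the cap."""
    last_index = YEAR_CAP * 12 + 11              # December of the cap year
    latest_start = last_index - (issue_count - 1)
    earliest_start = earliest_year * 12          # January of earliest_year
    if latest_start < earliest_start:
        raise ValueError("series is too long to fit before the year cap")
    start = rng.randint(earliest_start, latest_start)
    return [from_index(start + offset) for offset in range(issue_count)]


def annual_schedule(issue_count, rng, earliest_year=1960):
    """One issue per year, always in the same month."""
    month = rng.randint(1, 12)
    start_year = rng.randint(earliest_year, YEAR_CAP - (issue_count - 1))
    return [(start_year + offset, month) for offset in range(issue_count)]


schedule = monthly_schedule(500, random.Random(7))
annuals = annual_schedule(5, random.Random(7))
```

Choosing the start date from a bounded range avoids having to truncate or re-date issues after the fact.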
Error Handling
The tool includes robust error handling for:
- Invalid JSON configuration files
- Missing required fields in publisher data
- File system permission issues
- Image generation failures
- Date calculation edge cases
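As a rough illustration of the first two cases, a loader can turn malformed JSON and incomplete publisher entries into clear error messages. This is a sketch, not the tool's implementation: `load_comic_data` is a hypothetical name and the top-level `publishers` key is an assumption.

```python
import json

REQUIRED_FIELDS = ("name", "writers", "characters")


def load_comic_data(text: str) -> dict:
    """Parse configuration text and check each publisher for the documented fields."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON configuration: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Configuration must be a JSON object")
    for position, publisher in enumerate(data.get("publishers", [])):
        if not isinstance(publisher, dict):
            raise ValueError(f"Publisher #{position} must be a JSON object")
        missing = [field for field in REQUIRED_FIELDS if field not in publisher]
        if missing:
            raise ValueError(f"Publisher #{position} is missing: {', '.join(missing)}")
    return data


try:
    load_comic_data('{"publishers": [{"name": "Nebula Press"}]}')
except ValueError as error:
    print(error)
```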
Example Output
A typical generated CBZ might contain:
- Filename: Steel Sentinel Legacy #042 [June, 1988].cbz
- Publisher: Ironclad Comics
- Writer: Marcus Hale
- Character: Steel Sentinel
- Format: Main Series
- Page Count: 5-10 pages
- Volume Year: 1985 (year of issue #1)
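Because the issue number is embedded in the filename, continuing a series (the Continuation Support feature) can be approximated by parsing existing filenames and taking the maximum. The regular expression below is inferred from the example filename, and `highest_issue` is a hypothetical helper; the tool's actual scanning logic may differ.

```python
import re

# Matches names like: Steel Sentinel Legacy #042 [June, 1988].cbz
ISSUE_PATTERN = re.compile(
    r"^(?P<series>.+) #(?P<number>\d+) \[(?P<month>[A-Za-z]+), (?P<year>\d{4})\]\.cbz$"
)


def highest_issue(filenames) -> int:
    """Return the largest issue number found, or 0 when nothing matches."""
    matches = (ISSUE_PATTERN.match(name) for name in filenames)
    return max((int(m.group("number")) for m in matches if m), default=0)


existing = [
    "Steel Sentinel Legacy #001 [March, 1985].cbz",
    "Steel Sentinel Legacy #042 [June, 1988].cbz",
    "notes.txt",
]
next_issue = highest_issue(existing) + 1
```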
Use Cases
- Stress Testing: Test comic management software with large libraries
- Performance Testing: Evaluate application performance with realistic datasets
- Development: Create test data for comic book applications
- Benchmarking: Compare comic reader/organizer performance across different libraries