generate-image

Name: generate-image
Author: K-Dense-AI

Updated 1 month ago

31,025

3,113

Category: Meta. Tags: ai, design, data

About

This skill generates and edits general-purpose images such as photos, artwork, and visual assets using AI models like FLUX. It is designed specifically for non-technical imagery, so developers should use the separate `scientific-schematics` skill for technical diagrams such as flowcharts or circuits. It requires an OpenRouter API key to work.

Quick Install

Claude Code (Recommended)

Primary

npx skills add K-Dense-AI/claude-scientific-skills -a claude-code

Plugin Command (Alternative)

/plugin add https://github.com/K-Dense-AI/claude-scientific-skills

Git Clone (Alternative)

git clone https://github.com/K-Dense-AI/claude-scientific-skills.git ~/.claude/skills/generate-image

Copy and paste this command into Claude Code to install this skill

Documentation

Generate Image

Generate and edit high-quality images using OpenRouter's image generation models including FLUX.2 Pro and Gemini 3.1 Flash Image Preview.

When to Use This Skill

Use generate-image for:

Photos and photorealistic images
Artistic illustrations and artwork
Concept art and visual concepts
Visual assets for presentations or documents
Image editing and modifications
Any general-purpose image generation needs

Use scientific-schematics instead for:

Flowcharts and process diagrams
Circuit diagrams and electrical schematics
Biological pathways and signaling cascades
System architecture diagrams
CONSORT diagrams and methodology flowcharts
Any technical/schematic diagrams

Quick Start

Use the scripts/generate_image.py script to generate or edit images:

# Generate a new image
python scripts/generate_image.py "A beautiful sunset over mountains"

# Edit an existing image
python scripts/generate_image.py "Make the sky purple" --input photo.jpg

This generates/edits an image and saves it as generated_image.png in the current directory.

API Key Setup

CRITICAL: The script requires an OpenRouter API key. Before running, check if the user has configured their API key:

Look for a .env file in the project directory or parent directories
Check for OPENROUTER_API_KEY=<key> in the .env file
If not found, inform the user they need to:
- Create a .env file with OPENROUTER_API_KEY=your-api-key-here
- Or set the environment variable: export OPENROUTER_API_KEY=your-api-key-here
- Get an API key from: https://openrouter.ai/keys

The script will automatically detect the .env file and provide clear error messages if the API key is missing.
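The lookup described above can be sketched in Python as follows. This is a minimal illustration, not the code in scripts/generate_image.py: the helper names (find_env_file, read_key_from_env_file, resolve_api_key) are hypothetical, and checking the environment variable before the .env file is an assumption, since the documentation does not state which one wins.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_env_file(start: Path) -> Optional[Path]:
    """Return the nearest .env file in `start` or any of its parent directories."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def read_key_from_env_file(env_path: Path, name: str = "OPENROUTER_API_KEY") -> Optional[str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip("'\"") or None
    return None


def resolve_api_key(start: Path, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Assumed precedence: environment variable first, then the nearest .env file."""
    key = environ.get("OPENROUTER_API_KEY")
    if key:
        return key
    env_path = find_env_file(start)
    return read_key_from_env_file(env_path) if env_path else None
```

Calling resolve_api_key(Path.cwd()) returns the key, or None when neither source provides one, in which case the user should be pointed to https://openrouter.ai/keys.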

Model Selection

Default model: google/gemini-3.1-flash-image-preview (high quality, recommended)

Available models for generation and editing:

google/gemini-3.1-flash-image-preview - High quality, supports generation + editing
black-forest-labs/flux.2-pro - Fast, high quality, supports generation + editing

Generation only:

black-forest-labs/flux.2-flex - Fast and cheap, but lower quality than flux.2-pro

Select based on:

Quality: Use gemini-3.1-flash-image-preview or flux.2-pro
Editing: Use gemini-3.1-flash-image-preview or flux.2-pro (both support image editing)
Cost: Use flux.2-flex for generation only
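The selection rules above can be written as a small helper. This is only a sketch: choose_model and its two flags are hypothetical names, and returning the default model whenever editing is requested is a simplification, since flux.2-pro supports editing as well.

```python
DEFAULT_MODEL = "google/gemini-3.1-flash-image-preview"
# Models the documentation lists as supporting both generation and editing.
EDIT_CAPABLE = {DEFAULT_MODEL, "black-forest-labs/flux.2-pro"}
# Listed as generation only: cheaper, lower quality.
LOW_COST_MODEL = "black-forest-labs/flux.2-flex"


def choose_model(editing: bool = False, prefer_low_cost: bool = False) -> str:
    """Pick a model ID; editing always wins over cost because flux.2-flex cannot edit."""
    if editing:
        return DEFAULT_MODEL
    if prefer_low_cost:
        return LOW_COST_MODEL
    return DEFAULT_MODEL
```

The returned string is what would be passed to the script's --model option.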

Common Usage Patterns

Basic generation

python scripts/generate_image.py "Your prompt here"

Specify model

python scripts/generate_image.py "A cat in space" --model "black-forest-labs/flux.2-pro"

Custom output path

python scripts/generate_image.py "Abstract art" --output artwork.png

Edit an existing image

python scripts/generate_image.py "Make the background blue" --input photo.jpg

Edit with a specific model

python scripts/generate_image.py "Add sunglasses to the person" --input portrait.png --model "black-forest-labs/flux.2-pro"

Edit with custom output

python scripts/generate_image.py "Remove the text from the image" --input screenshot.png --output cleaned.png

Multiple images

Run the script multiple times with different prompts or output paths:

python scripts/generate_image.py "Image 1 description" --output image1.png
python scripts/generate_image.py "Image 2 description" --output image2.png
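For more than a handful of images, the repeated invocations can be driven from Python. The sketch below assumes the script exits with a non-zero status on failure, which the documentation does not confirm; build_command and run_batch are hypothetical helper names.

```python
import subprocess
import sys
from typing import Callable, Iterable, List, Tuple

SCRIPT = "scripts/generate_image.py"


def build_command(prompt: str, output: str) -> List[str]:
    """Build the argument list for one generation run, using the documented --output flag."""
    return [sys.executable, SCRIPT, prompt, "--output", output]


def run_batch(
    jobs: Iterable[Tuple[str, str]],
    runner: Callable = subprocess.run,
) -> List[str]:
    """Run each (prompt, output) job in turn and return the outputs whose run failed."""
    failed = []
    for prompt, output in jobs:
        result = runner(build_command(prompt, output))
        # Assumption: a non-zero exit status means the image was not produced.
        if result.returncode != 0:
            failed.append(output)
    return failed
```

For example, run_batch([("Image 1 description", "image1.png"), ("Image 2 description", "image2.png")]) runs both jobs sequentially and returns an empty list if both succeed.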

Script Parameters

prompt (required): Text description of the image to generate, or editing instructions
--input or -i: Input image path for editing (enables edit mode)
--model or -m: OpenRouter model ID (default: google/gemini-3.1-flash-image-preview)
--output or -o: Output file path (default: generated_image.png)
--api-key: OpenRouter API key (overrides .env file)
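The parameter list maps naturally onto argparse. The following is a hypothetical reconstruction of the command-line interface from the parameters listed above, not the script's actual source; build_parser and the help strings are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented parameters and their defaults."""
    parser = argparse.ArgumentParser(
        description="Generate or edit an image through OpenRouter."
    )
    parser.add_argument("prompt", help="Image description, or editing instructions")
    parser.add_argument("--input", "-i", default=None,
                        help="Input image path; supplying it enables edit mode")
    parser.add_argument("--model", "-m",
                        default="google/gemini-3.1-flash-image-preview",
                        help="OpenRouter model ID")
    parser.add_argument("--output", "-o", default="generated_image.png",
                        help="Output file path")
    parser.add_argument("--api-key", default=None,
                        help="OpenRouter API key (overrides the .env file)")
    return parser
```

Parsing ["Make the sky purple", "-i", "photo.jpg"] with this parser yields input set to photo.jpg, with the model and output left at their defaults.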

Example Use Cases

For Scientific Documents

# Generate a conceptual illustration for a paper
python scripts/generate_image.py "Microscopic view of cancer cells being attacked by immunotherapy agents, scientific illustration style" --output figures/immunotherapy_concept.png

# Create a visual for a presentation
python scripts/generate_image.py "DNA double helix structure with highlighted mutation site, modern scientific visualization" --output slides/dna_mutation.png

For Presentations and Posters

# Title slide background
python scripts/generate_image.py "Abstract blue and white background with subtle molecular patterns, professional presentation style" --output slides/background.png

# Poster hero image
python scripts/generate_image.py "Laboratory setting with modern equipment, photorealistic, well-lit" --output poster/hero.png

For General Visual Content

# Website or documentation images
python scripts/generate_image.py "Professional team collaboration around a digital whiteboard, modern office" --output docs/team_collaboration.png

# Marketing materials
python scripts/generate_image.py "Futuristic AI brain concept with glowing neural networks" --output marketing/ai_concept.png

Error Handling

The script provides clear error messages for:

Missing API key (with setup instructions)
API errors (with status codes)
Unexpected response formats
Missing dependencies (requests library)

If the script fails, read the error message and address the issue before retrying.

Notes

Images are returned as base64-encoded data URLs and automatically saved as PNG files
The script supports both images and content response formats from different OpenRouter models
Generation time varies by model (typically 5-30 seconds)
For image editing, the input image is encoded as base64 and sent to the model
Supported input image formats: PNG, JPEG, GIF, WebP
Check OpenRouter pricing for cost information: https://openrouter.ai/models
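The base64 handling mentioned in these notes can be illustrated with the standard library alone. This is a sketch under assumptions: image_to_data_url and save_data_url are hypothetical names, the MIME table covers only the four input formats listed above, and the decoded bytes are written as-is with no conversion to PNG.

```python
import base64
from pathlib import Path

# MIME types for the supported input formats listed in the notes.
MIME_BY_SUFFIX = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_to_data_url(path: Path) -> str:
    """Encode an input image as a base64 data URL, as needed for edit requests."""
    mime = MIME_BY_SUFFIX.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported image format: {path.suffix}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def save_data_url(data_url: str, output: Path) -> int:
    """Decode a base64 data URL, write the raw bytes, and return the byte count."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    data = base64.b64decode(payload)
    output.write_bytes(data)
    return len(data)
```

Encoding a file and then saving the resulting data URL reproduces the original bytes, which gives a quick check that both helpers agree.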

Image Editing Tips

Be specific about what changes you want (e.g., "change the sky to sunset colors" vs "edit the sky")
Reference specific elements in the image when possible
For best results, use clear and detailed editing instructions
Both Gemini 3.1 Flash Image Preview and FLUX.2 Pro support image editing through OpenRouter

Integration with Other Skills

scientific-schematics: Use for technical diagrams, flowcharts, circuits, pathways
generate-image: Use for photos, illustrations, artwork, visual concepts
scientific-slides: Combine with generate-image for visually rich presentations
latex-posters: Use generate-image for poster visuals and hero images

GitHub Repository

K-Dense-AI/claude-scientific-skills

Path: skills/generate-image

agent-skills, ai-scientist, bioinformatics, chemoinformatics, claude, claude-skills

FAQ

What is the generate-image skill?

generate-image is a Claude Skill by K-Dense-AI. Skills package instructions and resources that Claude loads on demand, so Claude can handle image generation and editing tasks without extra prompting.

How do I install generate-image?

Use the install commands on this page: run the npx command, add the repository to Claude Code as a plugin, or clone it into your skills directory. Then restart Claude so it picks up the skill.

What category does generate-image belong to?

generate-image is in the Meta category, tagged ai, design and data.

Is generate-image free to use?

The skill itself is listed on AIMCP and is free to install. Generating or editing images, however, requires an OpenRouter API key, and model usage is priced by OpenRouter (see https://openrouter.ai/models).
