Descriptions in this appendix prioritise workflow characteristics and deployment constraints over superlatives or leaderboard language.
| Tool | Description | |ββ|ββββ- | | Midjourney v7 | Strong stylistic control with Discord and web-based image workflows. | | DALL-E 3.5 | ChatGPT-integrated image generation with reliable prompt following and text rendering. | | FLUX 2 | Open-weight image models aimed at photorealistic output and flexible deployment. | | Stable Diffusion 3.5 | Open-source. ControlNet, LoRAs, ComfyUI ecosystem. | | Adobe Firefly 3 | Licensed data only. Commercial indemnification. Photoshop. | | Google Imagen 4 | Photorealistic image generation exposed through Googleβs AI tooling. | | Ideogram v3 | Strong text rendering for logos, posters, and design-heavy image outputs. | | Leonardo AI | Multi-model. Realtime Canvas. 3D gaming assets. Canva-owned. | | Recraft | Design-focused. Vector art, brand consistency. |
| Tool | Description | |ββ|ββββ- | | Sora 2 | Video generation focused on longer-form scene consistency and motion realism. | | Google Veo 3.1 | High-fidelity video generation with native audio and Vertex AI integration. | | Runway Gen-4.5 | Video generation and editing suite with motion-directed controls. | | Kling 3.0 | Longer-form 4K video generation with API access and native audio. | | Seedance 2.0 | Quad-modal input. Lip sync. 2K resolution. | | Pika 2.5 | Beginner-friendly. Pikaswaps. Fast renders. | | Luma Dream Machine | 4K HDR. Physics simulation. 3D/cinematic. | | HaiLuo AI | Budget video. 10 free/day. MiniMax. | | Wan 2.1 | Open-source video generation model suitable for self-hosted experimentation. | | HunyuanVideo | Tencent OSS. Consumer GPU. Multi-style. | | LTX Video | OSS. Licensed data. Clear commercial terms. |
| Tool | Description | |ββ|ββββ- | | Suno | Text-to-song generation with full tracks and vocal production. | | Udio | High-fidelity music gen. Fine control. | | ElevenLabs Music | Vocals, instrumentals. Sectional editing. Stem separation. | | Stable Audio | High-quality. Commercial license. | | Meta AudioCraft | OSS. MusicGen + AudioGen. |
| Tool | Description | |ββ|ββββ- | | Meshy | Text/image to 3D. Game assets, products. | | Tripo AI | Fast 3D from text/images. Multi-format export. | | Vizcom | Real-time AI rendering for industrial designers. |