Skip to content
Scan a barcode
Scan
Paperback Multimodal AI Systems: Architectures, Training, and Applications (Transformer Principles Series) Book

ISBN: B0H6NW8LG1

ISBN13: 9798184326054

Multimodal AI Systems: Architectures, Training, and Applications (Transformer Principles Series)

The Transformer Principles Series is a three-volume graduate-level treatise that builds a complete mathematical and engineering understanding of modern AI systems, from the foundational attention mechanism to large language models and multimodal architectures.

Volume III - Multimodal AI Systems: Architectures, Training, and Applications extends the Transformer paradigm beyond text into vision, audio, and video. It covers modality-specific encoders and tokenizers, cross-modal fusion and contrastive alignment (CLIP, SigLIP), diffusion and flow-matching generative models, vision-language architectures (ViT, LLaVA, Q-Former), text-to-image and text-to-video generation, speech and audio processing, efficient inference for multimodal models, long-context scaling, and reasoning agents that perceive and act across modalities.

Recommended

Format: Paperback

Condition: New

$72.64
Save $7.35!
List Price $79.99
Ships within 2-3 days
Save to List

Customer Reviews

0 rating
Copyright © 2026 Thriftbooks.com Terms of Use | Privacy Policy | Do Not Sell/Share My Personal Information | Cookie Policy | Cookie Preferences | Accessibility Statement
ThriftBooks ® and the ThriftBooks ® logo are registered trademarks of Thrift Books Global, LLC
GoDaddy Verified and Secured