Paperback AI Inference with Ollama, llama.cpp, and vLLM Book

ISBN: 1105842738

ISBN13: 9781105842733

AI Inference with Ollama, llama.cpp, and vLLM

The era of cloud-dependent AI is over. Today's developers can run state-of-the-art language models on their own hardware, from laptops to GPU clusters, without ever sending data to a third party. But the gap between downloading a model and deploying it efficiently is filled with questions about quantization, memory bandwidth, batching strategies, and tool selection. This book is your guide through that gap, showing you how to build scalable, cost-effective inference systems using the three pillars of open-source AI: Ollama, llama.cpp, and vLLM.

AI Inference with Ollama, llama.cpp, and vLLM takes you from running your first local model in minutes to optimizing production deployments serving thousands of requests per second. You'll learn when to use each tool, how to navigate the memory wall that bottlenecks LLM performance, and how to choose the right hardware and quantization strategy for your use case. Whether you're building RAG systems, deploying chatbots, or scaling inference across GPU clusters, this book gives you the practical knowledge to move from experimentation to production with confidence.

About the Author

GK Marballi has spent 20+ years turning data into competitive advantage for global brands, from Priceline to S&P Global and Barnes & Noble. He has led high-impact product and analytics teams and navigated the front lines of the AI revolution. He is based in New York City and holds an MBA from Harvard Business School.

Format: Paperback

Condition: New

$26.68
Ships within 2-3 days
Customer Reviews

0 ratings