Small Language Models: When Smaller Is Better

By Miguel Torres

Bigger is not always better. In production AI systems, bigger is often slower, more expensive, harder to deploy, harder to customize, and harder to control.

Small Language Models: When Smaller Is Better is a practical guide to building useful AI systems when latency, cost, privacy, reliability, and deployment constraints matter as much as raw benchmark scores.

Large language models are extraordinary generalists, but most products do not need the largest possible model for every request. They need the right model for the job. Sometimes that means a compact local model. Sometimes it means a fine-tuned specialist. Sometimes it means retrieval, routing, adapters, quantization, or a hybrid system where a small model handles the common path and a larger model becomes the fallback.

This book treats small language models as engineering components, not as weaker clones of frontier models. You will learn how to reason about SLMs as classifiers, extractors, summarizers, local assistants, retrieval partners, tool callers, routing stages, draft generators, privacy-preserving workers, and cost-control mechanisms inside real systems.

Inside, you will learn how to:

Decide when a small language model is good enough, and when it is not
Understand tokens, embeddings, attention, context windows, KV cache, logits, sampling, and instruction tuning
Think clearly about scaling laws, data quality, synthetic data, distillation, and the lessons behind Phi-style training recipes
Use compression techniques such as distillation, pruning, quantization, LoRA, QLoRA, and adapter-based fine-tuning
Choose an SLM by task fit, license, hardware target, latency budget, context window, evaluation results, and operational risk
Run models locally with tools and formats such as llama.cpp, GGUF, Ollama, ONNX Runtime GenAI, MLX, vLLM, and related inference stacks
Design retrieval-augmented generation systems that help smaller models answer with better context
Build evaluations that measure task quality, hallucination risk, latency, regressions, and cost-per-success
Use routing, cascades, speculative decoding, tool calling, structured outputs, caching, and AI gateways
Handle safety, privacy, governance, model observability, rollout strategy, and production operation

The book is written for backend engineers, platform engineers, machine learning engineers, product engineers, architects, tech leads, and developers who want to build AI systems that survive real constraints. You do not need to be a research scientist. You need enough technical grounding to ask better questions before sending every request to the biggest model available.

If you are building AI features for mobile, desktop, edge devices, private environments, customer VPCs, low-latency workflows, high-volume products, or specialized domain tasks, this book gives you the mental models and system-design vocabulary to make better trade-offs.

By the end, you will have a practical decision framework for answering the central question: when is a smaller model not just cheaper, but architecturally better?

Format: Paperback

Language: English

ISBN: B0H2GPZQ2N

ISBN13: 9798197568922

Release Date: May 2026

Publisher: Independently Published

Length: 212 Pages

Weight: 0.64 lbs.

Dimensions: 0.5" x 6.0" x 9.0"
