This is a textbook for teaching large-scale database technologies at universities. This first edition covers the most important parts of my Master's-level Big Data course at ETH Zurich: a refresher on relational database management (relational algebra, SQL), cloud storage, distributed file systems (HDFS), syntax (XML, JSON, ...), wide column stores (HBase), denormalized data models, validation (XML Schema, JSON Schema, JSound) and formats (Parquet, ...), distributed parallel processing (MapReduce, Apache Spark), resource management, document stores, and query languages for denormalized data (JSONiq). The book is also available online for free as a PDF, to make it accessible to all students and educators; you can support the project by purchasing a printed copy or a Kindle print replica.