The Palantir Foundry platform offers a rich set of functions for analytics and application development. This book focuses on a relatively new Foundry feature: lightweight transforms. The gains in execution speed and the savings in resources from lightweight transforms are immense. In the use cases examined, CPU time dropped to 14% and run duration to 23% of the PySpark-based version. Consequently, every transformation in a Foundry stack that is not obviously beyond the limits of lightweight transforms should be checked for eligibility for migration.

The standard engine for processing datasets is Apache Spark, a highly efficient engine for large amounts of data. Spark can process billions of rows in a short time by distributing the task across a cluster and collecting the results afterwards. This is the right choice for large amounts of data, but for smaller amounts that could be processed by a single node in a server cluster, an alternative architecture is the better choice. In real life it makes no sense to plant flowers with a large excavator, and the same image applies to processing smaller amounts of data. The term "smaller amounts of data" needs to be quantified: experience with lightweight transforms on the Polars engine showed that millions of rows can be processed on a node with 8 cores and 128 GB of memory. And because the overhead of Spark is avoided, the processing finishes in a fraction of the time with significantly fewer resources. Because Polars syntax is quite similar to PySpark, the transition from PySpark to lightweight Polars is a smooth one.

This book shows how to implement lightweight transforms, illustrates how to create a highly efficient development setup, and guides the reader through a migration from PySpark to Polars with a comprehensive example. Some lessons learned and a dive into the "lazy" mode of Polars complete the book. The appendix contains a "Polars Cheat Sheet" with the most common code building blocks in Polars syntax. After reading this book, every developer with experience in writing PySpark transformations will be able to migrate existing PySpark transformations efficiently to lightweight transforms using Polars, or to write new lightweight transforms.