Paperback Perl & LWP: Fetching Web Pages, Parsing HTML, Writing Spiders & More Book

ISBN: 0596001789

ISBN13: 9780596001780

Perl & LWP: Fetching Web Pages, Parsing HTML, Writing Spiders & More

Format: Paperback

Condition: Very Good*

*Best Available: (ex-library)

Price: $6.99 (List Price $39.99)

Book Overview

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages. The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling,...

Customer Reviews

5 ratings

This book can teach you expert-level web scraping/munging.

If you aren't yet comfortable using object-oriented Perl modules, the multitude of examples will at least allow you to see how it's done, even if you're a bit fuzzy on what's happening 'underneath' when you call object methods. If you're comfortable learning how to do something without knowing exactly why it works, then the author's clear step-by-step explanations and numerous progressively more powerful examples should make this book accessible even to relatively inexperienced Perl programmers.
More experienced programmers will understand better why things work, but any Perl programmer will set this book down feeling empowered to turn the web into their own valet. No longer do you need to check multiple sites looking for interesting information. Instead, you can readily author code to do that for you and alert you when items of interest are found. You can use these tools to free up personal time, to harvest information to inform business decisions, to automate tedious web application testing, and a zillion other things.
The author's clear exploration of the relevant Perl modules leaves the reader with a good depth of understanding of what these modules do, when you might want to use which module, and how to use them for real-world tasks. Before reading the book, I knew of these modules, but they were a rather intimidating pile. I'd used a few of them on occasion for rather limited projects, but was reluctant to invest the time required to read all of the documentation from the whole collection. Mountains of method-level documentation do not a tutorial make. This book takes all of that information, selects the most important parts, and ensures that those parts are covered in progressively more powerful and/or flexible examples.
If you know Perl and you're sick of 'working the web' to get information and you want the web to work for you instead, then you need this book.
I had a personal project that was on the back burner for a couple of years because it just sounded too hard. The weekend after I finished this book, I wrote what I had previously thought to be the hard part of that project and it was both easy and fun. This book makes hard things not just possible, but actually easy. -matt

Fabulous book!

This book is a comprehensive and authoritative guide to web automation. It reads as both a gentle tutorial and a well-organized reference. Basic HTTP operation, regexp HTML parsing, tokenizing, cookie authentication, form handling, and robot spidering are covered extensively in numerous case studies and practical examples.
Naturally, I was impressed by the simple, consistent treatment of examples: inspect source and find the interesting bits, code things up and then enhance to suit. :-)
A particularly satisfying thing to me is the sane way of working that the author assumes. So many people seem to just bungle their way through web programming while ignoring basics like the robots.txt file. This book helps to prevent this.
One would think that only a thick tome would be sufficient to cover such vast territory, but the author (who is an active LWP module developer) does a fabulous job covering this extensive subject matter.
I recommend this book both to anyone starting out on their way to working with the underside of the web and to accomplished professionals in need of a full reference manual.

Very Informative and useful

As a web programmer, I had dealt with several projects involving web automation and writing simple crawlers even before I read "Perl & LWP". The book was the first book I've read on the subject, and I'm by no means disappointed. The book is very well organized, very informative and hits the nail on the head. I am pleased. I noticed some inaccuracies in the discussions, and some chopped-off paragraphs and sentences. But this doesn't affect the usability of the book much. Author Sean Burke does a great job of walking one through most aspects of web automation and data extraction from the web using Perl and LWP (libwww in Perl). The code the book gives is very well organized, well written and easily debuggable. The steps are pretty consistent across all the examples: a) Inspect the HTML source code of the page; b) Determine the tokens and patterns of interest; c) Write the first code; d) Fine-tune the code.
As usual, I'll be commenting on individual chapters to give you an idea of the coverage of the book in more detail...

Excellent coverage of LWP, packed full of useful examples

I was definitely interested when I first heard that O'Reilly were publishing a book on LWP. LWP is a definitive collection of Perl modules covering everything you could think of doing with URIs, HTML, and HTTP. While 'web services' are the buzzword-friendly technology of the day, sometimes you need to roll your sleeves up and get a bit dirty scraping screens and hacking at HTML. For such a deep subject, this book weighs in at a slim 242 pages. This is a very good thing. I'm far too busy to read the massive shelf-destroying tomes that seem to be churned out recently. It covers everything you need to know with concise examples, which is what makes this book really shine. You start with the basics using LWP::Simple and move through to more advanced topics using LWP::UserAgent, HTTP::Cookies, and WWW::RobotRules. Sean shows finger-saving tips and shortcuts that take you more than a couple of notches above what you can learn from the lwpcook manpage, with enough depth to satisfy somebody who is an experienced LWP hacker. This book is a great reference: just flick through and you'll find a relevant chapter with an example to save the day. Chapters include filling in forms and extracting data from HTML using regular expressions, then more advanced topics using HTML::TokeParser, and then my preferred tool, the author's own HTML::TreeBuilder. The book ends with a chapter on spidering, with excellent coverage of design and warnings to get you started on your web trawling.

A must-read for exploiting the web in a GOOD way.

A great book for anyone who wishes to automate daily tasks on the web. Sean does an outstanding job of showing how Perl can be used to efficiently extract and manipulate not just data but useful information from the web's vast data resources. I've already adapted an example from this book (a link-checking spider) for sites I maintain. Yes, I knew of the LWP module prior to this book. But as a lazy programmer, I rely on others to show me the way. Sean does just that...