The internet has a wealth of freely available information in difficult-to-consume formats. For example, ClicheSite has lists of cliches, euphemisms, and other phrases that are just perfect for a hangman or Wheel of Fortune game. Coincidentally, I wanted to write such a game to learn some new technologies, so I wrote a short, simple Node.js script to scrape some pages for those phrases. In this tutorial, I’ll discuss how to make an HTTP request and get a page’s HTML, how to parse the HTML for specific information, and why you would or would not want to do this in the first place.
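To make the two steps concrete, here is a minimal sketch of fetching a page and pulling phrases out of it with a regular expression. It is not the script from the tutorial: the function names `fetchHtml` and `extractPhrases`, the placeholder URL, and the assumption that each phrase sits in its own `<li>` element are all hypothetical and chosen only for illustration.

```javascript
const https = require('https');

// Fetch a page over HTTPS and resolve with its HTML as a string.
function fetchHtml(url) {
  return new Promise((resolve, reject) => {
    https.get(url, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error('Request failed with status ' + res.statusCode));
        return;
      }
      res.setEncoding('utf8');
      let body = '';
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

// Collect the text of every <li> element in an HTML string.
// Assumption: phrases are stored one per <li>, with no nested lists.
function extractPhrases(html) {
  const itemPattern = /<li\b[^>]*>([\s\S]*?)<\/li>/gi;
  const phrases = [];
  let match;
  while ((match = itemPattern.exec(html)) !== null) {
    // Strip any inline tags (<b>, <a>, ...) and surrounding whitespace.
    const text = match[1].replace(/<[^>]+>/g, '').trim();
    if (text) phrases.push(text);
  }
  return phrases;
}

// Example usage (placeholder URL, not a real phrase list):
// fetchHtml('https://example.com/phrases.html')
//   .then((html) => console.log(extractPhrases(html)))
//   .catch((err) => console.error(err.message));
```

A regular expression like this is only workable when the markup is simple and predictable. For anything nested or irregular, a real HTML parser is the safer choice.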
Introduction to web scraping with Node.js
Posted by Nelson
on November 8, 2011