Tag Archives: node.js

Introduction to web scraping with Node.js

The internet has a wealth of freely available information in difficult to consume formats.  For example, ClicheSite has lists of cliches, euphemisms, and other phrases that are just perfect for a hangman or Wheel of Fortune game.  Coincidentally, I wanted to write such a game to learn some new technologies, so I wrote a short, simple Node.js script to scrape some pages for those phrases.  In this tutorial, I’ll discuss how to make an HTTP request and get a page’s HTML, how to parse the HTML for specific information, and why you would and would not want to do this in the first place.
Read more »