StatCrunchThis

The StatCrunchThis bookmarklet has been updated. Please reinstall to restore full functionality.

This bookmarklet allows you to pull data sets contained on many Web pages in various forms directly into StatCrunch for analysis.

1. Installation

To get started, save the link below to your bookmarks or favorites folder.

StatCrunchThis

2. Usage

The StatCrunchThis bookmarklet injects JavaScript into the page that scours its contents for HTML tables, embedded Google spreadsheets and links to Excel/Text files. The bookmarklet then submits the contents of these items to StatCrunch for analysis.
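The table-scanning step described above can be illustrated with a short sketch. This is not the bookmarklet's own JavaScript, which is not shown on this page; it is a minimal Python illustration using only the standard library. The names TableExtractor and extract_tables are hypothetical, and the sketch assumes tables that are not nested and whose rows and cells have explicit closing tags.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table found so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example, whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by malformed markup


def extract_tables(html):
    """Return all tables in an HTML string (hypothetical helper)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up sample markup, for illustration only.
page = (
    "<p>Intro text</p>"
    "<table><tr><th>Item</th><th>Weight</th></tr>"
    "<tr><td>A</td><td> 5.33g </td></tr></table>"
)
print(extract_tables(page))
# [[['Item', 'Weight'], ['A', '5.33g']]]
```

Embedded Google spreadsheets and linked Excel/Text files would need their own handling, which this sketch does not attempt.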

When you find data tables on a Web page, choose the StatCrunchThis option under the Favorites/Bookmarks menu of your browser.

  • If you want to load data contained on a single Web page, click the Extract data from this page option, and the data tables from the page will be loaded into StatCrunch.
  • If you would like to compile data that is spread across several pages, click the Extract data from a sequence of pages option on the first page containing your data. Relaunch the bookmarklet as you navigate to each additional page by clicking on it under the Favorites/Bookmarks menu of your browser. Each time you launch the bookmarklet, the data you have loaded (including that on the current page) will be displayed in an editable form. Once all of the data you want is compiled, click the Submit button to load the data into StatCrunch. If at any time you want to stop this process, click the Cancel button to return StatCrunchThis to its original state.
  • The Be number greedy option can be used with either the single page or multiple page options discussed above. If this option is turned on, StatCrunchThis will try to force the content of table cells into a numeric value if at all possible. For example, a cell containing the text value 5.33g will be translated into 5.33 when this option is turned on.

This program works well on most Web pages, but always double-check the output to make sure it matches the original data. Be especially careful with pages whose tables have very complicated header information.
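The effect of the Be number greedy option can be sketched as follows. This is a minimal Python illustration, not the bookmarklet's actual rule: the function name to_number and the choice to take the first number found in the cell are assumptions made for this example.

```python
import re

# A signed decimal number such as "5.33", "-2" or ".5".
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)")


def to_number(text, greedy=False):
    """Return a float for a cell's text, or the text unchanged.

    Without greedy, the whole cell must be a number. With greedy, the
    first number found anywhere in the cell is used (an assumption).
    """
    cleaned = text.strip()
    match = _NUMBER.search(cleaned) if greedy else _NUMBER.fullmatch(cleaned)
    return float(match.group()) if match else text


print(to_number("5.33g", greedy=True))  # 5.33
print(to_number("5.33g"))               # 5.33g
print(to_number("12"))                  # 12.0
```

A sketch this simple does not handle thousands separators such as "1,234", which is one more reason to check the loaded values against the original page.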

3. Examples

Some examples of Web pages with data sets that play nicely with StatCrunchThis are listed below. Make sure you sign in to StatCrunch before using StatCrunchThis with the pages below.

  • North Carolina Pick 4 Lottery Results
    This site provides the historical data from the lottery by month. Data from multiple months can be loaded with StatCrunchThis using the Extract data from a sequence of pages option.

  • List of teams with the most victories in NCAA Division I men's college basketball
    This page contains one simple table that loads into StatCrunch perfectly.

  • NCAA passing leaders
    This page has a table with a complicated header spanning several rows. The data loads into StatCrunch with the headers from those rows appended to one another. StatCrunchThis orders the resulting tables according to the number of entries.

  • Tax rates of Europe
    This page contains one simple table with some entries that have a mix of text and numbers. Note that percentages like "25%" are read into StatCrunch as numbers, but percentages with trailing text like "5% Federal ..." are not.

  • Fast Food Restaurants & Nutrition Facts Compared
    This page contains many tables, and all of them are loaded into StatCrunch when you use the StatCrunchThis bookmarklet. The tables are sorted in descending order by number of elements.

  • Beer Calories, Beer Alcohol, Beer Carb Content
    This page contains one large table. Some of the rows in the table contain advertisements, but these affect only the first column of the loaded data. The problematic cells can be deleted after the data is loaded into StatCrunch. Simply click in the cell and press the Delete key (Command-Delete on a Mac).

  • Top 250 movies as voted by IMDB users
    This page contains one simple table that loads into StatCrunch perfectly.

  • MLB hitting leaders
    This page has a table with a complicated header spanning several rows. The data loads into StatCrunch with the headers from those rows appended to one another.

  • Fuel economy reports for the 2009 Toyota Prius
    This page contains a simple table that loads perfectly into StatCrunch. The tricky part with this table is that it contains a number of empty cells and one row without any header information.

  • An embedded Google spreadsheet
    This page contains an embedded Google spreadsheet that loads into StatCrunch perfectly.
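Two behaviors mentioned in the examples above, appending multi-row headers to one another and ordering tables by their number of entries, can be sketched in Python as follows. The function names, the space used as a separator, and the assumption that every header row already has one cell per column (spanning cells repeated) are all illustrative choices, not details taken from StatCrunchThis itself.

```python
def merge_header_rows(header_rows, sep=" "):
    """Join the header cells above each column into one column name.

    Assumes header_rows is non-empty and that cells spanning several
    columns have already been repeated for each column they cover.
    """
    width = max(len(row) for row in header_rows)
    names = []
    for col in range(width):
        parts = [row[col].strip() for row in header_rows if col < len(row)]
        names.append(sep.join(p for p in parts if p))
    return names


def order_by_size(tables):
    """Return the tables sorted so the one with the most cells comes first."""
    return sorted(tables, key=lambda t: sum(len(row) for row in t), reverse=True)


# Made-up header rows, for illustration only.
headers = [["Passing", "Passing", "Rushing"], ["Yds", "TD", "Yds"]]
print(merge_header_rows(headers))
# ['Passing Yds', 'Passing TD', 'Rushing Yds']

small = [["a"]]
large = [["a", "b"], ["c", "d"]]
print(order_by_size([small, large]) == [large, small])
# True
```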