About Us

What is Polar Data Insights?

JPL and USC, under the direction of Dr. Chris Mattmann, have worked to collect a corpus of “deep web” polar data sets spanning many file types, including images, videos, and other scientific information published on the Web. The data was collected using Apache Nutch (web crawling), Apache Tika (content and metadata extraction), and Apache Solr (indexing and search).

Our goal is to aggregate this data into an intuitive search engine that scientists can use for polar research. The data is also analyzed and illustrated with the visualization tools Banana and D3.js, giving researchers a better understanding of how the data sets relate to one another within the polar ecosystem.


Search Engine

Providing researchers with a powerful tool to find relevant data sets and websites.


Illustrating data set connections and related terms to narrow searches.


Demonstrating the value of these polar data sets to the NSF, USC, and NASA.


Banana for Solr. Search simplified.

Search multiple keywords simultaneously for thousands of relevant URLs.

Add filters for more refined results using Banana's live-updating visualizations.

Go to the Banana Dashboard.

D3.js. See for yourself.

View data sets from a variety of sources to better understand polar relationships.

View some of our visualizations.

Facetview. Experience Solr.

Filter searches using facets and easily save, share, and consume documents from the Deep Web.

Go to the Facetview Dashboard.
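
For readers curious how a faceted keyword search like this maps onto Solr, here is a minimal sketch that builds a request URL for Solr's standard select handler. It is an illustration only, not the code behind this site: the host, the core name `polar`, and the field name `content_type` are assumptions, and the helper `build_facet_query` is hypothetical.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, keywords, facet_fields, filters=None, rows=10):
    """Return a Solr select URL that ORs the keywords and requests facet counts.

    base_url is the URL of a Solr core (hypothetical here); filters maps a
    field name to the single value that results must match.
    """
    params = [
        # Quote each keyword so multi-word terms are treated as phrases.
        ("q", " OR ".join(f'"{kw}"' for kw in keywords)),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    # Each filter becomes an fq parameter that narrows the result set.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local Solr core named "polar" with a "content_type" field.
    print(build_facet_query(
        "http://localhost:8983/solr/polar",
        ["sea ice", "glacier"],
        ["content_type"],
        filters={"content_type": "image/jpeg"},
    ))
```

Sending the resulting URL to a running Solr instance would return matching documents along with counts per facet value, which is the kind of data a faceted dashboard can use to render its filters.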

USC Data Science Partner Sites

TREC/Data Description

The goal of the Text Retrieval Conference (TREC) is to encourage research in information retrieval from large text collections by providing interesting and understudied domains of documents to crawl.

Currently, the polar domain contains the NSF-funded Advanced Cooperative Arctic Data and Information Service (ACADIS), the NASA-funded Antarctic Master Directory (AMD), and the National Snow and Ice Data Center (NSIDC) Arctic Data Explorer. Our data was retrieved using these directories and submitted to TREC in 2015.

Polar Hack - November 2014

Hosted by the NSF, this hackathon aimed to implement visualizations of existing polar data sets to support new discoveries and promote cross-agency collaboration between the NSF, NASA, NOAA, and other Arctic- and polar-related agencies.

Ultimately, the workshop fostered an understanding of how the polar regions vary across different timescales, allowing the NSF to make longer-term investments in technologies and visualizations that can be adopted by the community.


The mission of the Information Retrieval and Data Science Group (I.R.D.S.) is to research and develop new methodologies and open-source software to analyze, ingest, process, and manage Big Data and to turn it into information.

We have expertise in data collection and contribute to some of the world's largest and most frequently downloaded open-source projects, working with NASA, DARPA, DHS, and NIH across a number of domains, including Earth science, planetary science, astronomy, defense, and private industry.


Dr. Chris Mattmann - Visit his website

CS401 Group (Lorraine Sposto, Jonathan Luu, Ruthvik Peddawandla, Titus Jung, Janet Kim)

CS599 Spring 2016 Class - Visit the class website

CS572 Spring 2015 Class - Visit the class website