A curated guide to the best tools, resources and technologies for data visualization

Analysis

GeoDa

GeoDa is a free and open source software tool that serves as an introduction to spatial data analysis. It is designed to facilitate new insights from data analysis by exploring and modeling spatial patterns.

Trifacta Wrangler

Trifacta Wrangler is specifically designed to make the data preparation process easier and faster. By providing a connected desktop application for users to visually explore, structure and publish dashboard-ready datasets, Trifacta Wrangler helps analysts deliver faster and more accurate analysis.

CSV Fingerprints

CSV Fingerprints aims to make it easy to spot errors in your dataset by providing a bird's-eye view of the file without too much distracting detail.
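The bird's-eye idea can be illustrated with a small sketch. This is not the CSV Fingerprints implementation. It is a minimal Python approximation that assumes three cell types (empty, numeric, text), and the names `classify_cell` and `fingerprint` are hypothetical. Each cell is reduced to a one-character code so that irregular rows stand out.

```python
import csv
import io


def classify_cell(value: str) -> str:
    """Return a one-character code: '.' empty, '#' numeric, 'a' text."""
    text = value.strip()
    if not text:
        return "."
    try:
        float(text)
        return "#"
    except ValueError:
        return "a"


def fingerprint(csv_text: str) -> list[str]:
    """Map every row of a CSV document to a string of cell-type codes."""
    reader = csv.reader(io.StringIO(csv_text))
    return ["".join(classify_cell(cell) for cell in row) for row in reader]


# Hypothetical sample data: row 3 has a missing age, row 4 has swapped types.
sample = "name,age,city\nAda,36,London\nBob,,Paris\nCy,n/a,7\n"
for line in fingerprint(sample):
    print(line)
```

Rows whose pattern differs from their neighbours (here the last two) are the ones worth inspecting first.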

Datacomb

An interactive tool for exploring large, tabular datasets.

agate

agate is a Python data analysis library that is optimized for humans instead of machines. It is an alternative to numpy and pandas that solves real-world problems with readable code.

Analyse-it

Analyse-it brings powerful statistical analysis and data visualisation into Microsoft Excel. All the statistical analysis you need, in an application you already know.

Bokeh

An interactive Python data visualization library that runs straight out of an IPython Notebook and can be easily shared.

Chart.io

Simple and Powerful Data Exploration

ContextMiner

ContextMiner is a framework to collect, analyze, and present the contextual information along with the data.

Visokio Omniscope

Visokio Omniscope is a versatile, multi-tab and multi-view interactive data analysis, filtering and presentation tool. It offers a powerful new way to visualise, explore and report on large tables of data – with related images, maps, links, and more – then lets you share your file with others using the free Viewer. Examples/references: Demos and screenshots.

Wizard

Wizard is built around pictures: pictures of your data, and pictures of statistical values. The innovative graphics will help you understand data quickly and explain statistical concepts. Textbook-style illustrations — based on your data — help p-values and confidence intervals spring to life. To export, just control-click and choose a crisp PDF or a web-ready PNG.

Wolfram Mathematica

Bring in your data, combine it with Wolfram Alpha’s ever-increasing store of knowledge, apply sophisticated symbolic and numeric analysis, and create state-of-the-art visualizations, all in one system, with one integrated workflow. Examples/references: Gallery of features.

Wordseer

WordSeer is a text analysis environment that combines visualization, information retrieval, sensemaking and natural language processing to make the contents of text navigable, accessible, and useful.

Wordstat

WordStat is flexible and easy-to-use text analysis software, whether you need text mining tools for fast extraction of themes and trends, or careful and precise measurement with state-of-the-art quantitative content analysis tools. WordStat’s seamless integration with SimStat (our statistical data analysis tool), QDA Miner (our qualitative data analysis software) and Stata (the comprehensive statistical software from StataCorp) gives you unprecedented flexibility for analyzing text and relating its content to structured information, including numerical and categorical data.

Wordtree

Text analysis tool that lets you paste a chunk of text and explore the frequency, position and usage of language within your chosen passage.

Wordwanderer

With WordWanderer we are experimenting with visual ways in which we can enhance people’s engagement with language. By fusing the information we can obtain from corpus searches, concordance outputs and word clouds we are aiming to enable and encourage people to notice and wander through the words they read, write and speak.

MonkeyLearn

Build apps with machine learning and sentiment analysis.

QuickCode

Formerly known as ScraperWiki, QuickCode is a Python and R data analysis environment, ideal for economists, statisticians and data managers who are new to coding.

Pandas for Python

pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Synerscope

SynerScope allows users to show, analyze, and understand relational “Big Data” visually by using edge-bundled, timeline-enhanced visualizations. SynerScope is powerful new software for analyzing massive amounts of data, where analysts have to examine millions of data points within a very short period of time.

Textalyser

An online text analysis tool that produces detailed statistics about your text. It is aimed at translators (quoting), webmasters (ranking) and general users who want to know the subject of a text. Newer features include analysis of word groups, keyword density, and the prominence of words or expressions.
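Keyword density is simply each word's share of the total word count. The sketch below is a minimal Python illustration of that calculation, not Textalyser's own code. The function name `keyword_density` and the simple tokenisation rule (lowercase letters and apostrophes) are assumptions made for the example.

```python
import re
from collections import Counter


def keyword_density(text: str, top: int = 3) -> list[tuple[str, float]]:
    """Return the `top` most frequent words with their share of all words."""
    # Assumed tokenisation: runs of letters/apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [(word, count / total) for word, count in counts.most_common(top)]


# "the" appears 3 times out of 8 words, "and" appears twice.
print(keyword_density("The cat and the hat and the bat", top=2))
```

A real tool would likely also filter stop words such as "the" and "and" before ranking, which this sketch deliberately leaves out.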

The R Project

R is a highly extensible, open source language and environment for data handling, statistical computing and graphical techniques.

Rcharts

rCharts is an R package to create, customize and publish interactive JavaScript visualizations from R using a familiar lattice-style plotting interface. rCharts supports multiple JavaScript charting libraries, each with its own strengths. Each of these libraries has multiple customization options, most of which are supported within rCharts. rCharts also allows you to share your visualization in multiple ways. You can save it as a standalone page, embed it in a Shiny application, or even include it as part of a blog post or tutorial.

Re:dash

Rethinking how data is queried, shared and visualized, re:dash is a web application that lets you easily query an existing database, share the dataset and visualize it in different ways. You can also create dashboards. re:dash is a work in progress: it has rough edges and some way to go before it reaches its full potential. The Query Editor is quite solid, but the visualizations need more work to enrich them and make them more user-friendly. Examples/references: More info.

Rstudio

RStudio is an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management. RStudio is available in open source and commercial editions and runs on the desktop (Windows, Mac, and Linux) or in a browser connected to RStudio Server or RStudio Server Pro.

Seaborn

Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

Silk

A Silk site lets you answer questions with your data by creating overviews, maps and visualizations. It works by using the connected information from your fact sheets. Silk sites that contain pages with numbers on prices, sizes or distances can create interactive charts that put all of your data in perspective.

Palladio

Palladio is a web-based platform for the visualization of complex, multi-dimensional data. It is a product of the “Networks in History” project that has its roots in another humanities research project based at Stanford: Mapping the Republic of Letters (MRofL). MRofL produced a number of unique visualizations tied to individual case studies and specific research questions. You can see the tools on this site and read about the case studies at republicofletters.stanford.edu.

Outwit

OutWit Hub explores the depths of the Web for you, automatically collecting and organizing data and media from online sources.

Shiny

Turn your analyses into interactive web applications. No HTML, CSS, or JavaScript knowledge required.

Leximancer

Leximancer enables you to navigate the complexity of text in a uniquely automated fashion.

Matplotlib

matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

Generatedata

It’s a free, open source tool written in JavaScript, PHP and MySQL that lets you quickly generate large volumes of custom data in a variety of formats for use in testing software, populating databases, and… so on and so forth.

Geotime

This award-winning visual analysis tool places an emphasis on visual presentations, introducing new ways to visualize events over time, including the ability to run statistical functions on numerical attributes within your data.

In-Spire

IN-SPIRE performs basic to complex text analysis.

Data Hero

With DataHero you can do more with your data. Quickly and easily get answers to your data questions. No IT. No business analyst. Just you and a few clicks. Simply upload your Excel spreadsheets or connect directly to the services you use every day and DataHero does the rest. DataHero’s powerful Data Decoder analyzes your data and shows you relevant visualizations and key insights. From there, simply drag-and-drop to filter and refine your charts, then share them with your colleagues and be the hero.

Data Science Toolkit

The Data Science Toolkit provides a range of open-source tools for data scientists assembled by Pete Warden.

DataBasic

DataBasic is a suite of easy-to-use web tools for beginners that introduce concepts of working with data. These simple tools make it easy to work with data in fun ways, so you can learn how to find great stories to tell. WordCounter analyzes your text and tells you the most common words and phrases. WTFcsv tells you WTF is going on with your .csv file. SameDiff compares two or more text files and tells you how similar or different they are.
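To give a feel for what a SameDiff-style comparison computes, here is a minimal Python sketch. How SameDiff itself measures similarity is not described here, so this example substitutes the standard library's `difflib.SequenceMatcher` applied to word lists. The function name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Return a 0.0-1.0 similarity score based on matching word sequences."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # ratio() is 2 * matched_words / total_words across both texts.
    return SequenceMatcher(None, words_a, words_b).ratio()


print(similarity("data tells a story", "data tells a story"))  # identical texts
print(similarity("data tells a story", "charts need clean labels"))  # no shared words
```

Comparing word lists rather than raw characters keeps the score tied to vocabulary and word order, which is closer to what a reader means by "similar" texts.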

Dataiku Dss

Data Science Studio (DSS) is a software platform that aggregates all the steps and big data tools necessary to get from raw data to production-ready applications. It shortens the load-prepare-test-deploy cycles required to create data-driven applications. Thanks to its visual and interactive workspace, it is accessible to both data scientists and business analysts.

R

A software environment for statistical computing and graphical techniques.