Tableau, a primer

Derrick Lewis
5 min read · Apr 22, 2019

As a budding Data Scientist, I have been polishing my formal statistical practices and fighting through complicated regression models. Many steps in the process require a quick visual check of how the data is organized. The common method is to use one of the tools built into (or rather onto) Python. These are great tools and have been adopted by the Data Science community for good reason: they are directly connected to your program and, for the most part, they get the job done.

However, even the flashiest of the Python-connected visualization tools is limited in its ability and sophistication in our ever-growing visual world. Further, these tools require figures and visualizations to be built with Python code. While manageable, that can certainly be a burden, and in terms of pure efficiency, investing time in minor graphical changes for the sake of better visual communication runs into the law of diminishing returns. Sometimes you just need a platform that has one job: creating the fastest and most accurate path to communicate quantitative information.

After banging my head against a piece of code to adjust the style of a distribution plot and force it to display alongside another distribution plot, I decided to punt. I was not dealing with a massive data set, and I could send it out to another program in a few seconds. I had worked for 40 minutes in Matplotlib to adjust a figure that I then created inside Tableau in less than a minute. I come from the real world, and I believe my future boss or boss’s boss will care more about interpreting the data at a glance than about how I got there. That doesn’t mean I’ll give up on the Python tools, but I hereby recognize when to wave the white flag and switch to another tool.
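The "send it out to another program" step can be as small as writing a CSV file, which Tableau can open as a text data source. Here is a minimal sketch using only the Python standard library; the function name `write_rows_for_tableau`, the column names, and the sample values are all made up for illustration.

```python
import csv
import io


def write_rows_for_tableau(rows, fieldnames, out):
    """Write a list of dict rows as CSV with a header row.

    `out` is any text file-like object (an open file or a StringIO).
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample data standing in for whatever you were plotting.
samples = [
    {"group": "a", "value": 1.2},
    {"group": "b", "value": 3.4},
]

# Writing to an in-memory buffer here; swap in
# open("samples.csv", "w", newline="") to produce a real file.
buffer = io.StringIO()
write_rows_for_tableau(samples, ["group", "value"], buffer)
csv_text = buffer.getvalue()
```

If your data already lives in a pandas DataFrame, its `to_csv` method does the same job in one line.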

So, here’s a quick primer on my new favorite tool, Tableau, and why it can allow you to show off to the right people and get back to coding faster.

What is it?

  • Visualization software that queries relational databases, online analytical processing cubes, cloud databases, and spreadsheets, and then generates a number of graph types
  • Built-in geocoding to represent administrative data, plot latitude and longitude coordinates, and connect to spatial files like Esri Shapefiles, KML, and GeoJSON
  • Applications can be built and displayed via Tableau Desktop, Server, Online, Reader, and Public
  • Acquired Empirical Systems, an AI company, with plans to integrate its intelligence into the software

A few of my favorite things

It’s gorgeous.

A group of some of the greatest artists and designers has created a path that keeps you, the user, from accidentally flooding your visualization with chart junk, offensive colors, or misleading perspectives.

It’s Easy

(if you want it to be)

When working through the analysis phase of a data science project, I often need to crank out many different views of the same data to look for patterns or trends. Tableau has drag-and-drop variables everywhere. I can bounce data from X to Y and add time, filters, sizes, colors, and shapes faster than in any other tool I’ve worked with. More importantly, once additional variables are added, it knows what to do, readjusting spacing, scale, and direction if needed.

Built for a story

For any of my eventual work to be valuable it will need to be communicated and understood broadly. I’ve found no other platform that drives toward storytelling better than Tableau.

The big finale of your favorite movie would not communicate much to you if you didn’t also follow a path to get there. It’s important to do this with your data. Tableau allows you to build multiple interactive pages that follow a path and can be viewed online by anyone.

Training tools

As an obvious bit of marketing, Tableau knows that the easier its software is to learn, the more it will be requested by the employees of businesses. Thus, the learning side of the Tableau website is incredibly robust, with easy-to-follow videos and simple explanations.

It’s Free-ish

For my sporadic use, there is a version of Tableau that is free. The public version is limited in the data sources it can pull from, and all projects must be saved to the Tableau Public server.

A Few Unfavorite Things

It’s not free

To really get at the features, you’ll need to pay. As this software grows in sophistication, so does the price. Tableau knows its core audience is a big firm with money to spend.

Big Datasets

In the free version, I’m not able to handle charts beyond 1,000 rows. For me, Tableau is for summary, not execution. Thus, if I’m still working with more than 1,000 rows, I could probably find a way to aggregate the data first to paint a picture.
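Aggregating first can be done in a few lines before the data ever leaves Python. The sketch below collapses raw rows into one summary row per group using only the standard library; the helper `summarize_by`, the `region` and `sales` columns, and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_by(rows, key, value):
    """Collapse raw rows into one row per distinct `key`.

    Each output row carries the group label, the number of raw rows
    in the group, and the mean of the `value` column.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return [
        {key: label, "count": len(values), "mean_" + value: mean(values)}
        for label, values in sorted(groups.items())
    ]


# Hypothetical raw data: many rows in practice, three here.
raw = [
    {"region": "east", "sales": 10.0},
    {"region": "west", "sales": 20.0},
    {"region": "east", "sales": 30.0},
]

summary = summarize_by(raw, "region", "sales")
```

The summary has one row per region regardless of how many raw rows went in, which is usually what a chart needs anyway. With pandas, a `groupby` followed by `agg` would be the more common route to the same result.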

Not connected

This primer is very specifically about using the free version as a supplement to working with formal statistics in Python. Can Tableau be connected to your changing dataset to create a living dashboard? Of course, but that is not how I use it here.

The greatest camera is the one you have with you.

Life happens fast and doesn't wait for you to adjust the settings on your complicated tool.

So, would I be a more ‘L33T’ programmer by becoming a maestro of Matplotlib? Yes. But until I get there, I’ll gladly use the simplicity and beauty of a tool built to move quickly and move on.
