Interactive Canvas-based dc.js scatter plots for performant data visualizations

Update (09/18/2019): My pull request for this feature was adopted into dc-js, so this canvas-based scatterplot implementation can now be used directly within the dc.js library.

dc.js is an excellent visualization library that enables easy slicing and dicing of multi-dimensional data with beautifully rendered and responsive SVG charts. While SVG-based rendering makes it easy to generate animated, data-driven charts and visualizations, managing thousands of SVG elements in the DOM can result in a lot of overhead. When plotting several thousand data points, SVG becomes a bottleneck and the user experience suffers greatly. This is particularly evident when using dc.js scatter plots, as typical use cases involve applying 2D brushes on these plots to interactively filter datasets.

On the other hand, HTML5 Canvas-based visualizations involve very little overhead since they are raster-based: points are painted as pixels rather than kept as individual elements in the DOM. Thus, canvas visualizations are typically limited only by the speed at which the client can process and pipe data for plotting to the canvas element. This improvement in performance unfortunately comes at the expense of having to write more verbose and “low-level” code to generate data visualizations that would only take a few lines of code when using d3 and SVG graphical elements.
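To give a sense of what that lower-level code looks like, here is a minimal sketch of plotting points with the Canvas 2D API. The names `drawPoints` and `linearScale`, and the `{x, y, color}` point shape, are hypothetical and introduced only for illustration; this is not the code used in the library.

```javascript
// Minimal sketch: draw scatter points as filled circles on a 2D canvas context.
// `drawPoints`, `linearScale`, and the {x, y, color} point shape are
// illustrative assumptions, not part of dc.js.
function drawPoints(ctx, points, xScale, yScale, radius, width, height) {
  // A canvas keeps no per-point state, so each redraw starts from a blank surface.
  ctx.clearRect(0, 0, width, height);
  for (const p of points) {
    ctx.beginPath();
    // Map data coordinates to pixel coordinates and trace a full circle.
    ctx.arc(xScale(p.x), yScale(p.y), radius, 0, 2 * Math.PI);
    ctx.fillStyle = p.color;
    ctx.fill();
  }
  return points.length;
}

// Hand-rolled linear scale standing in for what a d3 scale would provide.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (v) => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical browser usage:
//   const canvas = document.querySelector('canvas');
//   const ctx = canvas.getContext('2d');
//   drawPoints(ctx, data,
//              linearScale([0, 10], [0, canvas.width]),
//              linearScale([0, 10], [canvas.height, 0]),
//              2, canvas.width, canvas.height);
```

With SVG and d3, the equivalent would be a short data join that appends one circle element per point; here the loop, the coordinate mapping, and the clearing of the previous frame all have to be written by hand.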

Currently, the dc.js scatter plot implementation struggles to handle more than a few thousand points. The appeal of dc.js charts lies in their smooth animations and near-instantaneous chart response when filtering along different dimensions of the data being explored. To overcome the limitations of the current SVG-based implementation, I spent some time implementing a canvas-based dc.js scatter plot library, and it is at a point where I feel comfortable sharing it with everyone else. The code can be found in my GitHub repo:

dcjs-canvas-scatterplot GitHub Repo

Demo and comparison between Canvas vs. SVG

Compare the following two implementations of identical pairs of interactive dc.js plots that visualize scatter points for two pairs of features. The data points are color coded by the class they belong to (0 or 1). There are a total of 20,000 data points (i.e. observations) in this example, so the SVG implementation is pushed very hard by having to manage 40,000 SVG circle elements in the DOM across the two plots. The canvas implementation is much more responsive and performant while maintaining a near-identical UI experience in comparison to the SVG implementation.

Some key things to note about this library are:

  1. It is a drop-in replacement for the current dc.scatterPlot implementation and is fully backward compatible with it. This means that the SVG backend can still be used if desired, in which case the scatterPlot object behaves identically to the current dc.js scatterPlot implementation.
  2. The Canvas backend does not support any custom transition animations when hovering over legends or applying filters, as is the case with the SVG backend. Transitions are trickier and more tedious when using Canvas, and since the primary use case of the canvas backend is when performance is critical, it seems reasonable to remove these aspects from the chart in order to make it feel more responsive and performant.
  3. Currently, this library only supports circle symbol types for the Canvas backend. Since dc.js relies on d3 version 3, there is no easy way to create shape generators that can render d3.symbol elements to a canvas backend. This is trivial in D3v4 but has not been implemented yet, in order to keep the library lightweight and so as not to force users to depend on multiple versions of D3.
  4. The library utilizes a hybrid SVG + Canvas approach to rendering the canvas charts. An SVG element is used to draw all axes, legends, etc. A canvas element is aligned with and overlaid on the SVG element in order to plot the scatter points. To achieve this alignment, the following CSS properties are applied only when using the canvas backend:
     a. The anchor div (i.e., the parent element supplied to dc.scatterPlot) is modified to have a CSS style of position: relative.
     b. The SVG element is styled with position: relative.
     c. The Canvas element is styled with position: absolute; z-index: -1; pointer-events: none. If the SVG element has any top/left properties, these are applied to the Canvas element in order to try and align it perfectly with the SVG element.
     These CSS stylings may cause conflicts in more complex UIs, though they are fairly non-intrusive. In such cases, modify your CSS to ensure correct alignment of the Canvas element with respect to the SVG element.
  5. By default, all scatter plots use the SVG backend for rendering unless useCanvas(true) is called on the scatterPlot chart object during initialization.
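The styling rules from item 4 can be sketched as a small helper. This is only an illustration of the described behavior, not the library's actual code: `alignCanvasOverSvg` is a hypothetical name, and the function assumes nothing more than objects exposing a DOM-like `style` property.

```javascript
// Minimal sketch of the overlay styling described in item 4.
// `alignCanvasOverSvg` is a hypothetical helper, not the library's own code.
function alignCanvasOverSvg(anchor, svg, canvas) {
  // The anchor becomes the positioning context for the absolutely placed canvas.
  anchor.style.position = 'relative';
  svg.style.position = 'relative';

  canvas.style.position = 'absolute';
  canvas.style.zIndex = '-1';
  // With pointer-events disabled, mouse input is never targeted at the canvas,
  // so interactions such as brushing stay with the SVG element.
  canvas.style.pointerEvents = 'none';

  // Mirror any top/left offsets set on the SVG so the two layers line up.
  for (const prop of ['top', 'left']) {
    if (svg.style[prop]) {
      canvas.style[prop] = svg.style[prop];
    }
  }
  return canvas;
}

// Hypothetical browser usage, assuming the chart was anchored at '#plot'
// and created with useCanvas(true) as described in item 5:
//   const anchor = document.querySelector('#plot');
//   alignCanvasOverSvg(anchor,
//                      anchor.querySelector('svg'),
//                      anchor.querySelector('canvas'));
```

If a page's own stylesheet overrides any of these properties, a helper along these lines is one way to reapply them after the chart renders.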

I am going to submit a pull request for this feature and, with any luck, it may end up getting pulled into dc.js in some future release. In the meantime, it is pretty straightforward to incorporate it into existing projects right away, as described in the “Installation” section of the GitHub repo.

Feel free to fork the repo and improve the library in any way you see fit. And do drop me a note below if you have any comments/feedback on this library.