One of the neatest and simultaneously most confusing aspects of D3 is its pair of data binding methods, selection.data and selection.datum, which bind data to elements in the DOM. These methods, while seemingly simple, enable the generation of very complex data visualizations by keeping visualization elements in the DOM closely coupled to the data being visualized. However, if you browse through the D3 API, it can be difficult to discern the difference between selection.data and selection.datum and when to prefer one over the other. This post aims to clarify the subtle and not-so-subtle differences between these methods. I assume that you already have a decent idea of how data binding works in D3, including concepts such as the enter, update and exit selections. If you need a quick refresher or are unfamiliar with these concepts, then I would highly recommend the following three resources:
- Dashing D3 – Binding data to DOM elements
- Mike Bostock – Thinking with Joins
- Mike Bostock – How Selections Work
So let’s first begin with a good definition of data versus datum.
Datum – an item of factual information derived from measurement or research (singular of data)
Compare this against the definition for Data
Data – a collection of facts from which conclusions may be drawn (plural of datum)
The gist is this: datum refers to a single unit of data, whereas data refers to a collection of such units.
So let’s now delve into each of these methods with this understanding of the meaning of data and datum. First, we should note that both selection.data and selection.datum behave differently depending on whether data is passed in as an input argument. When no data is passed in, these methods act as “getter” methods that return the data/datum already bound to elements in the selection. Because of this difference, we will treat each case separately below.
Case 1: When data is supplied as an input argument
selection.data(data)
selection.data(data) will attempt to perform the usual D3 data-join that we are all familiar with. This data-join occurs between elements in the data array and element(s) in the selection. Data elements that match existing elements in the selection become part of the default update selection, which is the selection that selection.data(data) itself returns (there is no update() method). Selection elements with no matching data elements are placed in the exit() selection, whereas data elements with no matching DOM elements result in placeholder (virtual) elements that are accessible through the enter() selection. The end result is that if you pass in an array data = [{x: 1}, {x: 2}, {x: 3}], an attempt is made to join each individual data element or datum (for example, {x: 1}) with an element of the selection. Each element of the selection (virtual or real) will have only a single datum from data bound to it.
If data contains only a single data element or datum (e.g. data = [{x: 1}]) while selection contains many elements, then only the first selection element has the datum {x: 1} bound to it (without a key function, matching is by index), with all the other selection elements being placed in the exit() selection.
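The index-based join described above can be modelled in a few lines of plain JavaScript. This is only a simplified sketch and not D3's implementation: joinByIndex is a hypothetical helper, plain objects stand in for DOM elements, and key functions are ignored. The one D3 detail it borrows is that a bound datum lives on the element's __data__ property.

```javascript
// Simplified model of selection.data(data) without a key function.
// Plain objects stand in for DOM elements; joinByIndex is a hypothetical
// helper for illustration, not part of D3.
function joinByIndex(elements, data) {
  const update = [];
  const enter = [];
  const exit = [];
  const n = Math.max(elements.length, data.length);
  for (let i = 0; i < n; i++) {
    if (i < elements.length && i < data.length) {
      elements[i].__data__ = data[i]; // existing element gets the i-th datum
      update.push(elements[i]);
    } else if (i < data.length) {
      enter.push({ __data__: data[i] }); // placeholder for a missing element
    } else {
      exit.push(elements[i]); // element with no datum left for it
    }
  }
  return { update, enter, exit };
}

// Three data elements, one existing element: one update, two enter placeholders.
const grow = joinByIndex([{ tag: "p" }], [{ x: 1 }, { x: 2 }, { x: 3 }]);
console.log(grow.update.length, grow.enter.length, grow.exit.length); // 1 2 0

// One datum, three existing elements: the first is updated, the rest exit.
const shrink = joinByIndex([{ tag: "p" }, { tag: "p" }, { tag: "p" }], [{ x: 1 }]);
console.log(shrink.update.length, shrink.enter.length, shrink.exit.length); // 1 0 2
```

In both runs, every element that ends up in update or enter carries exactly one datum, which is the point of the join.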
selection.datum(data)
selection.datum(data) bypasses the data-join process altogether. This command is essentially stating that you want to set the datum for every selection element to be data. If you look back at the definition of datum (a singular element of data), this makes sense. We are essentially setting the singular element of data for each selection element using this method. As a result, if you pass in an array data = [{x: 1}, {x: 2}, {x: 3}] to selection.datum(data), each selection element in selection will have the same array bound to it. So each selection element’s bound data in __data__ will be [{x: 1}, {x: 2}, {x: 3}].
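The same behaviour can be sketched in plain JavaScript. Again this is a simplified model under stated assumptions rather than D3's code: setDatum is a hypothetical helper, plain objects stand in for DOM elements, and only the constant-value form is covered.

```javascript
// Simplified model of selection.datum(value) with a constant value.
// setDatum is a hypothetical helper for illustration, not part of D3.
function setDatum(elements, value) {
  for (const el of elements) {
    el.__data__ = value; // no join: every element gets the same value
  }
  return elements;
}

const data = [{ x: 1 }, { x: 2 }, { x: 3 }];
const elements = setDatum([{ tag: "p" }, { tag: "p" }], data);

console.log(elements.length); // 2 (no elements enter or exit)
console.log(elements[0].__data__ === data); // true
console.log(elements[1].__data__ === data); // true (the same array, not a copy)
```

The real method also accepts a function that is evaluated per element, which this sketch leaves out.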
It is important to note the key differences between the two methods. selection.datum() will bind the provided data as a unit to every element in selection. Meanwhile, selection.data() will data-join individual elements within the data array to the selection.
selection.datum(data) binds the same datum as selection.data([data]), but only if selection contains a single element. In that case, [data] is a single-element array, so the data-join using selection.data is equivalent to assigning the datum data to the same selection using selection.datum(data). However, if selection has multiple elements, selection.datum(data) will assign data to each of those selection elements, whereas selection.data([data]) would only data-join [data] with the first selection element.
Case 2: When no data is supplied as an input argument
selection.data()
selection.data() will take the bound datum for each element in the selection and combine them into an array that is returned. So, if your selection includes 3 DOM elements with the data {x: 1}, {x: 2} and {x: 3} bound to each respectively, selection.data() returns [{x: 1}, {x: 2}, {x: 3}]. Note that if selection is a single element with (by way of example) the datum "a" bound to it, then selection.data() will return ["a"] and not "a" as some may expect.
selection.datum()
selection.datum() only makes sense for a single-element selection, as it is defined as returning the datum bound to the first element of the selection. So in the example above, with the selection consisting of DOM elements with bound datum of {x: 1}, {x: 2} and {x: 3}, selection.datum() would simply return {x: 1}.
Note that even when selection has a single element, selection.datum() and selection.data() return different values. The former returns the bound datum for the selection ("a" in the example above), whereas the latter returns the bound datum within an array (["a"] in the example above).
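The two getter forms can be contrasted with a small plain-JavaScript model. As before, this is a sketch under assumptions and not D3's implementation: getData and getDatum are hypothetical helpers, and plain objects with a __data__ property stand in for DOM elements that already have data bound.

```javascript
// Simplified model of the getter forms. getData and getDatum are
// hypothetical helpers for illustration, not part of D3.
function getData(elements) {
  return elements.map((el) => el.__data__); // always an array
}

function getDatum(elements) {
  // Only the first element is consulted.
  return elements.length > 0 ? elements[0].__data__ : undefined;
}

const three = [{ __data__: { x: 1 } }, { __data__: { x: 2 } }, { __data__: { x: 3 } }];
console.log(JSON.stringify(getData(three))); // [{"x":1},{"x":2},{"x":3}]
console.log(JSON.stringify(getDatum(three))); // {"x":1}

const one = [{ __data__: "a" }];
console.log(JSON.stringify(getData(one))); // ["a"]
console.log(getDatum(one)); // a
```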
Hopefully this helps clarify how selection.data and selection.datum differ from each other, both when providing data as an input argument and when querying for the bound datum by not providing any input arguments. Feel free to leave a comment below if you have anything you’d like to add to this discussion.
PS – In case this post seems a bit familiar to you, I should mention that this is a more detailed version of a response I posted on Stack Overflow a little while back.
