Clustering and forces in D3JS

The objective is to use D3JS’ network visualisation to cluster groups of nodes based on the link value.

The full code and output of this example can be found here.

First, I started with heybignick’s awesome network which itself is an adaptation of Mike Bostock’s network – but with node labels.

The bit that we are interested in is this:

var simulation = d3.forceSimulation()
    .force("link", d3.forceLink().id(function(d) { return d.id; }))
    .force("charge", d3.forceManyBody())
    .force("center", d3.forceCenter(width / 2, height / 2));

This section of code is responsible for the placement of nodes in the visualisation, and d3.forceSimulation() is the key function. A full description of what it does can be found here.

In the original code, three forces determine the placement of nodes: link, charge, and center.

The charge force is what makes nodes repel each other. For this you need to use d3.forceManyBody(). In the above code the strength is left at its default, which is -30. If you want to change it, you will have to add .strength() with the desired value. A negative strength value pushes nodes away from each other. If you want the nodes to attract each other, you will have to change the value to a positive number.
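As a minimal sketch of changing the charge, the snippet below wraps the call in a small helper. The helper name chargeForce and the value -100 are my own illustrative choices, and d3 is passed in as a parameter only so the function does not depend on how the library was loaded.

```javascript
// Hypothetical helper (not part of D3): builds a many-body force with an
// explicit strength. Negative values repel, positive values attract;
// the default in D3 is -30.
function chargeForce(d3, strength) {
  return d3.forceManyBody().strength(strength);
}

// Usage, assuming `simulation` is the forceSimulation shown above:
// simulation.force("charge", chargeForce(d3, -100)); // stronger repulsion
```

Re-registering a force under the same name ("charge") replaces the previous one, so this can be used to experiment with different strengths.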

The center force helps place the centre of the network graph in the svg. To do that you need to use d3.forceCenter() with the x,y coordinates of where you want the network centre to be. In the above code this is set to half the height and half the width of the svg. No need to change that.

The link force is really the one we are interested in, as this is responsible for the distance between nodes. For this you need to use the d3.forceLink() function. In the above example this is followed by id(), which tells the force to match each link's source and target to nodes by their name (id).
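To make the role of id() more concrete, here is a rough sketch of the kind of lookup the link force performs with it: the id strings in each link are matched to node objects through the accessor. The sample data and the helper name resolveLinks are hypothetical; in the real graph this happens inside d3.forceLink().

```javascript
// Hypothetical sample data: links refer to nodes by their id string
var nodes = [{ id: "node1" }, { id: "node2" }, { id: "node3" }];
var links = [{ source: "node1", target: "node2", value: 50 }];

// Illustrative helper (not part of D3): uses an id accessor to swap the
// id strings in each link for the matching node objects
function resolveLinks(nodes, links, id) {
  var nodeById = new Map(nodes.map(function(d) { return [id(d), d]; }));
  return links.map(function(link) {
    return {
      source: nodeById.get(link.source),
      target: nodeById.get(link.target),
      value: link.value
    };
  });
}

var resolved = resolveLinks(nodes, links, function(d) { return d.id; });
console.log(resolved[0].source.id, "->", resolved[0].target.id); // node1 -> node2
```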

If we run the script now, this is what we get:

Notice that the nodes are at similar distances from each other. The only exception is node3, which according to the data has no links at all and is therefore being pushed away by the charge force.
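If you want to check in advance which nodes will drift away like node3, a small helper can list the nodes that no link mentions. This is only a sketch: the data below is a made-up stand-in for the real file, and unlinkedNodeIds is a hypothetical name.

```javascript
// Hypothetical data in the same shape as the example: node3 has no links
var graph = {
  nodes: [{ id: "node1" }, { id: "node2" }, { id: "node3" }],
  links: [{ source: "node1", target: "node2", value: 50 }]
};

// Illustrative helper: returns the ids of nodes that appear in no link.
// Assumes source and target are still id strings, i.e. the data has not
// yet been handed to the simulation.
function unlinkedNodeIds(graph) {
  var linked = new Set();
  graph.links.forEach(function(l) {
    linked.add(l.source);
    linked.add(l.target);
  });
  return graph.nodes
    .map(function(d) { return d.id; })
    .filter(function(id) { return !linked.has(id); });
}

console.log(unlinkedNodeIds(graph)); // node3 is the only id returned
```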

This is where we add .distance() to the code, which will allow us to set the distance between two nodes using the value of the link between them. We do this by extracting the value of each link using function(d){return d.value;}. Like this:

.force("link", d3.forceLink().id(function(d){return d.id;}).distance(function(d){return d.value;}))
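One thing to watch for: if a link has no value, the accessor returns undefined, which is not a usable distance. Below is a minimal sketch of a guarded accessor. The name linkDistance and the choice of 30 as the fallback (the default distance of d3.forceLink()) are my assumptions, not part of the original code.

```javascript
// Hypothetical guarded accessor: use the link's value as the distance when
// it is a finite, non-negative number, otherwise fall back to 30
// (the default distance of d3.forceLink()).
var FALLBACK_DISTANCE = 30;

function linkDistance(d) {
  var v = Number(d.value);
  return (d.value != null && isFinite(v) && v >= 0) ? v : FALLBACK_DISTANCE;
}

// Usage with the link force from the article:
// .force("link", d3.forceLink().id(function(d){return d.id;}).distance(linkDistance))

console.log(linkDistance({ value: 80 })); // 80
console.log(linkDistance({}));            // 30
```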

This is what the graph looks like now.

Note that the distance between nodes now varies according to the values of each link.

Since I am interested in clusters rather than networks, I don’t really need to see the edges, so I’ve changed stroke-opacity: 1.0; in the CSS (line 13) to 0, which gives me the following graph.

Now it’s starting to look like a proper cluster.

The takeaway is the role that both the link value and the force simulation charge play in the placement of nodes:

  • A high link value will place two nodes far apart
  • A low link value will place two nodes close to each other
  • Nodes with no links at all will be pushed away by the charge
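Because the link value is used directly as the distance in the simulation's coordinates (effectively pixels here), raw values on a very different scale may need mapping into a sensible range first. The following is only a sketch of one way to do that with a linear rescale; the function name rescaleValues and the 20 to 100 range are illustrative assumptions.

```javascript
// Illustrative helper (hypothetical name): linearly maps each link's value
// into [minDist, maxDist], so the smallest value gives the shortest link
// and the largest value gives the longest.
function rescaleValues(links, minDist, maxDist) {
  var values = links.map(function(l) { return l.value; });
  var lo = Math.min.apply(null, values);
  var hi = Math.max.apply(null, values);
  var span = hi - lo;
  return links.map(function(l) {
    // If all values are equal, place every link in the middle of the range
    var t = span === 0 ? 0.5 : (l.value - lo) / span;
    return {
      source: l.source,
      target: l.target,
      value: minDist + t * (maxDist - minDist)
    };
  });
}

var scaled = rescaleValues([
  { source: "node1", target: "node2", value: 1 },
  { source: "node2", target: "node4", value: 5 },
  { source: "node1", target: "node4", value: 9 }
], 20, 100);
console.log(scaled.map(function(l) { return l.value; })); // [ 20, 60, 100 ]
```

The rescaled links can then be passed to the link force in place of the raw ones, with .distance() reading d.value as before.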

To make the clustering work, however, it is important to prepare the data and make sure that the link values are on a scale that works as distances between nodes.

But that’s a different story for another day…
