From Edge List to User List using R

The objective is to turn an edge list into a user list, and then apply Bradford’s Law from my previous post.

My raw data is in edge list format, where each row represents an interaction between two community members (actors), and the actor names are separated by a single space, like this

actor1 actor2
actor2 actor3

So here we have an interaction between actor1 and actor2, and a second interaction between actor2 and actor3. What we want to do is turn this edge list into a one column actor list, like this

[1]   actor1
[2]   actor2
[3]   actor2
[4]   actor3

Notice that we have actor2 twice. That’s fine; that’s exactly what we want. We want to know how many interactions each actor was involved in, so we can apply Bradford’s Law later on.

So the first step is to load the data and put it in a data frame called dfEL.


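I first had to read the file as a table with read.table( ), and then convert the result to a data frame. Strictly speaking, read.table( ) already returns a data frame, so the conversion is harmless but not required.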
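As a minimal sketch of the loading step: the temporary file and its two sample rows below are hypothetical stand-ins for the real edge list, and as.data.frame( ) is only one possible way to do the conversion described above.

```r
# Hypothetical stand-in for the real edge list file: two interactions,
# actor names separated by a single space
edge_file <- tempfile(fileext = ".txt")
writeLines(c("actor1 actor2", "actor2 actor3"), edge_file)

# read.table() splits on whitespace by default; header = FALSE because
# the file has no column names
dfEL <- read.table(edge_file, header = FALSE, stringsAsFactors = FALSE)

# Explicit conversion, kept to mirror the step described in the text
# (read.table() already returns a data frame, so this changes nothing)
dfEL <- as.data.frame(dfEL)

print(dfEL)
```

With your own data, replace edge_file with the path to your edge list.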

Now that we have the data frame, we want to append the second column after the first column, using the rbind( ) command, which binds the data by rows. But this only works if both columns have the same name, because rbind( ) matches data frame columns by name and stops with an error when the names differ. So let’s give both columns the same name.

colnames(dfEL) <- c("that","that")

In truth it doesn’t matter what I name them, as long as they both end up with the same name. Here I’ve been lazy and labelled both of them “that”.
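To see why the names matter, here is a small sketch on a toy edge list (the two rows are made up for illustration). It tries rbind( ) before and after the renaming and captures the error instead of stopping.

```r
# Toy edge list with R's default column names V1 and V2
dfEL <- data.frame(V1 = c("actor1", "actor2"),
                   V2 = c("actor2", "actor3"),
                   stringsAsFactors = FALSE)

# With different names, rbind() refuses to stack the two columns;
# tryCatch() captures the error so the script keeps running
mismatch <- tryCatch(rbind(dfEL[1], dfEL[2]), error = function(e) e)
print(inherits(mismatch, "error"))

# After giving both columns the same name, the same call succeeds
colnames(dfEL) <- c("that", "that")
stacked <- rbind(dfEL[1], dfEL[2])
print(nrow(stacked))
```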

Before we move on, let’s check the number of rows in dfEL (nrow( ) will do it), because we want to make sure we get the desired result afterwards. In the data frame I am using, I have a total of 1430 rows. But this is an edge list with two actors per row, so if I do things right, my actor list will be twice as long, with 2860 rows.
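One way to get the row count is nrow( ). The toy data frame below is an assumption standing in for the real 1430-row edge list.

```r
# Toy edge list standing in for the real data
dfEL <- data.frame(V1 = c("actor1", "actor2"),
                   V2 = c("actor2", "actor3"),
                   stringsAsFactors = FALSE)

# Number of rows, i.e. number of interactions in the edge list
n_edges <- nrow(dfEL)
print(n_edges)
```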

Anyway, now it’s time to try the rbind( ) and put the actor list in a new data frame called dfAL, like this

dfAL <- rbind(dfEL[1],dfEL[2])

where rbind( ) is first writing column one (dfEL[1]), and then appending column two (dfEL[2]) from our dfEL data frame.

So now if we check the number of rows in our actor list with nrow( ), we should end up with twice the number of rows in our edge list. In my case I have 2860, spot on.
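The stacking step and the row-count check can be sketched together on a toy edge list (the sample rows are made up; the variable names follow the post).

```r
# Toy edge list; both columns share the name "that" so rbind() accepts them
dfEL <- data.frame(V1 = c("actor1", "actor2"),
                   V2 = c("actor2", "actor3"),
                   stringsAsFactors = FALSE)
colnames(dfEL) <- c("that", "that")

# Stack column two underneath column one
dfAL <- rbind(dfEL[1], dfEL[2])

# The actor list should have exactly twice as many rows as the edge list
print(nrow(dfAL))
stopifnot(nrow(dfAL) == 2 * nrow(dfEL))
```

The stopifnot( ) line turns the eyeball check into one that fails loudly if the counts ever disagree.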

But we’re not done yet.

Remember we had repeating names. Well, now we want to count the number of times each actor name repeats, and place that number in a new column. Sounds complicated, but the table( ) command can do that for us – easy. So all we have to write is

dfAL <- as.data.frame(table(dfAL))

I’ve placed it in the same data frame dfAL, which means that I’ve erased all previous data in the dfAL data frame, and replaced it with this new one. You can give it another name if you want to, and keep the old data … no problem.
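A minimal sketch of the counting step on a toy actor list. It assumes the counts are turned back into a data frame with as.data.frame( ), which is what produces a Freq column.

```r
# Toy actor list with actor2 appearing twice
dfAL <- data.frame(that = c("actor1", "actor2", "actor2", "actor3"),
                   stringsAsFactors = FALSE)

# table() counts how often each name occurs; as.data.frame() turns the
# counts into a two-column data frame with the count in a column named Freq
dfAL <- as.data.frame(table(dfAL))

print(dfAL)
```

To keep the uncounted actor list around, assign the result to a different name instead of overwriting dfAL.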

But now, if we take a quick look at the data frame, you will see that we have two columns, and the second one, Freq, is where we have the number of times each name repeats. Just out of personal preference I will label the first column “names”, like this

colnames(dfAL)[1] <- "names"

Now we want to order the names according to the number of times they appear on the list. We do this with

dfAL <- dfAL[with(dfAL,order(-Freq)),]

With this we place everything back into dfAL, but the rows are now ordered by the Freq column in descending order (hence the minus).
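The sorting step on its own, as a sketch over a toy frequency table (the counts are made up for illustration):

```r
# Toy frequency table in the shape produced by the counting step
dfAL <- data.frame(names = c("actor1", "actor2", "actor3"),
                   Freq = c(1, 2, 1),
                   stringsAsFactors = FALSE)

# order(-Freq) returns the row indices sorted from highest to lowest count;
# with() lets us refer to the Freq column without the dfAL$ prefix
dfAL <- dfAL[with(dfAL, order(-Freq)), ]

print(dfAL)
```

If the minus sign feels cryptic, order(dfAL$Freq, decreasing = TRUE) is another way to ask for descending order.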

That’s pretty much it!!

Now we can apply Bradford’s Law from my last post.

Lazy Man’s Copy+Paste
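Here are the steps above collected into one script, as a consolidated sketch. The temporary file and its two sample rows are hypothetical, so swap in the path to your own edge list, and the as.data.frame( ) calls are assumptions about how the conversions are done.

```r
# Hypothetical stand-in for the real edge list file
edge_file <- tempfile(fileext = ".txt")
writeLines(c("actor1 actor2", "actor2 actor3"), edge_file)

# Load the edge list into a data frame
dfEL <- as.data.frame(read.table(edge_file, header = FALSE,
                                 stringsAsFactors = FALSE))

# Give both columns the same name so rbind() will stack them
colnames(dfEL) <- c("that", "that")
print(nrow(dfEL))

# Append column two below column one to get the actor list
dfAL <- rbind(dfEL[1], dfEL[2])
print(nrow(dfAL))  # should be twice nrow(dfEL)

# Count how often each actor appears and label the first column
dfAL <- as.data.frame(table(dfAL))
colnames(dfAL)[1] <- "names"

# Sort by frequency, highest first
dfAL <- dfAL[with(dfAL, order(-Freq)), ]
print(dfAL)
```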

