Links Between Networks – Python

I have 43 networks where some of the actors overlap, and what I would like to do is to visualise this overlap.

The problem is that these networks combined have more than 40,000 nodes, so it would be very messy to stick everything into one network graph.

So the next best thing is to count the number of overlapping actors and use those as the edges and edge weight.

In other words, if you have three networks (g1, g2, and g3), and there are 5 actors that are part of both g1 and g2, you can represent this with two nodes (one for each network) linked by an edge with weight 5 (the number of overlapping actors). Like so…

From the graph we can then tell that g1 and g3 have 1 overlapping actor, while g3 and g2 have the highest number of overlapping actors, 6.
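
As a minimal sketch of the idea, the three-network example can be reproduced with plain Python sets. The actor names below are made up purely so that the overlaps come out as 5, 1 and 6; they are not taken from the real data.

```python
from itertools import combinations

# Hypothetical actor names, chosen only to reproduce the overlaps 5, 1 and 6
networks = {
    "g1": {"a1", "a2", "a3", "a4", "a5", "b1", "x1", "x2"},
    "g2": {"a1", "a2", "a3", "a4", "a5", "c1", "c2", "c3", "c4", "c5", "c6", "y1"},
    "g3": {"b1", "c1", "c2", "c3", "c4", "c5", "c6", "z1"},
}

# One weighted edge per pair of networks: weight = number of shared actors
edges = {
    (one, two): len(networks[one] & networks[two])
    for one, two in combinations(sorted(networks), 2)
}

for (one, two), weight in edges.items():
    print(one, two, weight)
```

The actors that appear in only one network (x1, x2, y1, z1) never show up in any edge weight, which is the point of the approach.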

This way, you are not worrying about the 1,000+ actors who are only active in one network; instead you are emphasising those who are moving across boundaries.

The massive problem is that you will have to compare the list of actors in two networks and pick the ones that overlap – over and over again. So for three networks you will have to do three comparisons:
1) between g1 and g2,
2) between g1 and g3, and
3) between g2 and g3.

With 3 networks it should be fine, but as the number of networks increases, so does the number of comparisons.

For the 43 networks I want to compare, I will have to do 903 comparisons.
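
The 903 comes from counting every unordered pair of networks once, which is n * (n - 1) / 2. A quick check (the function name is just for illustration):

```python
def n_comparisons(n_networks):
    # Each unordered pair of networks is compared exactly once
    return n_networks * (n_networks - 1) // 2

print(n_comparisons(3))   # 3
print(n_comparisons(43))  # 903
```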

Not going to happen – not manually anyway.

My laziness, therefore, propelled me to write a Python script. I built the engine this morning, which calculates the different combinations (all 903 of them).

Those combinations are then fed through a function that places all the network actors in two sets, intersects them, then appends the results to an output file with the corresponding network names.

The output file for the above example would read

g1 g2 5
g1 g3 1
g2 g3 6
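
Since each line is just two network names and a count separated by spaces, it should be easy to read back in as a weighted edge list for whatever graph tool you use. A minimal sketch, assuming network names contain no spaces (the helper name parse_edges is mine, not part of the script):

```python
def parse_edges(lines):
    """Turn 'g1 g2 5' style lines into (source, target, weight) tuples."""
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        source, target, weight = parts
        edges.append((source, target, int(weight)))
    return edges

# Sample lines copied from the example output above
sample = ["g1 g2 5\n", "g1 g3 1\n", "g2 g3 6\n"]
print(parse_edges(sample))
```

For a real run you would pass the open output file (_fullGStats.txt) instead of the sample list.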

To get the script to work, you should already have the list of actors in a text file that ends with _uList.txt (one file for each network).

For the above example, your data folder should contain:

g1_uList.txt
g2_uList.txt
g3_uList.txt

The lists inside these files contain the name of each actor and its degree, separated by a comma. Like so:

Actor1,10
Actor2,7
Actor3,5
...

Before running the code, you have to cd to the directory containing the files.

Here is the Python script…

# # # # # # # # # # # # # # # # #
# Jose Christian                #
# Batch comparison              #
# input: *_uList.txt            #
# output: _fullGStats.txt       #
# # # # # # # # # # # # # # # # #


from re import sub
from os import listdir

def getIntersection(threadOne, threadTwo):
    # reads the uList.txt files
    rFileOne = open(threadOne + "_uList.txt", "r")
    rFileTwo = open(threadTwo + "_uList.txt", "r")
    wFileOut = open("_fullGStats.txt", "a")

    # populates the first set (keeps the actor name, drops ",degree")
    dataFileOne = set()
    for line in rFileOne:
        line1 = sub("\n", "", line)
        lineF = sub(",.*$", "", line1)
        if lineF:  # ignores blank lines
            dataFileOne.add(lineF)

    # populates the second set
    dataFileTwo = set()
    for line in rFileTwo:
        line1 = sub("\n", "", line)
        lineF = sub(",.*$", "", line1)
        if lineF:  # ignores blank lines
            dataFileTwo.add(lineF)

    # intersects the two sets
    inter = dataFileOne.intersection(dataFileTwo)

    # finds the number (rather than names) of actors
    numb = len(inter)

    # formats output lines
    fullOutPut = threadOne + " " + threadTwo + " " + str(numb) + "\n"

    # appends line to output file
    wFileOut.write(fullOutPut)

    # just so you know what's going on (Python 3 print)
    print(fullOutPut, end="")

    # house-keeping
    wFileOut.close()
    rFileOne.close()
    rFileTwo.close()

# creates a list of all your uList files (sorted, so the output order is stable)
uFileList = []
dirList = sorted(listdir("."))
for eachFile in dirList:
    if eachFile.endswith("_uList.txt"):
        # drops the _uList.txt suffix to get the network name
        theID = eachFile[:-len("_uList.txt")]
        uFileList.append(theID)

# creates the combinations and feeds them one at a time to the function above
listLength = len(uFileList)
for i in range(1, listLength):
    for i2 in range(i, listLength):
        getIntersection(uFileList[i - 1], uFileList[i2])
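
If the index juggling in the nested loops looks suspicious, here is a small check that it visits every pair exactly once. It compares the same index pattern against itertools.combinations from the standard library; the network names and the helper loop_pairs are placeholders for illustration only.

```python
from itertools import combinations

def loop_pairs(names):
    # Same index pattern as the nested loops in the script
    pairs = []
    for i in range(1, len(names)):
        for i2 in range(i, len(names)):
            pairs.append((names[i - 1], names[i2]))
    return pairs

names = ["g1", "g2", "g3", "g4"]  # placeholder network names
print(loop_pairs(names) == list(combinations(names, 2)))
```

So the two loops could probably be swapped for a single loop over combinations(uFileList, 2) without changing the output.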
