Dot Graph for Directory – Python

How to create a dot graph for a directory.

The Problem

I wrote the dir2dot() Python function in my previous post… but it has one massive mistake.

If we have items in different folders that have the same name, the function gets confused and treats them as one and the same node.

For example, if we have the following directory:

m_f
    f_A
        s_f_1a
            file_1
        s_f_2a
    f_B
        s_f_1b
        s_f_2b
            file_1

What we should get is a graph that looks like this:
[Image: dirGood]

Instead, we get this:
[Image: dirBad]

This really sucks because there are two different files called “file_1”, but the function thinks they’re the same thing.

Therefore, to fix the problem, we will have to assign a unique id to each directory and file.

The Solution

Instead of writing a dot file that uses the item names as the node ids, like this:

graph{

    "m_f" -- {"f_A","f_B"}

    "f_A" -- {"s_f_1a","s_f_2a"}
    "f_B" -- {"s_f_1b","s_f_2b"}

    "s_f_1a" -- "file_1"
    "s_f_2b" -- "file_1"
}

we declare the nodes up front and give each one a unique id using dictionaries, so the dot file looks like this:

graph{

    n1[label="m_f"]
    n2[label="f_A"]
    n3[label="f_B"]
    n4[label="s_f_1a"]
    n5[label="s_f_2a"]
    n6[label="s_f_1b"]
    n7[label="s_f_2b"]
    n8[label="file_1"]
    n9[label="file_1"]

    n1 -- n2,n3
    n2 -- n4,n5
    n3 -- n6,n7
    n4 -- n8
    n7 -- n9

}

Mo’ Problems

We still have a bit of a problem when designating unique ids, because if we split each of the directory strings …

m_f
m_f/f_A
m_f/f_A/s_f_1a
m_f/f_A/s_f_1a/file_1
m_f/f_A/s_f_2a
m_f/f_B
m_f/f_B/s_f_1b
m_f/f_B/s_f_2b
m_f/f_B/s_f_2b/file_1

… on '/' and hand out ids by name, every m_f gets the same id (rightly so), but so does every file_1 (do not want).
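
To see the collision in action, here is a small sketch that splits the paths above on '/' and hands out ids by name. The dictionary ids_by_name and the n1, n2, … numbering are assumptions made for this demo, not the actual dir2dot() code.

```python
paths = [
    "m_f",
    "m_f/f_A",
    "m_f/f_A/s_f_1a",
    "m_f/f_A/s_f_1a/file_1",
    "m_f/f_A/s_f_2a",
    "m_f/f_B",
    "m_f/f_B/s_f_1b",
    "m_f/f_B/s_f_2b",
    "m_f/f_B/s_f_2b/file_1",
]

ids_by_name = {}
for path in paths:
    for name in path.split("/"):
        # setdefault only stores an id the first time a name shows up,
        # so a repeated name silently reuses the earlier id.
        ids_by_name.setdefault(name, f"n{len(ids_by_name) + 1}")

# Nine paths go in, but only eight ids come out:
# both file_1 entries collapse onto a single id.
print(len(paths), len(ids_by_name))
print(ids_by_name["file_1"])
```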

The Ultimate Solution

The solution to the world’s problems is to assign a unique id to each directory and file based on its complete path.

Therefore, it may look something like this:

graph{

	"m_f"[label="m_f"]
	"m_f/f_A"[label="f_A"]
	"m_f/f_A/s_f_1a"[label="s_f_1a"]
	"m_f/f_A/s_f_1a/file_1"[label="file_1"]
	"m_f/f_A/s_f_2a"[label="s_f_2a"]
	"m_f/f_B"[label="f_B"]
	"m_f/f_B/s_f_1b"[label="s_f_1b"]
	"m_f/f_B/s_f_2b"[label="s_f_2b"]
	"m_f/f_B/s_f_2b/file_1"[label="file_1"]



	"m_f" -- "m_f/f_A","m_f/f_B"
	"m_f/f_A" -- "m_f/f_A/s_f_1a","m_f/f_A/s_f_2a"
	"m_f/f_B" -- "m_f/f_B/s_f_1b","m_f/f_B/s_f_2b"
	"m_f/f_A/s_f_1a" -- "m_f/f_A/s_f_1a/file_1"
	"m_f/f_B/s_f_2b" -- "m_f/f_B/s_f_2b/file_1"

}
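
A dot file like the one above could be generated along these lines. This is only a sketch under a few assumptions: dir_to_dot is a hypothetical name (not the dir2dot() from the previous post), it walks the folder with os.walk, writes one edge per line rather than comma lists, and assumes no file or folder name contains a double quote.

```python
import os


def dir_to_dot(root):
    """Return DOT text for the tree under root, using full paths as node ids."""
    root = os.path.abspath(root)
    base = os.path.dirname(root)  # ids are written relative to root's parent

    def node_id(path):
        # Forward slashes keep the ids the same on every operating system.
        return os.path.relpath(path, base).replace(os.sep, "/")

    nodes = [node_id(root)]
    edges = []
    for current, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place fixes the order os.walk descends in
        parent = node_id(current)
        for name in dirnames + sorted(filenames):
            child = node_id(os.path.join(current, name))
            nodes.append(child)
            edges.append((parent, child))

    lines = ["graph {"]
    for node in nodes:
        label = node.rsplit("/", 1)[-1]  # last path component is the display name
        lines.append(f'    "{node}" [label="{label}"]')
    lines.append("")
    for parent, child in edges:
        lines.append(f'    "{parent}" -- "{child}"')
    lines.append("}")
    return "\n".join(lines)
```

Calling print(dir_to_dot("m_f")) from the folder that contains m_f should give declarations and edges equivalent to the listing above, just in a different order.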

Except that this looks really horrible, and the ids only get longer as the folders get deeper, so it won’t really work with large directories…
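
One possible way to tidy this up, sketched here purely as an assumption and not necessarily the proper solution, is to keep the full path as the dictionary key but write a short id such as n1 into the dot file. The helper short_ids and the variable names below are made up for this example.

```python
def short_ids(paths):
    """Map each full path to a short id (n1, n2, ...) in input order."""
    return {path: f"n{i}" for i, path in enumerate(paths, start=1)}


paths = [
    "m_f",
    "m_f/f_A",
    "m_f/f_A/s_f_1a",
    "m_f/f_A/s_f_1a/file_1",
    "m_f/f_A/s_f_2a",
    "m_f/f_B",
    "m_f/f_B/s_f_1b",
    "m_f/f_B/s_f_2b",
    "m_f/f_B/s_f_2b/file_1",
]
ids = short_ids(paths)

lines = ["graph {"]
for path, node in ids.items():
    label = path.rsplit("/", 1)[-1]  # only the last component is displayed
    lines.append(f'    {node} [label="{label}"]')
lines.append("")
for path, node in ids.items():
    parent = path.rpartition("/")[0]  # empty string for the top-level folder
    if parent in ids:
        lines.append(f"    {ids[parent]} -- {node}")
lines.append("}")

dot_text = "\n".join(lines)
print(dot_text)
```

The uniqueness still comes from the complete path; the short id is just a nickname for it, so the two file_1 entries end up on different ids.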

I’ll come back to this later with a proper solution…too busy right now.

To Be Continued …
