Inspired by the train-wreck that was yesterday’s post, I found a better way to calculate inequality and plot Lorenz curves: the ineq package in R.

**EDIT:** gini() in the reldist package also works.

## Installing and loading the Library

Download and install

> install.packages('ineq')

Now load the library

> library('ineq')

## The Data

I have a frequency list which looks a bit like this

    # user_posts.txt
    user            posts
    jose            2342
    BonQuisha       1564
    Kisha           1198
    ...             ...
    Takiera         2
    Tramicia        1
    Watermelondrea  1

so we load the file to a data frame.

> df <- read.csv('path/to_the_file/user_posts.txt',sep='\t')

The “posts” column contains the data that we want to analyse.
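If you want to try the loading step without the real file, here is a minimal sketch that writes a toy tab-separated file and reads it back the same way. The six rows are just the ones visible in the listing above (the elided middle rows are left out), and `sample_lines`, `tmp` and `sample_df` are names I made up for the illustration.

```r
# Toy sample only: the six rows visible in the listing, not the real data set.
sample_lines <- c(
  "user\tposts",
  "jose\t2342",
  "BonQuisha\t1564",
  "Kisha\t1198",
  "Takiera\t2",
  "Tramicia\t1",
  "Watermelondrea\t1"
)

# Write the sample to a temporary file so nothing on disk is overwritten.
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Same call as above: the first line becomes the header, tabs separate columns.
sample_df <- read.csv(tmp, sep = "\t", stringsAsFactors = FALSE)

# Check that "posts" came through as a numeric column.
str(sample_df)
```

`str()` is a quick way to confirm that the counts were parsed as numbers rather than text before handing the column to ineq.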

We want to do two things: first, calculate the Gini index (or coefficient), and second, plot a Lorenz curve.

## Gini Index

This is as simple as it gets

> ineq(df$posts, type='Gini')

and that returns

[1] 0.8724686

AWESOME!
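To see what that number means, here is a rough base-R sketch of the Gini index using the usual rank-weighted formula. `gini_sketch` is a hypothetical helper of mine, not something from ineq; as far as I can tell it should agree with the uncorrected value that ineq reports, but treat that as an assumption rather than a guarantee.

```r
# Hand-rolled Gini index (hypothetical helper, not part of the ineq package).
gini_sketch <- function(x) {
  x <- sort(as.numeric(x))  # order values from smallest to largest
  n <- length(x)
  # Rank-weighted formula: G = (2 * sum(i * x_i) / sum(x) - (n + 1)) / n
  (2 * sum(x * seq_len(n)) / sum(x) - (n + 1)) / n
}

g_spread <- gini_sketch(c(1, 2, 3, 4))   # moderately unequal
g_equal  <- gini_sketch(c(5, 5, 5))      # everyone posts the same amount
g_skewed <- gini_sketch(c(0, 0, 0, 10))  # one user does all the posting

print(c(g_spread, g_equal, g_skewed))
```

A value of 0 means perfectly equal participation, and the closer the value gets to 1, the more the posts are concentrated in a few users.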

## Lorenz Curve Plot

Again, this cannot be any simpler…

> plot(Lc(df$posts))

and that should give us something pretty basic like this

Which is nice and all, but we can always make it better by changing the labels, title and lines.

> plot(Lc(df$posts), xlab="User Percentile", ylab="Post Percentage", main="Participation Inequality", col="blue")

Now we get a much nicer graph
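If you are curious what goes into the curve, here is a minimal base-R sketch that builds the Lorenz points by hand and draws them. `lorenz_sketch` is a hypothetical helper and the input vector is toy data; this is my reading of how a Lorenz curve is defined, not a copy of what `Lc()` does internally.

```r
# Build Lorenz curve points by hand (hypothetical helper, toy data).
lorenz_sketch <- function(x) {
  x <- sort(as.numeric(x))           # smallest contributors first
  n <- length(x)
  list(
    p = (0:n) / n,                   # cumulative share of users
    L = c(0, cumsum(x)) / sum(x)     # cumulative share of posts
  )
}

lc <- lorenz_sketch(c(1, 2, 3, 4))

# Draw the curve plus the diagonal line of perfect equality.
plot(lc$p, lc$L, type = "l", col = "blue",
     xlab = "User Percentile", ylab = "Post Percentage",
     main = "Participation Inequality")
abline(0, 1, lty = 2)

# The Gini index is 1 minus twice the area under the curve (trapezoid rule).
area <- sum(diff(lc$p) * (head(lc$L, -1) + tail(lc$L, -1)) / 2)
gini_from_curve <- 1 - 2 * area
print(gini_from_curve)
```

The further the blue line sags below the dashed diagonal, the more unequal the participation, which is exactly what the Gini index summarises in one number.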

DONE!
