
Sorting Lists for Lorenz Curve – Python

EDIT: This does not really work. Needs lots and lots of fixes. Use R instead.

This function takes a list of 100 or more numbers, sorts it, divides it into percentiles, and returns the cumulative share of the total at each percentile. That list can be used to plot a Lorenz curve.
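To make the target concrete: each point on a Lorenz curve is the share of the total held by the lowest-ranked part of the population. Here is a small sketch with made-up numbers, written for Python 3. The `cumulative_shares` helper is hypothetical and is not part of the function below.

```python
from fractions import Fraction
from itertools import accumulate


def cumulative_shares(values):
    """Share of the total held by the 1, 2, ..., n smallest values."""
    ordered = sorted(values)
    total = sum(ordered)
    return [Fraction(running, total) for running in accumulate(ordered)]


# four people with 4, 1, 2 and 1 posts (made-up data):
# the lowest two of the four hold a quarter of all posts
shares = cumulative_shares([4, 1, 2, 1])
print([float(s) for s in shares])  # [0.125, 0.25, 0.5, 1.0]
```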

The Function

from decimal import Decimal
from numpy import cumsum

def listPercentiles(list_items):
    
    # check to see if there are at least 100 items
    if len(list_items) < 100:
        print('-- List is not long enough! --')

    else:
        # people in each percentile
        int_per_cent = len(list_items)//100

        # total entries
        int_total_entries = sum(list_items)

        # sort from lowest to highest (on a copy, so the caller's list is not reordered)
        list_items = sorted(list_items)

        # list of raw posts per percentile
        list_pc_ents = []
        for int_pcnt in range(0,len(list_items),int_per_cent):
            # slice the list of entries to percentiles
            list_each_percentile = list_items[int_pcnt:int_pcnt+int_per_cent]
            # sum up all the entries in that percentile
            int_raw_percentile = sum(list_each_percentile)
            # change it to a percent of total entries
            dec_percentile_entry = Decimal(int_raw_percentile)/Decimal(int_total_entries)
            list_pc_ents.append(dec_percentile_entry)

        # now do the cumsum
        list_cumsum = list(cumsum(list_pc_ents))

        list_int_formated = []
        for dec_each_item in list_cumsum:
            list_int_formated.append(float(dec_each_item))
        

        # return your results
        return list_int_formated

Running the Function

First, we make a list full of integers. This time we will fill a list of 100 items with 1s.

        
# list of 100 items
list_100_items = []
for int_range in range(0,100):
    list_100_items.append(1)

Now we run the function above:

list_percentiles = listPercentiles(list_100_items)

print(list_percentiles)

That should print:

[0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 
0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2, 
0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3, 
0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4, 
0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5, 
0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6, 
0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7, 
0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8, 
0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9, 
0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, 1.0]

PERFECT!!

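EDIT: Not really perfect. It only works if the length of the list is a multiple of 100. For any other length the slices no longer line up with percentiles and the function returns more than 100 points. Big flaw really.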
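One way around the multiple-of-100 restriction is to stop slicing fixed-size groups and instead read the running total at each percentile boundary. The following is only a sketch of that idea for Python 3, not a repaired version of the function above. `lorenz_points` and its `n_groups` parameter are names made up for this example, and rounding each boundary up to a whole item is one choice among several.

```python
from itertools import accumulate


def lorenz_points(values, n_groups=100):
    """Cumulative share of the total held by the lowest k/n_groups of the values.

    Hypothetical helper for illustration. Works for any list with at least
    n_groups items, not only lengths that are multiples of n_groups.
    """
    n = len(values)
    if n < n_groups:
        raise ValueError("need at least %d values, got %d" % (n_groups, n))

    ordered = sorted(values)              # leaves the caller's list untouched
    running = list(accumulate(ordered))   # running[i] = sum of the i+1 smallest values
    total = running[-1]
    if total == 0:
        raise ValueError("values sum to zero, so shares are undefined")

    points = []
    for k in range(1, n_groups + 1):
        # number of items in the lowest k groups, rounded up (integer ceiling)
        count = (k * n + n_groups - 1) // n_groups
        points.append(running[count - 1] / total)
    return points


# 150 items is not a multiple of 100, but we still get exactly 100 points
points = lorenz_points([1] * 150)
print(len(points), points[-1])  # 100 1.0
```

With a list of 100 ones this gives the same 0.01 to 1.0 steps as the example above. For other lengths, some percentiles cover one more item than their neighbours, which is the price of rounding boundaries to whole items.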
