Categories
data science

Using the .isin() function in Pandas – Python

The .isin() function is a powerful tool that can help you search for a number of values in a data frame. This is how it’s done. We start by creating a simple data frame. The data frame should look something like this. Now, we will use the .isin() function to select all the rows […]
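As a minimal sketch of the idea (the data frame, the column names and the list of values below are made up for illustration, they are not the ones from the post), selecting rows with .isin() could look like this:

```python
import pandas as pd

# Hypothetical example data; not the data frame used in the original post.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "city": ["Lima", "Oslo", "Lima", "Rome"],
})

# Values we want to search for in the 'city' column.
wanted = ["Lima", "Rome"]

# .isin() returns a boolean Series; using it as a mask keeps matching rows.
selected = df[df["city"].isin(wanted)]
print(selected)
```

Putting `~` in front of the mask, as in `df[~df["city"].isin(wanted)]`, would select the rows that do not match instead.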

Categories
data science

Create a range of dates using Pandas – Python

Here is how to create a range of dates using the Pandas module. The range will start on April 2, 2014 and end on October 1, 2014. Well… here it is. That’s pretty much it!
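One way to do this, assuming a daily frequency (the post excerpt does not say which frequency was used), is with pd.date_range:

```python
import pandas as pd

# Daily dates from April 2, 2014 to October 1, 2014, both ends included.
# freq="D" is an assumption; it is also the default for date_range.
dates = pd.date_range(start="2014-04-02", end="2014-10-01", freq="D")

print(dates[0], dates[-1], len(dates))
```

Changing `freq` (for example to `"W"` for weekly) gives a coarser range over the same period.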

Categories
data science

Reddit User Info – Python

This one is just for me. No explanation. As is. This script will let you download all the posts submitted by any Reddit user. Just put the user name in line 9.

Categories
data science

Exploring JSON files – Python

Working with JSON files can be freaking horrible, especially if you don’t know what data is in the file. Let me give you an example of how unreadable it can be. If you use Apple’s iTunes search API, and you search for user id 112018, you get this in return. Which is crappy because you […]
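A quick way to make this kind of data readable is to parse it and print it back with indentation. This is only a sketch: the JSON string and its keys below are invented for illustration, it is not the iTunes response.

```python
import json

# Hypothetical payload standing in for an unreadable one-line API response.
raw = '{"count": 1, "items": [{"kind": "song", "title": "Example", "tags": ["a", "b"]}]}'

# Parse the text into Python dicts and lists.
data = json.loads(raw)

# Print it back with indentation and sorted keys so the structure is visible.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# List the keys at the top level and inside the first item.
print(sorted(data.keys()))
print(sorted(data["items"][0].keys()))
```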

Categories
data science

Bibtex to YAML – Python

I’m writing my thesis right now, so I haven’t had much time to post. I am now going through my literature review and I was looking for ways of storing and analysing all my citations so I can do a bit of bibliometrics. Long story short, after trying json and xml, I stumbled across yaml. […]

Categories
data science

Smooth Line Plots – Python

Just a very quick and dirty reminder of how to do this, starting with a data frame. Most of the info in this post can be found here. We can load the data frame (it’s a CSV file) and check the data. The ones we want to plot are the 3 *_pec […]

Categories
data science

Timing Execution – Python

This is a simple function that can help you time the code. All it does is use time() from the time module to save the current time when the code begins, then save the current time when it ends, and then calculate the difference. Then it uses gmtime and strftime to format the calculated […]
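A minimal sketch of that approach follows. The function names `format_elapsed` and `timed` are made up here, and the `%H:%M:%S` format is an assumption, since the excerpt does not show the original code.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as HH:MM:SS using gmtime and strftime."""
    return time.strftime("%H:%M:%S", time.gmtime(seconds))


def timed(func, *args, **kwargs):
    """Run func, returning its result and the formatted elapsed time."""
    start = time.time()            # current time when the code begins
    result = func(*args, **kwargs)
    elapsed = time.time() - start  # difference between end and start
    return result, format_elapsed(elapsed)


# Example: time a trivial computation.
result, elapsed = timed(sum, range(1000))
print(result, elapsed)
```

Note that formatting through gmtime wraps around after 24 hours, so this only suits code that runs for less than a day.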

Categories
data science

Building a Local Library – Python

This may not be the ideal way of doing it, but it’s how I’m doing it for now. To build a local library with all the modules you write, you should move the modules to the Python path, a process which is explained here. The main problem, however, is that if you want to edit them, then […]

Categories
data science

Resizing Images – Shell Script

I got this from an awesome post by user80168. It uses the convert command to resize all the files that end in ‘.jpg’. It will create a directory called ‘resized’ and then it will reduce the size of each image to 50% of its original size. There. It is sooooo easy.

Categories
data science

Plotting an equation – R

This is how you plot an equation in R. The equation we are going to plot is a simple one: y = x– 1. The first thing we do is set the value for x by creating a sequence from 0 to 1 at 0.01 intervals. Now we can get the y values so if […]