Conquer CSV files in Python


CSV files are a common format for data manipulation in Python, for example when the data has been exported from an Excel sheet. Here is a short tutorial on how to extract some data from a CSV file, with a few other nifty tricks along the way.

1) Easy Binary Conversion

You can use this to convert a string of binary digits to an integer.

your_binary_string = "0100000010001"
int_value = int(your_binary_string, 2)

Of course, you can extend this to octal, hexadecimal, etc. by changing the base argument.
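As a quick sketch of other bases, here is the same int() call with a different second argument. The sample strings are made up for illustration; only 4079 comes from the data used later in this post.

```python
# int(text, base) parses a string of digits written in the given base.
octal_value = int("17", 8)        # octal digits
hex_value = int("ff", 16)         # hexadecimal digits
auto_value = int("0b1001", 0)     # base 0 infers the base from the 0b/0o/0x prefix

# Going the other way: format an integer as a zero-padded 16-bit binary string.
binary_text = format(4079, "016b")

print(octal_value, hex_value, auto_value, binary_text)
```

Base 0 is handy when the strings already carry a prefix such as 0b or 0x.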

2) File reading and list comprehension

Suppose you have a whole CSV file of binary numbers and need to read it into Python as a list.
Read each field as a string, then convert it to an int.

Your CSV data looks like this:

0,0,1,1111100110101010,0000111111101111,0000000000000000,0000000000000000,001,1,0,1,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
1,1,0,1111000110000000,0001000100000001,0000000000000000,0000000000000000,001,1,0,0,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
2,2,0,1111001010001000,0000111001100011,0000000000000000,0000000000000000,001,1,0,1,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
3,3,0,1111000010011000,0000111011100101,0000000000000000,0000000000000000,001,1,0,1,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
4,4,0,1111000000010011,0000110010011111,0000000000000000,0000000000000000,001,1,0,0,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
5,5,0,1110111100101100,0000110010000001,0000000000000000,0000000000000000,001,1,0,1,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
6,6,0,1110111001000010,0000101010111110,0000000000000000,0000000000000000,001,1,0,0,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
7,7,0,1110110111010000,0000100111110110,0000000000000000,0000000000000000,001,1,0,1,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
8,8,0,1110110011101111,0000100010011001,0000000000000000,0000000000000000,001,1,0,1,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
9,9,0,1110110010101011,0000011101010010,0000000000000000,0000000000000000,001,1,0,0,1,0,0,1,0,0,0,0,0,0,0000000000000000,0000000000000000,0,0000000000000000,0000000000000000
....

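and you just want the two 16-bit values near the start of each row, i.e. the fourth and fifth columns (row[3] and row[4], since Python indexes from 0). That means putting the fifth and fourth column values of each row in a tuple, in that order, converted to integers.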

import matplotlib.pyplot as plt
import numpy as np
import csv

filename = 'yourfile.csv'

with open(filename, 'rt', newline='') as csvfile:
    reader = csv.reader(csvfile)
    # The following line uses a list comprehension to iterate through every row in the csv
    # and builds a list containing one tuple per row.
    # You can call next(reader) first to skip a row, for example if the 1st row is just labels.
    iq = [(int(row[4], 2), int(row[3], 2)) for row in reader]

You get something like this:

[(4079, 63914), (4353, 61824), (3683, 62088), (3813, 61592) .... ]
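If your file starts with a row of labels, the next(reader) trick mentioned in the comment could look roughly like this. It is only a sketch: the header names are invented, and io.StringIO stands in for a real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample: a label row followed by two shortened data rows.
sample = io.StringIO(
    "idx,count,flag,q_bits,i_bits\n"
    "0,0,1,1111100110101010,0000111111101111\n"
    "1,1,0,1111000110000000,0001000100000001\n"
)

reader = csv.reader(sample)
header = next(reader)  # consume the label row so int() never sees it
iq_sample = [(int(row[4], 2), int(row[3], 2)) for row in reader]

print(header)
print(iq_sample)
```

Without the next(reader) call, int('i_bits', 2) would raise a ValueError on the label row.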

At this point, if for some reason you want to get all the 1st values in each tuple in one list, and the 2nd values in each tuple in another list, you can do:

i, q = zip(*iq)
i = list(i)
q = list(q)

You will get for i :

[4079, 4353, 3683, 3813, 3231, 3201, 2750, 2550, 2201, 1874, 1559, .... ]

and q:

[63914, 61824, 62088, 61592, 61459, 61228, 60994, 60880, 60655, 60587... ]
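In case the zip(*iq) idiom looks like magic: the star unpacks the list so that every tuple becomes a separate argument to zip, which then groups the 1st elements together and the 2nd elements together. Here is a small sketch with three of the tuples from above, next to an equivalent pair of list comprehensions (the variable names are only for illustration).

```python
iq_pairs = [(4079, 63914), (4353, 61824), (3683, 62088)]

# zip(*iq_pairs) is the same as zip((4079, 63914), (4353, 61824), (3683, 62088))
i_vals, q_vals = zip(*iq_pairs)  # two tuples, one per "column"

# Equivalent, more explicit version using list comprehensions.
i_alt = [pair[0] for pair in iq_pairs]
q_alt = [pair[1] for pair in iq_pairs]

print(list(i_vals) == i_alt, list(q_vals) == q_alt)
```

One thing to watch for: if the list is empty, zip(*iq_pairs) yields nothing and the two-name unpacking raises a ValueError, whereas the list comprehensions simply return empty lists.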

3) Writing CSV files

With the i and q lists you extracted previously, you can now write them out as a CSV file.

with open('complex_out.csv', 'w', newline='') as csvfile:
    fieldnames = ['i', 'q']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    # write one row per (i, q) pair
    for row_num in range(len(i)):
        writer.writerow({'i': i[row_num], 'q': q[row_num]})

Your csv file looks like this :

i,q
4079,63914
4353,61824
3683,62088
3813,61592
3231,61459
3201,61228
2750,60994
2550,60880
2201,60655
1874,60587
1559,60434
1213,60367
898,60284
529,60238
195,60224
65371,60196
65033,60237
64701,60281
64372,60335
64024,60430
63705,60535
63385,60660
63065,60826
62763,61000
62474,61192
62198,61410
61935,61644
61680,61883
61468,62145
61232,62433
61040,62710
60863,63013
60702,63328
60566,63648
60459,63979
60361,64304
60292,64646
60247,64994
.....
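If you do not need the dictionary interface, a plain csv.writer can produce the same layout with fewer lines. This is just an alternative sketch: it writes to an in-memory io.StringIO buffer instead of complex_out.csv so it can run anywhere, and it uses only the first three values of each list.

```python
import csv
import io

i_vals = [4079, 4353, 3683]
q_vals = [63914, 61824, 62088]

buffer = io.StringIO()  # stand-in for open('complex_out.csv', 'w', newline='')
writer = csv.writer(buffer)
writer.writerow(["i", "q"])            # header row
writer.writerows(zip(i_vals, q_vals))  # one row per (i, q) pair

written_lines = buffer.getvalue().splitlines()
print(written_lines)
```

DictWriter is the better fit when your rows are already dictionaries; writerows with zip is shorter when you have parallel lists.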

4) Converting data types

From the above data, I know that each number is really a 16-bit binary representation.
Now, I am told that this is a 2’s complement representation.
So each tuple should really be (int16, int16), with both positive and negative values possible. Fortunately, NumPy allows us to do this easily.

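iq = np.array(iq, np.uint16).astype(np.int16) # load as uint16 first (newer NumPy versions reject out-of-range values such as 63914 for int16), then cast to int16 so values above 32767 wrap to negative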
array([[ 4079, -1622],
 [ 4353, -3712],
 [ 3683, -3448],
 ..., 
 [ -567, 5301],
 [ -216, 5329],
 [ 146, 5332]], dtype=int16)
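To see what the int16 conversion is doing, here is the same 2's complement rule in plain Python, with no NumPy. The helper name to_int16 is made up for this sketch.

```python
def to_int16(value):
    """Interpret an unsigned 16-bit value (0..65535) as a 2's complement number."""
    # If the top bit (0x8000) is set, the number is negative: subtract 2**16.
    return value - 0x10000 if value & 0x8000 else value


samples = [4079, 63914, 61824, 65371]
converted = [to_int16(v) for v in samples]
print(converted)
```

The value 63914 maps to -1622, which matches the first row of the array above.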

Now we want to convert it to complex64, for further processing down the line.

sig = iq.astype(np.float32).view(np.complex64) # convert the values to float32, before viewing it as a complex64 (2 float32s)
array([[ 4079.-1622.j],
 [ 4353.-3712.j],
 [ 3683.-3448.j],
 ..., 
 [ -567.+5301.j],
 [ -216.+5329.j],
 [ 146.+5332.j]], dtype=complex64)
sig = sig.ravel() # flatten it to 1D
array([ 4079.-1622.j, 4353.-3712.j, 3683.-3448.j, ..., -567.+5301.j,
 -216.+5329.j, 146.+5332.j], dtype=complex64)
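The view trick works because a complex64 is stored as two consecutive float32 values (real, then imaginary), so each row of two floats can be reinterpreted as one complex number without copying. If that feels too clever, the sketch below compares it against building the complex values arithmetically, using the first three rows of the array above as sample data.

```python
import numpy as np

iq_demo = np.array([[4079, -1622], [4353, -3712], [3683, -3448]], dtype=np.int16)

# Reinterpret each (float32, float32) row as a single complex64 value.
via_view = iq_demo.astype(np.float32).view(np.complex64).ravel()

# Same result built explicitly: first column is real, second is imaginary.
via_arith = (iq_demo[:, 0] + 1j * iq_demo[:, 1]).astype(np.complex64)

print(np.array_equal(via_view, via_arith))
```

The arithmetic version is easier to read; the view version avoids building an intermediate complex array.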

Hooray, now we’re ready to do further processing on this data!


3 thoughts on “Conquer CSV files in Python”

  1. Hi Pier. Thanks for sharing, very illustrative.
    In this case I prefer to combine numpy with pandas, it could simplify your code. For example, to obtain an equivalent result:

    >>> pd.read_csv("data.csv", header = None, usecols = [3, 4], dtype = str).applymap(lambda x : np.uint16(int(x, 2)).astype(np.int16)).apply(lambda x:x[4]+1j*x[3], axis=1).astype(np.complex64)
    0 (4079-1622j)
    1 (4353-3712j)
    2 (3683-3448j)
    3 (3813-3944j)
    4 (3231-4077j)
    dtype: complex64

    Another thing, in your code plt is imported but not used.
    Take care.
