Monday 30 July 2018

pandas-programming practice 1

1. How to create a Boolean column that compares each row's value with the values in the next n rows

Input
A  B  C
14 3 32
28 3 78
15 4 68
42 3 42
24 4 87
13 3 65

Calculation for D: if any of the next n rows (here n = 3) has a value of C that is >= the current row's C + 30, return 1; otherwise return 0

OUTPUT
A  B  C  D
14 3 32  1     # 32+30 = 62 so [78>=62, 68>=62]
28 3 78  0     # 78+30 = 108 
15 4 68  0     # 68+30 = 98
42 3 42  1     # 42+30 = 72 so [87>=72]  
24 4 87  0     # 87+30 = 117
13 3 65  0     # 65+30 = 95
 
Solution:
>>> import pandas as pd
>>> import numpy as np
>>> df=pd.read_csv("pan1.csv")
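>>> # n = 3, so compare against each of the next three rows: shift(-1), shift(-2), shift(-3)
>>> df['D'] = np.where((df.C+30 <= df.C.shift(-1)) | (df.C+30 <= df.C.shift(-2)) | (df.C+30 <= df.C.shift(-3)), 1, 0)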
 
[or]
>>> import pandas as pd
>>> df=pd.read_csv("pan1.csv")  
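>>> # The reversed rolling max covers the current row and the next two; shift(-1) moves the window to the next three rows
>>> df['D'] = df.C.iloc[::-1].rolling(3, min_periods=1).max().iloc[::-1].shift(-1).ge(df.C+30).astype(int)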
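
The same idea can be generalized to any window size n. The following is a minimal sketch, not a definitive implementation: the helper name flag_future_jump and its parameters n and threshold are hypothetical, and the sample data is built inline instead of being read from pan1.csv.

```python
import pandas as pd


def flag_future_jump(values, n=3, threshold=30):
    """Return 1 where any of the next n values is >= current value + threshold, else 0.

    Hypothetical helper for illustration; `values` is a pandas Series.
    """
    # Reverse the series, take a trailing rolling max, and reverse back:
    # this gives the max of the current row and the following n-1 rows.
    forward_max = values.iloc[::-1].rolling(n, min_periods=1).max().iloc[::-1]
    # Shift up by one so the window covers the next n rows and excludes the current row.
    future_max = forward_max.shift(-1)
    # The last row has no following rows, so future_max is NaN there
    # and the comparison evaluates to False (0).
    return future_max.ge(values + threshold).astype(int)


# Sample data from the problem statement, built inline.
df = pd.DataFrame({
    "A": [14, 28, 15, 42, 24, 13],
    "B": [3, 3, 4, 3, 4, 3],
    "C": [32, 78, 68, 42, 87, 65],
})
df["D"] = flag_future_jump(df["C"], n=3, threshold=30)
print(df)
```

Because the window size is a parameter, the same call works for other look-ahead lengths, for example flag_future_jump(df["C"], n=5).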
