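numpy.where(condition, x, y)
Returns an array whose elements are taken from x where condition is True and from y elsewhere. When only condition is given, it returns the positions of the elements where condition is True.
condition | array_like, bool; the check applied to each element |
x, y | the element is taken from x if condition is True, from y otherwise |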
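As a minimal sketch of the three arguments (the array values and variable names here are made up for illustration), x and y can be arrays of the same shape as condition, or scalars that are broadcast to that shape:

```python
import numpy as np

condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Element-wise choice: take from x where condition is True, else from y
picked = np.where(condition, x, y)
print(picked.tolist())   # [1, 20, 3, 40]

# Scalars are broadcast to the shape of condition
flags = np.where(condition, 1, 0)
print(flags.tolist())    # [1, 0, 1, 0]
```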
Examples: Updating data
We will create an array by using arange(), then multiply each even element by 3 and leave the odd elements unchanged.
import numpy as np
ar=np.arange(6) #[0 1 2 3 4 5]
ar=np.where(ar%2==0,ar*3,ar)
print(ar)
Output (the same variable now holds the updated data)
[ 0  1  6  3 12  5]
Replace every element that is divisible by 5 with np.nan (the np.NaN alias was removed in NumPy 2.0)
import numpy as np
ar=np.arange(15)
ar=np.where(ar%5==0,np.nan,ar)
print(ar)
Output
[nan  1.  2.  3.  4. nan  6.  7.  8.  9. nan 11. 12. 13. 14.]
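The output shows floats (1., 2., and so on) because NaN is a floating-point value, so the integer array is converted to a float array. A small sketch that checks this (the name filled is only illustrative):

```python
import numpy as np

ar = np.arange(15)                          # integer array
filled = np.where(ar % 5 == 0, np.nan, ar)  # NaN forces a float result

print(ar.dtype.kind)            # 'i' for integer
print(filled.dtype)             # float64
print(np.isnan(filled).sum())   # 3 elements replaced (positions 0, 5 and 10)
```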
Getting the positions of elements where condition is True
When x and y are omitted, np.where() returns a tuple of index arrays, one array per dimension.
import numpy as np
ar=np.array([12,2,7,1,9,3,11])
ar=np.where(ar>5)
print(ar)
Output ( Position of the elements where numbers are more than 5 )
(array([0, 2, 4, 6]),)
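The returned tuple can be used directly as an index to pull out the matching values. A short sketch, keeping the result in a separate variable (positions is an illustrative name) so the original array is not overwritten:

```python
import numpy as np

ar = np.array([12, 2, 7, 1, 9, 3, 11])

positions = np.where(ar > 5)   # tuple holding one index array
values = ar[positions]         # elements found at those positions

print(positions[0].tolist())   # [0, 2, 4, 6]
print(values.tolist())         # [12, 7, 9, 11]
```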
Using AND to combine two conditions. Use the & operator and wrap each condition in parentheses. This returns the positions of the elements that satisfy both conditions.
import numpy as np
ar=np.array([12,2,7,1,9,3,11])
ar=np.where((ar > 5) & (ar < 10))
print(ar)
print(ar[0][1])
Output
(array([2, 4]),)
4
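The parentheses matter because & binds more tightly than the comparison operators in Python. As a sketch, np.logical_and() gives the same result for boolean arrays and avoids the precedence issue:

```python
import numpy as np

ar = np.array([12, 2, 7, 1, 9, 3, 11])

# Operator form: each comparison needs its own parentheses
with_operator = np.where((ar > 5) & (ar < 10))

# Function form: same element-wise AND
with_function = np.where(np.logical_and(ar > 5, ar < 10))

print(with_operator[0].tolist())   # [2, 4]
print(with_function[0].tolist())   # [2, 4]
```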
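Using OR to combine two conditions. Use the | operator. This returns the positions of the elements that satisfy at least one condition.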
import numpy as np
ar=np.array([12,2,7,1,4,3,11])
ar=np.where((ar > 5) | (ar %2==0))
print(ar)
Output
(array([0, 1, 2, 4, 6]),)
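Using multidimensional arrays
condition, x and y can have more than one dimension. Each output element is taken from x or y at the same position.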
np.where([[True, False], [True, True]],
[[5, 2], [13, 42]],[[9, 18], [73, 16]])
Output
array([[ 5, 18],
       [13, 42]])
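With only a condition on a 2-D array, np.where() returns one index array per dimension (rows first, then columns). A minimal sketch with an illustrative array named grid:

```python
import numpy as np

grid = np.array([[5, 2], [13, 42]])

# One index array for rows and one for columns
rows, cols = np.where(grid > 4)

# Pair them up to get (row, column) coordinates
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)   # [(0, 0), (1, 0), (1, 1)]
```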
Nested np.where()
We can nest np.where() calls to check several conditions in sequence (like CASE THEN condition checking in other languages). The inner np.where() is placed where y would go, so it is used only for elements where the outer condition is False.
Here is a solution we used to assign numbers to another column (allowed) based on the value in the dept column.
my_data['allowed']=np.where(my_data['dept']=='mktg',50,
np.where(my_data['dept']=='production',65,
np.where(my_data['dept']=='planning',45,np.nan)))
This is part of the solution to Exercise No 3-4; read the full exercise to understand the requirement.
Here my_data['allowed'] is set to 50 where my_data['dept'] is equal to mktg, to 65 for production and to 45 for planning. Rows that match none of these get NaN.
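When the chain of conditions grows, np.select() is an alternative to deep nesting. The sketch below is not the exercise's solution; it uses a small hypothetical my_data (the hr row is made up to show the default) and should give the same values as the nested version:

```python
import numpy as np
import pandas as pd

# Hypothetical data; the real my_data comes from the exercise
my_data = pd.DataFrame({'dept': ['mktg', 'production', 'planning', 'hr']})

conditions = [
    my_data['dept'] == 'mktg',
    my_data['dept'] == 'production',
    my_data['dept'] == 'planning',
]
choices = [50, 65, 45]

# The first matching condition wins; rows matching none get the default
my_data['allowed'] = np.select(conditions, choices, default=np.nan)
print(my_data)
```

The conditions and their values sit side by side in two lists, which is easier to extend than adding another nesting level.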
Example using a Pandas DataFrame
Use our sample student DataFrame. We will add a column status that stores Pass or Fail for each student based on the mark scored. The pass mark is 60.
import pandas as pd
import numpy as np
df= pd.read_csv('D:\\my_data\\student.csv') # DataFrame from csv file data
df['status']=np.where(df['mark']>=60,'Pass','Fail')
print(df)
We can nest np.where() to assign a grade to each student based on the mark scored.
df['grade']=np.where(df['mark'] >=80,'A',
np.where(df['mark']>=70,'B',
np.where(df['mark']>=50,'C','D') ))
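To try the status and grade columns without the CSV file, here is a self-contained sketch. The names and marks are made up and only stand in for student.csv:

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the student.csv data
df = pd.DataFrame({'name': ['A1', 'B2', 'C3', 'D4'],
                   'mark': [85, 72, 55, 40]})

# Pass mark is 60
df['status'] = np.where(df['mark'] >= 60, 'Pass', 'Fail')

# Conditions are checked from the highest band down
df['grade'] = np.where(df['mark'] >= 80, 'A',
              np.where(df['mark'] >= 70, 'B',
              np.where(df['mark'] >= 50, 'C', 'D')))

print(df)
```

The order of the checks matters: a mark of 85 also satisfies >=70 and >=50, but it never reaches those checks because the first condition already matched.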
Students have appeared in exams for different subjects. Here is the input DataFrame.
Name Subject_1 Mark_1 Subject_2 Mark_2
0 Alex Science 30 Chemistry 40
1 Ron Social 90 Math 80
2 Ravi History 10 Physics 60
3 King English 100 Geography 90
Arrange the two subjects of each student in alphabetical order, keeping each mark with its subject. The output should look like this.
Name Subject_1 Mark_1 Subject_2 Mark_2
0 Alex Chemistry 40 Science 30
1 Ron Math 80 Social 90
2 Ravi History 10 Physics 60
3 King English 100 Geography 90
Solution
import pandas as pd
import numpy as np
my_dict={'Name':['Alex','Ron','Ravi','King'],
'Subject_1':['Science','Social','History','English'],
'Mark_1':[30,90,10,100],
'Subject_2':['Chemistry','Math','Physics','Geography'],
'Mark_2':[40,80,60,90]}
df = pd.DataFrame(data=my_dict)
print(df)
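# Where Subject_1 already sorts before Subject_2, keep the row as it is;
# otherwise swap both subject/mark pairs. The condition is applied row by row.
df['Subject_1'],df['Mark_1'],df['Subject_2'],df['Mark_2']=np.where(df['Subject_2']>df['Subject_1'],
(df['Subject_1'],df['Mark_1'],df['Subject_2'],df['Mark_2']),
(df['Subject_2'],df['Mark_2'],df['Subject_1'],df['Mark_1']))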
print(df)
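The solution above packs text and numbers into one array, which may leave the mark columns with object dtype. As an alternative sketch (not the original solution), the same swap can be done one column at a time so the marks stay as integers. The swap name is illustrative:

```python
import numpy as np
import pandas as pd

my_dict = {'Name': ['Alex', 'Ron', 'Ravi', 'King'],
           'Subject_1': ['Science', 'Social', 'History', 'English'],
           'Mark_1': [30, 90, 10, 100],
           'Subject_2': ['Chemistry', 'Math', 'Physics', 'Geography'],
           'Mark_2': [40, 80, 60, 90]}
df = pd.DataFrame(data=my_dict)

# True for rows whose subjects are out of alphabetical order
swap = (df['Subject_1'] > df['Subject_2']).to_numpy()

# Build all four new columns first, then assign them together
new_s1 = np.where(swap, df['Subject_2'], df['Subject_1'])
new_m1 = np.where(swap, df['Mark_2'], df['Mark_1'])
new_s2 = np.where(swap, df['Subject_1'], df['Subject_2'])
new_m2 = np.where(swap, df['Mark_1'], df['Mark_2'])

df['Subject_1'], df['Mark_1'] = new_s1, new_m1
df['Subject_2'], df['Mark_2'] = new_s2, new_m2
print(df)
```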