Type B: Application-Based Questions
1. Predict the output of the following code fragments one by one. For each fragment, assume that the changes made by the previous fragments are still in place. That is, for code fragment (b), the changes made by (a) persist; for (c), the changes made by (a) and (b) persist; and so on.
(a) import pandas as pd
columns = ['2015', '2016', '2017', '2018']
index = ['Messi', 'Ronaldo', 'Neymar', 'Hazard']
df = pd.DataFrame(columns=columns, index=index)
print(df)
df.to_csv(r"c:\one.csv")
Ans: Output
        | 2015 | 2016 | 2017 | 2018 |
Messi   | NaN  | NaN  | NaN  | NaN  |
Ronaldo | NaN  | NaN  | NaN  | NaN  |
Neymar  | NaN  | NaN  | NaN  | NaN  |
Hazard  | NaN  | NaN  | NaN  | NaN  |
(b) df['2015']['Messi'] = 12
df['2016']['Ronaldo'] = 11
df['2017']['Neymar'] = 8
df['2018']['Hazard'] = 16
print(df)
df.to_csv(r"c:\two.csv", sep='@')
Ans: Output
        | 2015 | 2016 | 2017 | 2018 |
Messi   | 12   | NaN  | NaN  | NaN  |
Ronaldo | NaN  | 11   | NaN  | NaN  |
Neymar  | NaN  | NaN  | 8    | NaN  |
Hazard  | NaN  | NaN  | NaN  | 16   |
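Fragment (b) assigns values with chained indexing (df['2015']['Messi'] = 12). pandas discourages this form, and depending on the pandas version (in particular when Copy-on-Write is active) it may leave df unchanged. The sketch below rebuilds the same DataFrame and performs the same four assignments in one step each with df.loc; it is an alternative way to reach the output above, not the code the question asks about.

```python
import pandas as pd

columns = ['2015', '2016', '2017', '2018']
index = ['Messi', 'Ronaldo', 'Neymar', 'Hazard']
df = pd.DataFrame(columns=columns, index=index)

# Label-based assignment in a single step: row label first, then column label.
df.loc['Messi', '2015'] = 12
df.loc['Ronaldo', '2016'] = 11
df.loc['Neymar', '2017'] = 8
df.loc['Hazard', '2018'] = 16

print(df)
```

Only the four assigned cells hold numbers; the remaining twelve cells stay NaN.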
(c) new_df = pd.read_csv(r'c:\one.csv', index_col=0)
print(new_df)
Ans: Output
        | 2015 | 2016 | 2017 | 2018 |
Messi   | NaN  | NaN  | NaN  | NaN  |
Ronaldo | NaN  | NaN  | NaN  | NaN  |
Neymar  | NaN  | NaN  | NaN  | NaN  |
Hazard  | NaN  | NaN  | NaN  | NaN  |
(d) new_df = pd.read_csv(r'c:\one.csv')
print(new_df)
Ans: Output
  | Unnamed: 0 | 2015 | 2016 | 2017 | 2018 |
0 | Messi      | NaN  | NaN  | NaN  | NaN  |
1 | Ronaldo    | NaN  | NaN  | NaN  | NaN  |
2 | Neymar     | NaN  | NaN  | NaN  | NaN  |
3 | Hazard     | NaN  | NaN  | NaN  | NaN  |
(e) new_df = pd.read_csv(r'c:\two.csv')
print(new_df)
Ans: Output
  | @2015@2016@2017@2018 |
0 | Messi@12@@@          |
1 | Ronaldo@@11@@        |
2 | Neymar@@@8@          |
3 | Hazard@@@@16         |
(f) new_df = pd.read_csv(r'c:\two.csv', sep='@')
print(new_df)
Ans: Output
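  | Unnamed: 0 | 2015 | 2016 | 2017 | 2018 |
0 | Messi      | 12.0 | NaN  | NaN  | NaN  |
1 | Ronaldo    | NaN  | 11.0 | NaN  | NaN  |
2 | Neymar     | NaN  | NaN  | 8.0  | NaN  |
3 | Hazard     | NaN  | NaN  | NaN  | 16.0 |
Each year column contains missing values, so read_csv() loads it as float and the numbers are displayed with a decimal point.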
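The effect of sep and index_col in fragments (c) to (f) can be reproduced without writing to c:\. The following is a minimal sketch, assuming a small two-row stand-in DataFrame and a temporary file (the data and the temporary path are illustrative, not part of the question).

```python
import os
import tempfile

import pandas as pd

# Small stand-in DataFrame (hypothetical data, not the one from the question).
df = pd.DataFrame(
    {'2015': [12, None], '2016': [None, 11]},
    index=['Messi', 'Ronaldo'],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'two.csv')  # temporary file for the demonstration
    df.to_csv(path, sep='@')

    # Default separator is ',', so every line is read as one single field.
    wrong_sep = pd.read_csv(path)
    # Correct separator, but the saved index comes back as an ordinary column.
    as_column = pd.read_csv(path, sep='@')
    # Correct separator and the first column restored as the index.
    as_index = pd.read_csv(path, sep='@', index_col=0)

print(wrong_sep.shape)
print(list(as_column.columns))
print(list(as_index.index))
```

Reading with the wrong separator yields a single column, while sep='@' recovers the columns, and index_col=0 additionally turns the unnamed first column back into row labels.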
2. Are the following two statements same? Why/Why not?
(i) pd.read_csv('zoo.csv', sep=',')
(ii) pd.read_csv('zoo.csv')
Ans: Yes, both statements are the same, because the default value of sep in read_csv() is a comma.
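This can be checked with a short sketch. It assumes a small in-memory CSV text as a stand-in for zoo.csv (the column names and rows are made up for illustration).

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of zoo.csv.
csv_text = "animal,legs\nlion,4\nostrich,2\n"

with_sep = pd.read_csv(io.StringIO(csv_text), sep=',')
without_sep = pd.read_csv(io.StringIO(csv_text))

# Both calls parse the text identically.
print(with_sep.equals(without_sep))  # True
```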
3. How are the following two codes similar or different? What output will they produce?
(i) df = pd.read_csv("data.csv", nrows=5)
print(df)
(ii) df = pd.read_csv("data.csv")
print(df)
Ans: Both codes read data.csv into the DataFrame df and print it. Code (i) reads only the first five data rows of the CSV file, so it prints at most five rows, while code (ii) reads and prints all the rows of the file.
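A minimal sketch of the difference, assuming an in-memory CSV with eight data rows in place of data.csv (the contents are made up for illustration):

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: a header line plus eight data rows.
csv_text = "n\n" + "\n".join(str(i) for i in range(8)) + "\n"

first_five = pd.read_csv(io.StringIO(csv_text), nrows=5)  # header + first 5 rows
everything = pd.read_csv(io.StringIO(csv_text))           # header + all rows

print(len(first_five), len(everything))  # 5 8
```

The header line is not counted by nrows; it limits only the number of data rows.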
4. What is the difference between following two statements?
(i) df.to_sql('houses', con=conn, if_exists='replace')
(ii) df.to_sql('houses', con=conn, if_exists='replace', index=False)
Ans: Statement (i) creates a table named houses with one extra column named index, which stores the row labels of the DataFrame, while statement (ii) creates a table named houses without the index column.
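The extra column can be seen by inspecting the created tables. The sketch below assumes an in-memory SQLite database instead of MySQL so that it runs without a server, a made-up two-row DataFrame, two differently named tables so both results can be compared, and a hypothetical helper column_names().

```python
import sqlite3

import pandas as pd

# Made-up data; SQLite stands in for the MySQL connection of the question.
df = pd.DataFrame({'city': ['Pune', 'Agra'], 'rooms': [3, 2]})
conn = sqlite3.connect(':memory:')

df.to_sql('houses_with_index', con=conn, if_exists='replace')
df.to_sql('houses_no_index', con=conn, if_exists='replace', index=False)


def column_names(table):
    """Return the column names of a table (hypothetical helper)."""
    cur = conn.execute(f'SELECT * FROM {table}')
    return [d[0] for d in cur.description]


cols_with = column_names('houses_with_index')
cols_without = column_names('houses_no_index')
conn.close()

print(cols_with)     # ['index', 'city', 'rooms']
print(cols_without)  # ['city', 'rooms']
```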
5. Consider the following code, where conn is the name of an established connection to a MySQL database.
from pandas import DataFrame
Cars = {'Brand': ['Alto', 'Zen', 'City', 'Kia'],
        'Price': [22000, 25000, 27000, 35000]}
df = DataFrame(Cars, columns=['Brand', 'Price'])
df.to_sql('CARS', conn, if_exists='replace', index=False)
What will be the output of the following query if executed on MySQL?
SELECT * from CARS ;
Ans: The query displays the table CARS having the following rows:
Brand | Price |
Alto | 22000 |
Zen | 25000 |
City | 27000 |
Kia | 35000 |
6. Consider the following code, where conn is the name of an established connection to a MySQL database.
import pandas
sql = 'SELECT * from Sales where zone = "central"'
df = pandas.read_sql(sql, conn)
df.head()
What will be stored in df?
Ans: df contains all the rows of the Sales table that belong to the zone "central". df.head() returns only the first five of these rows; it does not change df.
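A minimal sketch of this behaviour, assuming an in-memory SQLite database in place of MySQL and a made-up three-row Sales table (the amount column is hypothetical):

```python
import sqlite3

import pandas as pd

# SQLite stands in for MySQL; the Sales rows below are made up.
conn = sqlite3.connect(':memory:')
sales = pd.DataFrame({
    'zone': ['central', 'north', 'central'],
    'amount': [100, 250, 175],
})
sales.to_sql('Sales', conn, index=False)

# Only the rows matching the WHERE clause are loaded into df.
sql = "SELECT * FROM Sales WHERE zone = 'central'"
df = pd.read_sql(sql, conn)
conn.close()

print(df)
```

Here df holds the two central rows and nothing from the north zone.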