2021. 5. 7. 17:22ㆍData science/Python
import pandas as pd
csv_path = 'file1.csv'
df = pd.read_csv(csv_path)
* Data Frames (df) : made up of rows and columns
: df.head() - shows the first 5 rows
: can be created from a dictionary (keys become the column names, values become the rows of each column)
: single or multiple columns can be extracted -> y = df[['Length']] or y = df[['Length', 'Genre']] -> a new dataframe is created
List unique values - pandas has a unique() method : df['Released'].unique()
Save as CSV : df1.to_csv('new_songes.csv')
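The notes above can be tried out on a small dataframe. This is only a sketch: the dictionary contents and the names songs, df_songs and buffer are made up for illustration, and an in-memory buffer stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical sample data: each key becomes a column, each list holds that column's rows
songs = {
    'Album': ['Album A', 'Album B', 'Album C'],
    'Released': [1982, 1980, 1973],
    'Length': ['00:42:19', '00:42:11', '00:42:49'],
}
df_songs = pd.DataFrame(songs)

# Double brackets return a new dataframe; single brackets return a Series
lengths_df = df_songs[['Length']]
lengths_series = df_songs['Length']

# unique() lists the distinct values in a column
years = df_songs['Released'].unique()

# A boolean filter creates a new dataframe, which can then be saved as CSV
df1 = df_songs[df_songs['Released'] >= 1980]
buffer = io.StringIO()              # in-memory stand-in for a file path
df1.to_csv(buffer, index=False)     # index=False leaves out the row labels
csv_text = buffer.getvalue()
```

Passing a file name such as 'new_songes.csv' instead of the buffer writes the same text to disk.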
Exercise using Watson Studio
Final Assignment - IBM Cloud Pak for Data
dataplatform.cloud.ibm.com
# Dependencies needed to read Excel files (xlrd 2.0+ reads only .xls; pandas uses openpyxl for .xlsx)
!pip install xlrd openpyxl
# Import required library
import pandas as pd
# Read data from CSV file
csv_path = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/TopSellingAlbums.csv'
df = pd.read_csv(csv_path)
df.head()  # examine the first five rows of the dataframe
# Read data from Excel File and print the first five rows
xlsx_path = 'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Datasets/TopSellingAlbums.xlsx'
df = pd.read_excel(xlsx_path)
df.head()
# Access the column Length as a dataframe (double brackets)
x = df[['Length']]
x
# Get the column as a series
x = df['Length']
x
# Check the type: double brackets return a dataframe
x = type(df[['Artist']])
x
# Access to multiple columns
y = df[['Artist','Length','Genre']]
y
# Access the value on the first row and the first column : iloc[ ]
df.iloc[0, 0]
# Access a value using the row label and the column name : loc[ ]
df.loc[1, 'Artist']
# Slicing the dataframe : row - from index 0 to 1 /column - from index 0 to 2
df.iloc[0:2, 0:3]
# Slicing the dataframe using names : with loc, both end labels are included
df.loc[0:2, 'Artist':'Released']
loc : row label (index name), column name
iloc : row position (index number), column position (column number)
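The difference between loc and iloc is easier to see when the row labels are not the default 0, 1, 2. The frame below is a hypothetical example (df_demo and its values are not from the course data), meant only as a minimal sketch.

```python
import pandas as pd

# Hypothetical frame with string row labels, so labels and positions differ
df_demo = pd.DataFrame(
    {
        'Artist': ['A', 'B', 'C'],
        'Genre': ['pop', 'rock', 'jazz'],
        'Released': [1982, 1980, 1973],
    },
    index=['a', 'b', 'c'],
)

first_by_position = df_demo.iloc[0, 0]         # row position 0, column position 0
first_by_label = df_demo.loc['a', 'Artist']    # row label 'a', column name 'Artist'

# iloc slices leave out the end position: 2 rows, 2 columns
pos_slice = df_demo.iloc[0:2, 0:2]

# loc slices include the end label: 2 rows, 3 columns
label_slice = df_demo.loc['a':'b', 'Artist':'Released']
```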
gist.github.com/IreneJeong/c8d63bc33bcdeba6658ff52879fbb5a3
Final Assignment.ipynb
One Dimensional Numpy
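NumPy is a library for scientific computing. Its main advantages over Python lists are speed and memory use.
An array is similar to a list, but it has a fixed size and every element has the same type.
Elements are accessed by index.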
a = np.array(list) : create an array from a Python list
type(a) : numpy.ndarray
a.size : number of elements in the array
a.ndim : number of array dimensions, or the rank of the array
a.shape : tuple giving the size of the array in each dimension
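The attributes listed above can be checked on a small one-dimensional array. The values below are arbitrary and only serve as an illustration.

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])   # arbitrary example values

kind = type(a)      # numpy.ndarray
n_items = a.size    # total number of elements
rank = a.ndim       # number of dimensions
shape = a.shape     # size along each dimension, as a tuple
dtype = a.dtype     # one shared element type (the exact integer type depends on the platform)

a[0] = 100              # elements are read and changed by index
first_three = a[0:3]    # slicing works as it does for lists
```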
np.dot(a, b) by input dimension:
Zero-Dimension (Scalar) | Multiplication of two scalars, a and b. |
One-Dimensional Arrays (Vector) | Inner product of vectors. |
Two-Dimensional Arrays (Matrix) | Matrix Multiplication. |
a: N-Dimensional Array, b: 1-D Array | Sum product over the last axis of a and b. |
a: N-Dimensional Array, b: M-Dimensional Array (M>=2) | Sum product over the last axis of a and second-to-last axis of b. |
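Each row of the table corresponds to a call to np.dot with inputs of a different dimension. The sketch below walks through them with small arrays; all variable names and values are chosen for illustration only.

```python
import numpy as np

# Zero-dimension: plain multiplication of two scalars
scalar = np.dot(3, 4)

# One-dimensional: inner product, 1*3 + 2*4
u = np.array([1, 2])
v = np.array([3, 4])
inner = np.dot(u, v)

# Two-dimensional: matrix multiplication
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
matmul = np.dot(A, B)

# N-D with 1-D: sum product over the last axis of T and w
T = np.arange(24).reshape(2, 3, 4)
w = np.ones(4)
nd_1d = np.dot(T, w)    # result has shape (2, 3)

# N-D with M-D (M >= 2): last axis of T against the second-to-last axis of M
M = np.ones((4, 5))
nd_md = np.dot(T, M)    # result has shape (2, 3, 5)
```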
Two Dimensional Numpy
: you can also create arrays with more than two dimensions
: for matrix multiplication, the number of columns in A must equal the number of rows in B!
: use np.dot for matrix multiplication, including when the two arrays have different shapes (for example 2x3 and 3x2)
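The shape rule can be checked directly. In this sketch, A, B, X and Y are small example matrices picked for illustration; the try/except only records whether NumPy rejects a product whose shapes do not line up.

```python
import numpy as np

A = np.array([[0, 1, 1], [1, 0, 1]])        # shape (2, 3)
B = np.array([[1, 1], [1, 1], [-1, 1]])     # shape (3, 2)

# Columns of A (3) match rows of B (3), so the product is defined
C = np.dot(A, B)                            # shape (2, 2)

# (3, 2) times (3, 2) breaks the rule, so NumPy raises ValueError
try:
    np.dot(B, B)
    mismatch_error = False
except ValueError:
    mismatch_error = True

# Arrays of the same shape can also be multiplied element by element with *
X = np.array([[1, 0], [0, 1]])
Y = np.array([[2, 1], [1, 2]])
elementwise = X * Y
```

Element-wise multiplication with * and matrix multiplication with np.dot are different operations, so they generally give different results even for square matrices.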