pandas notes 3
51
get single row values as list in pandas
To get the values of a single row in a pandas DataFrame as a list, you can use either the .iloc[] indexer or the .loc[] indexer. Here's how you can do it:
Using .iloc[] indexer (position-based selection):
import pandas as pd
# Sample DataFrame
data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
'Age': [30, 25, 40, 35, 28],
'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)
# Get values of the first row as a list
first_row_values = df.iloc[0].tolist()
print(first_row_values)
Using .loc[] indexer (label-based selection):
# Get values of the row labeled 0 as a list
first_row_values = df.loc[0].tolist()
print(first_row_values)
Both of these methods output the values of the first row of the DataFrame df as a list:
['John', 30, 'Male']
Replace 0 with the position (for .iloc[]) or the index label (for .loc[]) of the row you want. The two coincide here only because df uses the default integer index 0, 1, 2, and so on.
52
show a single row and columns in a table format + pandas
To display a single row and selected columns in a table format in pandas, you can use the iloc indexer to select the row by its index and specify the columns you want to display. Then, you can convert the resulting DataFrame into a formatted table using the to_string() method. Here's how you can do it:
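A minimal sketch of this approach, assuming the sample DataFrame from note 51. The names row_index and selected_columns follow the explanation below; the chosen row and columns are illustrative assumptions:

```python
import pandas as pd

# Sample DataFrame (same data as in note 51)
data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

row_index = 2                       # position of the row to display (assumed)
selected_columns = ['Name', 'Age']  # columns to display (assumed)

# The double brackets in iloc[[row_index]] keep the result a one-row DataFrame
# rather than a Series, so it still renders as a table.
single_row = df.iloc[[row_index]][selected_columns]

# index=False leaves the row index out of the rendered text
table_text = single_row.to_string(index=False)
print(table_text)
```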
In this example, row_index specifies the index of the row you want to display, and selected_columns specifies the columns you want to include in the table. We use the iloc indexer to select the specified row by its index and the specified columns. Then, we convert the resulting DataFrame into a formatted table using the to_string() method with index=False to exclude the row index.
53
loop by rows in pandas
You can loop through rows in a pandas DataFrame using the iterrows() method. Here's how you can do it:
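A minimal sketch, assuming the sample DataFrame from note 51; the printed format is an illustrative choice:

```python
import pandas as pd

# Sample DataFrame (same data as in note 51)
data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

# iterrows() yields (index, row) pairs; each row is a pandas Series
for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}")
```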
This prints each row of the DataFrame df along with its index.
In each iteration of the loop, index represents the index of the current row, and row represents the values of the current row as a pandas Series. You can access individual values within the row using dictionary-like indexing, such as row['Name'], row['Age'], etc.
54
create dataframe pandas
To create a pandas DataFrame, you can pass a dictionary of lists or arrays to the pd.DataFrame() constructor. Each key in the dictionary will become a column name, and the corresponding list or array will become the data for that column. Here's how you can create a simple DataFrame:
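A minimal sketch; the column names and values are the same sample data used in note 51:

```python
import pandas as pd

# Each key becomes a column name; each list holds that column's values.
# All lists must have the same length.
data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
print(df)
```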
In this example, each key in the data dictionary corresponds to a column in the DataFrame, and the lists associated with each key provide the data for that column. The DataFrame df is created by passing this dictionary to the pd.DataFrame() constructor.
55
get columns as list pandas
To get the columns of a pandas DataFrame as a list, you can access the columns attribute of the DataFrame. Here's how you can do it:
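A minimal sketch, assuming a shortened version of the sample DataFrame from note 51; the variable name column_names is an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice'],
                   'Age': [30, 25],
                   'Gender': ['Male', 'Female']})

# df.columns is a pandas Index; tolist() turns it into a plain Python list
column_names = df.columns.tolist()
print(column_names)  # ['Name', 'Age', 'Gender']
```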
In this example, df.columns returns a pandas Index object representing the columns of the DataFrame df, and .tolist() converts this Index object into a Python list containing the column names.
56
consider second row as header + pandas
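To use the second row as the header in pandas, pass header=1 (rows are counted from 0) when reading from a file with pd.read_csv() or pd.read_excel(). If the data is already in memory as a list of rows, pass the second row as the columns argument of pd.DataFrame() instead. Here's how you can do it: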
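A minimal sketch covering both the in-memory case and the file-reading case. The sample rows and the CSV text are made-up assumptions; only the names data and df come from the explanation below:

```python
import io
import pandas as pd

# In-memory data whose second row holds the column names (made-up sample values)
data = [['Sample export', None, None],
        ['Name', 'Age', 'Gender'],
        ['John', 30, 'Male'],
        ['Alice', 25, 'Female']]

# The second row (data[1]) becomes the header; only the rows after it are data
df = pd.DataFrame(data[2:], columns=data[1])
print(df)

# Reading from a file-like source: header=1 selects the second line (0-based)
# as the header, and the line before it is skipped.
csv_text = "Sample export,,\nName,Age,Gender\nJohn,30,Male\nAlice,25,Female\n"
df_csv = pd.read_csv(io.StringIO(csv_text), header=1)
print(df_csv)
```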
In this example, data is a list of lists where each inner list represents a row of data. We use the second row (data[1]) as the header by passing it as the columns parameter when creating the DataFrame df with pd.DataFrame(). The rows after the header (data[2:]) are used as the actual data for the DataFrame.
57
Find ms col index
If you want to find the index of a specific column in a pandas DataFrame based on its name, you can use the .get_loc() method of the DataFrame's columns. Here's how you can do it:
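A minimal sketch, assuming a shortened version of the sample DataFrame from note 51; the column name 'Age' is an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice'],
                   'Age': [30, 25],
                   'Gender': ['Male', 'Female']})

column_name = 'Age'  # column to look up (assumed)

# get_loc() returns the 0-based position of the column label
column_index = df.columns.get_loc(column_name)
print(column_index)  # 1
```

If the name is not present, get_loc() raises a KeyError, so check `column_name in df.columns` first when the column may be missing.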
In this example, column_name is the name of the column whose index we want to find. We use the .get_loc() method on df.columns to get the index of the column with the specified name. The resulting column_index variable contains the index of the column. Note that indexing in pandas starts from 0, so the index of the first column is 0, the index of the second column is 1, and so on.
58
get row by index in pandas
To get a specific row by its integer position in a pandas DataFrame, you can use the .iloc[] indexer (use .loc[] instead to select by index label). Here's how you can do it:
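A minimal sketch, assuming the sample DataFrame from note 51; the value of row_index is an illustrative choice:

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

row_index = 2  # 0-based position, so this is the third row (assumed)

# A single integer passed to iloc returns the row as a Series
row = df.iloc[row_index]
print(row)
```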
In this example, row_index is the index of the row you want to retrieve. We use the .iloc[] indexer to access the row with the specified index. The resulting row variable contains the data for the specified row as a pandas Series.
59
get row by multiple index in pandas
If you want to get multiple rows by their indices in a pandas DataFrame, you can pass a list of indices to the .iloc[] indexer. Here's how you can do it:
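A minimal sketch, assuming the sample DataFrame from note 51; the positions in row_indices are illustrative:

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

row_indices = [0, 2, 4]  # 0-based positions of the rows to keep (assumed)

# A list passed to iloc returns a DataFrame with those rows, in the order given
rows = df.iloc[row_indices]
print(rows)
```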
In this example, row_indices is a list containing the indices of the rows you want to retrieve. We pass this list to the .iloc[] indexer, which returns the rows with the specified indices. The resulting rows variable contains the selected rows as a new DataFrame.
60
pandas iloc with multiple index
If you want to select specific rows and columns using .iloc[] indexer in pandas, you can pass lists of indices for rows and columns. Here's how you can do it:
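A minimal sketch, assuming the sample DataFrame from note 51; the positions chosen are illustrative:

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

row_indices = [0, 2]  # first and third rows (assumed)
col_indices = [0, 1]  # 'Name' and 'Age' columns (assumed)

# Rows go before the comma, columns after it; both are 0-based positions
selected_data = df.iloc[row_indices, col_indices]
print(selected_data)
```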
In this example, row_indices and col_indices are lists containing the indices of the rows and columns you want to select, respectively. We pass these lists to the .iloc[] indexer, specifying both rows and columns using the syntax .iloc[row_indices, col_indices]. The resulting selected_data DataFrame contains the data from the specified rows and columns.
61
row from and to iloc + pandas
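If you want to select a range of rows (from one position to another) using the .iloc[] indexer in pandas, you can pass a slice start:stop, where the start position is included and the stop position is excluded. Lists of positions for rows and columns work as well. Here's how you can do it: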
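A minimal sketch, assuming the sample DataFrame from note 51. It selects the same from/to range twice, once with explicit lists (the names row_indices and col_indices follow the explanation below) and once with slices; the range itself is an illustrative choice:

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

# Explicit lists of positions: rows 1 through 3, columns 0 and 1
row_indices = list(range(1, 4))
col_indices = [0, 1]
selected_data = df.iloc[row_indices, col_indices]
print(selected_data)

# The same range written as slices; the stop position is excluded
same_selection = df.iloc[1:4, 0:2]
print(same_selection)
```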
In this example, row_indices and col_indices are lists containing the indices of the rows and columns you want to select, respectively. We pass these lists to the .iloc[] indexer, specifying both rows and columns using the syntax .iloc[row_indices, col_indices]. The resulting selected_data DataFrame contains the data from the specified rows and columns.
62
get specific column values as list + pandas
To get the values of a specific column as a list in pandas, you can access the column using its name and then use the .tolist() method. Here's how you can do it:
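A minimal sketch, assuming the sample DataFrame from note 51; the variable name name_list is an illustrative choice:

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
        'Age': [30, 25, 40, 35, 28],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

# df['Name'] is a Series; tolist() converts its values to a Python list
name_list = df['Name'].tolist()
print(name_list)  # ['John', 'Alice', 'Bob', 'Emma', 'Mike']
```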
In this example, df['Name'] accesses the 'Name' column of the DataFrame df, and .tolist() converts the values of this column into a Python list. You can replace 'Name' with the name of any column you want to retrieve the values for.
63
apply convert small case + pandas
You can use the apply() function in pandas to apply a function to each element of a Series. To convert all values in a column to lowercase, you can use the str.lower() method within apply(). Here's how you can do it:
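A minimal sketch, assuming a shortened version of the sample DataFrame from note 51:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike'],
                   'Age': [30, 25, 40, 35, 28]})

# apply() calls str.lower on every value; the result replaces the column
df['Name'] = df['Name'].apply(str.lower)
print(df['Name'].tolist())  # ['john', 'alice', 'bob', 'emma', 'mike']
```

Note that apply(str.lower) raises a TypeError if the column contains non-string values such as NaN. The vectorized df['Name'].str.lower() passes missing values through unchanged, so it is usually the safer choice for real data.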
In this example, df['Name'].apply(str.lower) applies the str.lower() method to each element of the 'Name' column, converting all values to lowercase. Then, the result is assigned back to the 'Name' column in the DataFrame df. You can replace 'Name' with the name of any column you want to convert to lowercase.
64
pandas drop specific column
To drop a specific column from a DataFrame in pandas, you can use the .drop() method with the axis parameter set to 1. Here's how you can do it:
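A minimal sketch, assuming a shortened version of the sample DataFrame from note 51:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [30, 25, 40],
                   'Gender': ['Male', 'Female', 'Male']})

# drop() returns a new DataFrame, so assign the result back to keep the change
df = df.drop('Gender', axis=1)
print(df.columns.tolist())  # ['Name', 'Age']
```

Writing df.drop(columns='Gender') is equivalent and avoids having to remember which axis number means columns.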
In this example, df.drop('Gender', axis=1) drops the 'Gender' column from the DataFrame df. The axis=1 parameter specifies that we want to drop a column (as opposed to dropping a row, which would be axis=0). The resulting DataFrame df does not contain the 'Gender' column.
65
how to check if any value is NaN in a pandas DataFrame
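You can check if any value is NaN (missing) in a pandas DataFrame using the .isna() method followed by two .any() calls: the first reduces each column to a single boolean, and the second reduces those booleans to one value. Here's how you can do it: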
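A minimal sketch; the sample data with one missing age is a made-up assumption, and is_nan is the name used in the explanation below:

```python
import pandas as pd

# None in a numeric column is stored as NaN
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [30, None, 40]})

# isna() -> DataFrame of booleans; first any() -> one boolean per column;
# second any() -> a single boolean for the whole DataFrame
is_nan = df.isna().any().any()
print(is_nan)  # True
```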
In this example, df.isna() returns a DataFrame of the same shape as df, where each element is True if the corresponding element in df is NaN, and False otherwise. Then, .any().any() checks whether there is any True value in the resulting DataFrame, which indicates the presence of NaN values. If any NaN value is found, is_nan is True; otherwise, it is False.
66
exclude "Unnamed" columns in pandas
When reading a CSV file into a pandas DataFrame, pandas sometimes creates columns named "Unnamed: 0", "Unnamed: 1", and so on. This usually happens because of extraneous commas in the file, an index column that was saved without a name, or missing column names in the header. To exclude these "Unnamed" columns, you can filter out columns whose names start with "Unnamed". Here's how you can do it:
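A minimal sketch that reads CSV text from memory instead of a file; the CSV content is a made-up assumption with an empty first header field, which is what produces the "Unnamed: 0" column:

```python
import io
import pandas as pd

# The header starts with an empty field, as happens when an index was saved
csv_text = ",Name,Age\n0,John,30\n1,Alice,25\n"
df = pd.read_csv(io.StringIO(csv_text))
print(df.columns.tolist())  # ['Unnamed: 0', 'Name', 'Age']

# Build a boolean mask of column names that start with "Unnamed",
# invert it with ~, and keep only the remaining columns
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
print(df.columns.tolist())  # ['Name', 'Age']
```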
In this example, df.columns.str.contains('^Unnamed') returns a boolean mask indicating whether each column name starts with "Unnamed" (the ^ anchors the match to the beginning of the name). We use ~ to invert this mask, so that it's True for columns that do not start with "Unnamed". Then, we use .loc[] to select columns based on this mask, effectively excluding the "Unnamed" columns.
67
loop pandas rows
You can loop through rows in a pandas DataFrame using the .iterrows() method. Here's how you can do it:
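A minimal sketch, assuming a shortened version of the sample DataFrame from note 51. Here the loop collects a short description of each row instead of printing it directly; the list name descriptions is an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [30, 25, 40]})

descriptions = []
# iterrows() yields (index, row) pairs; each row is a pandas Series
for index, row in df.iterrows():
    descriptions.append(f"{index}: {row['Name']} is {row['Age']}")

print(descriptions)
```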
In each iteration of the loop, index represents the index of the current row, and row represents the values of the current row as a pandas Series. You can access individual values within the row using dictionary-like indexing, such as row['Name'], row['Age'], etc.
68
read_excel with column data format + pandas
When reading an Excel file with read_excel in pandas, you can specify the data types of columns using the dtype parameter. Here's how you can do it:
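A minimal sketch using the column names and types described below. The helper read_typed_excel and the dictionary name column_types are hypothetical, and the call itself is left commented out because it needs a real file and an Excel engine such as openpyxl:

```python
import pandas as pd

# Column names and types from the explanation; adjust them to your sheet
column_types = {'Column1': int, 'Column2': str, 'Column3': float}

def read_typed_excel(path):
    """Read an Excel file, forcing the column types declared above."""
    return pd.read_excel(path, dtype=column_types)

# Usage (assumes the file exists and an Excel engine is installed):
# df = read_typed_excel('your_excel_file.xlsx')
# print(df.dtypes)
```

An int column cannot hold missing values, so a blank cell in 'Column1' may make the read fail; pandas' nullable 'Int64' dtype is one way around that.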
Replace 'your_excel_file.xlsx' with the path to your Excel file, and specify the column names and their corresponding data types in the dtype parameter. In the example above, 'Column1' is set as integer (int), 'Column2' as string (str), and 'Column3' as float (float).
By specifying the data types, you can ensure that pandas interprets the data correctly when reading the Excel file. This can be useful for preventing pandas from inferring incorrect data types, especially for columns containing numerical or date values.