Reading a CSV file in Python

# Import the pandas package under the alias pd
import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv("1.csv")
print(df)
print("-" * 80)

# Delete null values
# dropna() removes every row that contains at least one null value (NaN)
# and returns a new DataFrame; df itself is unchanged.
df1 = df.dropna()
print(df1)
print("-" * 80)

# Commonly used parameters of the dropna method
# The syntax is as follows:
#   pd.DataFrame.dropna(axis=0, how="any", thresh=None, subset=None, inplace=False)
# 1. axis: The default is 0, which removes rows that contain null values.
#    With axis=1, columns that contain null values are removed instead.
# 2. how: The default is "any": a row (or column) is removed if it contains any NA.
#    With how="all", it is removed only if all of its values are NA.
# 3. thresh: The minimum number of non-null values a row (or column) must have
#    to be kept. Recent pandas versions do not accept how and thresh together.
# 4. subset: The columns to check. For multiple columns, pass a list of column names.
# 5. inplace: If True, the source DataFrame is modified directly and the
#    method returns None.

# Check a specific column for null values; the result is a Boolean Series
print(df["NUM_BEDROOMS"].isnull())
print("-" * 80)

# Strings that should be treated as missing values when the file is read
missing_value = ["n/a", "na", "--", "NaN"]
df = pd.read_csv("1.csv", na_values=missing_value)

# Delete the rows that contain those values
# inplace=True modifies df itself; without it, dropna() returns a new
# DataFrame and df keeps its rows.
df.dropna(inplace=True)
print(df)
print("-" * 80)

# Check only certain columns for null values, delete the affected rows,
# and overwrite the source data
df.dropna(subset=["PID", "ST_NUM"], inplace=True)
print(df)
print("-" * 80)

# Strings that should be treated as missing values
missing_value = ["n/a", "na", "--", "NaN"]
df = pd.read_csv("1.csv", na_values=missing_value)

# Replace every missing value in the DataFrame with 123 (all columns)
# df.fillna(123, inplace=True)
# -------------------------------------------------------------
# Fill only the specified column with 123
# The result is assigned back to the column: calling fillna(inplace=True) on
# df["ST_NUM"] may act on a temporary copy and leave df unchanged in recent
# pandas versions.
df["ST_NUM"] = df["ST_NUM"].fillna(123)
print(df)
print("-" * 80)

# Strings that should be treated as missing values
missing_value = ["n/a", "na", "--", "NaN"]
df = pd.read_csv("1.csv", na_values=missing_value)

# Get the mean of the specified column
avg = df["ST_NUM"].mean()

# Replace the null values in that column with the mean
df["ST_NUM"] = df["ST_NUM"].fillna(avg)
print(avg)
print(df)
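
The dropna parameters listed in the comments above are easier to compare side by side on a tiny table. The following is a minimal sketch, assuming a small in-memory DataFrame in place of 1.csv: the column names are reused from the code above, but the values are made up for illustration and are not taken from the original file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for 1.csv (values are invented).
df = pd.DataFrame(
    {
        "PID": [1.0, 2.0, np.nan, 4.0],
        "ST_NUM": [104.0, np.nan, np.nan, 201.0],
        "NUM_BEDROOMS": [3.0, np.nan, np.nan, np.nan],
    }
)

# how="any" (the default): drop every row that has at least one NaN.
rows_any = df.dropna()

# how="all": drop a row only when every value in it is NaN.
rows_all = df.dropna(how="all")

# thresh=2: keep only rows that have at least 2 non-NaN values.
rows_thresh = df.dropna(thresh=2)

# subset: look for NaN only in the listed columns.
rows_subset = df.dropna(subset=["PID"])

# axis=1: drop columns instead of rows. The all-NaN row is removed first,
# otherwise every column would contain a NaN and be dropped.
cols = df.dropna(how="all").dropna(axis=1)

# Fill one column with its mean by assigning the result back to the column.
filled = df.copy()
filled["ST_NUM"] = filled["ST_NUM"].fillna(filled["ST_NUM"].mean())

print(len(rows_any), len(rows_all), len(rows_thresh), len(rows_subset))
print(list(cols.columns))
print(filled["ST_NUM"].tolist())
```

None of these calls uses inplace=True, so df is left untouched and each result can be inspected separately.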
