Detailed explanation of finding the difference between two dataframes using Pandas

1. Intersection

intersected = pd.merge(df1, df2, how='inner')

Extension (intersect on specific columns only):

intersected = pd.merge(df1, df2, on=['name'], how='inner')
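As a minimal sketch of the two merge calls, assuming two small hypothetical DataFrames that share the columns name and score (the data is invented for illustration):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "score": [90, 85, 70]})
df2 = pd.DataFrame({"name": ["Bob", "Cid", "Dee"], "score": [85, 60, 75]})

# No `on`: rows must match on every shared column (name and score).
intersected = pd.merge(df1, df2, how="inner")
print(intersected)

# on=["name"]: rows match on name only; the other shared column is
# kept twice, with the default suffixes _x (from df1) and _y (from df2).
intersected_by_name = pd.merge(df1, df2, on=["name"], how="inner")
print(intersected_by_name)
```

Here the first merge keeps only Bob, whose row is identical in both frames, while the second also keeps Cid because only the name has to match.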

2. Difference set (df1-df2 as an example)

diff=pd.concat([df1,df2,df2]).drop_duplicates(keep=False)

Detailed explanation of difference set function:

1. Pandas can combine Series and DataFrame objects with the concat() function. Its most commonly used parameters are: pd.concat(objs, axis=0, join='outer', ignore_index=False). The join_axes parameter that appears in older tutorials was removed in pandas 1.0.
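The effect of the axis, join and ignore_index parameters can be sketched as follows, assuming two hypothetical DataFrames that share only column a:

```python
import pandas as pd

# Hypothetical sample data: the frames share column "a" only.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"a": [5, 6], "c": [7, 8]})

# axis=0 stacks rows; join="outer" keeps the union of the columns and
# fills the missing cells with NaN. The original row labels are kept.
stacked = pd.concat([df1, df2], axis=0, join="outer")
print(stacked)

# join="inner" keeps only the columns present in every input, and
# ignore_index=True renumbers the rows from 0.
common = pd.concat([df1, df2], join="inner", ignore_index=True)
print(common)
```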

2. A DataFrame may contain rows that repeat the same values in one or more columns. The drop_duplicates method removes such rows.

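For example:
import pandas as pd
data = {"a": [2, 2, 3, 5, 5, 10], "c": [4, 5, 6, 7, 8, 12]}
pd_data = pd.DataFrame(data=data)
print(pd_data)
# Rows count as duplicates when their values in column "a" match; keep the last row of each group.
t = pd_data.drop_duplicates(subset=["a"], keep="last", inplace=False)
print(t)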

Explanation:

keep='first' keeps the first occurrence of each group of duplicate rows and is the default. The other two values of keep are 'last', which keeps the last occurrence, and False, which removes every row that has a duplicate.

inplace=True removes the duplicates from the original DataFrame directly and returns None. The default, inplace=False, leaves the original unchanged and returns a new DataFrame.

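subset names the columns that are compared when looking for duplicates; by default all columns are compared. With a list of two columns such as subset=['c','b'], a row counts as a duplicate only when it matches another row in both column c and column b.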
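The three keep values and the inplace flag can be compared side by side. This is a small sketch on hypothetical data with columns b, c and d, where only rows 0 and 1 agree in both b and c:

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 share the same (b, c) pair.
df = pd.DataFrame({"b": [1, 1, 1, 2], "c": [7, 7, 8, 7], "d": [10, 20, 30, 40]})

first = df.drop_duplicates(subset=["c", "b"], keep="first")    # keeps rows 0, 2, 3
last = df.drop_duplicates(subset=["c", "b"], keep="last")      # keeps rows 1, 2, 3
neither = df.drop_duplicates(subset=["c", "b"], keep=False)    # keeps rows 2, 3

# inplace=True modifies df itself, so the call returns None.
result = df.drop_duplicates(subset=["c", "b"], inplace=True)
print(result)
print(df)
```

Column d is ignored when duplicates are identified because it is not listed in subset, even though its values differ in rows 0 and 1.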

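3. Combining concat and drop_duplicates yields the difference set. In pd.concat([df1,df2,df2]), every row of df2 appears at least twice, so drop_duplicates(keep=False) removes all of them, together with any row of df1 that also occurs in df2. Only the rows found in df1 alone remain.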
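The two steps can be traced on a small example. The DataFrames below are hypothetical and exist only to make the intermediate result visible:

```python
import pandas as pd

# Hypothetical sample data: the Bob row appears in both frames.
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "score": [90, 85, 70]})
df2 = pd.DataFrame({"name": ["Bob", "Dee"], "score": [85, 75]})

# Step 1: stack df1 once and df2 twice (3 + 2 + 2 = 7 rows).
stacked = pd.concat([df1, df2, df2])
print(stacked)

# Step 2: drop every row that occurs more than once. All df2 rows go,
# and so does the Bob row of df1, because it also occurs in df2.
diff = stacked.drop_duplicates(keep=False)
print(diff)
```

One caveat of this approach: a row that is repeated inside df1 itself is also removed, even if it never appears in df2, because keep=False drops every repeated row regardless of which frame it came from.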

In addition, there is another way to achieve the same purpose:
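One commonly used alternative, offered here as a sketch and not necessarily the approach intended above, is a left merge with indicator=True. The sample DataFrames are hypothetical:

```python
import pandas as pd

# Hypothetical sample data: the Bob row appears in both frames.
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "score": [90, 85, 70]})
df2 = pd.DataFrame({"name": ["Bob", "Dee"], "score": [85, 75]})

# indicator=True adds a _merge column that records, for each row,
# whether it was found in "left_only", "right_only" or "both".
merged = df1.merge(df2, how="left", indicator=True)

# Keep the rows that exist only in df1, then drop the helper column.
diff = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(diff)
```

Unlike the concat approach, this variant keeps rows that occur more than once within df1, as long as they do not appear in df2.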
