[part 6] - how to merge two dataframes in R using RStudio

You will rarely get all your data in one file. Generally, the data are spread across multiple files, and you need to merge them before analysis. In this article, we'll see how to merge two files based on a common column.

About data

We'll use the songs dataset for all illustrations. You can download the song dataset by clicking here.

# Read the two datasets
Songs_DF <- read.csv("Hindi_Songs.csv")
Movie_DF <- read.csv("Movie.csv")

Problem statement

We want the Year column from Movie_DF in the songs data frame. To get it, we merge Songs_DF with Movie_DF on their common column, Movie.

Merge_DF <- merge(Songs_DF, Movie_DF, by = "Movie", all.x = TRUE)

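This statement does not modify Songs_DF. It returns a new data frame, Merge_DF, with the Movie column first, then the remaining columns of Songs_DF, and finally the remaining columns of Movie_DF (such as Year) at the end. Because all.x = TRUE, every row of Songs_DF is kept, and songs whose movie is not found in Movie_DF get NA in the added columns.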
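
To see this behaviour without the CSV files, here is a minimal sketch with two small, made-up data frames. The names songs and movies and all their values are assumptions for illustration, not rows from the actual dataset.

```r
# Toy stand-ins for Songs_DF and Movie_DF (values are made up)
songs <- data.frame(
  Song  = c("Song A", "Song B", "Song C"),
  Movie = c("Movie 1", "Movie 2", "Movie 3"),
  stringsAsFactors = FALSE
)
movies <- data.frame(
  Movie = c("Movie 1", "Movie 2"),
  Year  = c(2001, 2005),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of songs, even when no movie matches
merged <- merge(songs, movies, by = "Movie", all.x = TRUE)

names(merged)  # key column first, then the rest of songs, then the rest of movies
nrow(merged)   # same number of rows as songs
merged$Year    # NA for "Movie 3", which has no match in movies
names(songs)   # the original data frame is unchanged
```

"Movie 3" has no entry in movies, so its Year is NA in the result, while songs itself still has only its two original columns.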

The merge function is most often called with four kinds of arguments:

  • x: the first (primary) data frame
  • y: the second (secondary) data frame
  • by, by.x, by.y: the columns used for merging
  • all, all.x, all.y: logical flags that set the type of merge
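
When the common column has a different name in each data frame, by.x and by.y can name it separately. The sketch below assumes a hypothetical movie table, movies_alt, whose key column is called Title. That table and its values are made up for illustration.

```r
songs <- data.frame(
  Song  = c("Song A", "Song B"),
  Movie = c("Movie 1", "Movie 2"),
  stringsAsFactors = FALSE
)
# Hypothetical variant of the movie table whose key column is called Title
movies_alt <- data.frame(
  Title = c("Movie 1", "Movie 2"),
  Year  = c(2001, 2005),
  stringsAsFactors = FALSE
)

# by.x names the key in the first data frame, by.y the key in the second
merged <- merge(songs, movies_alt, by.x = "Movie", by.y = "Title")

names(merged)  # the key keeps its name from the first data frame: Movie
```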

R merge options
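
The all, all.x and all.y flags decide which unmatched rows survive the merge. As a rough sketch, the toy data below (made-up values, not the real dataset) has one song without a movie and one movie without a song, so each option gives a different row count.

```r
songs <- data.frame(
  Song  = c("Song A", "Song B", "Song C"),
  Movie = c("Movie 1", "Movie 2", "Movie 3"),
  stringsAsFactors = FALSE
)
movies <- data.frame(
  Movie = c("Movie 1", "Movie 2", "Movie 4"),
  Year  = c(2001, 2005, 2010),
  stringsAsFactors = FALSE
)

# Default (all = FALSE): keep only rows whose Movie appears in both
inner <- merge(songs, movies, by = "Movie")
# all.x = TRUE: keep every row of the first data frame
left  <- merge(songs, movies, by = "Movie", all.x = TRUE)
# all.y = TRUE: keep every row of the second data frame
right <- merge(songs, movies, by = "Movie", all.y = TRUE)
# all = TRUE: keep every row of both
full  <- merge(songs, movies, by = "Movie", all = TRUE)

# Row counts: inner 2, left 3, right 3, full 4
sapply(list(inner = inner, left = left, right = right, full = full), nrow)
```

Unmatched rows are filled with NA in the columns that come from the other data frame.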
