You will rarely get all your data in one file. Generally, the data are spread across multiple files and you need to merge them for analysis. In this article, we'll see how to merge two files based on common columns.
About data
We'll use the songs dataset for all illustrations. You can download the song dataset by clicking here.
# Read the two datasets
Songs_DF <- read.csv("Hindi_Songs.csv")
Movie_DF <- read.csv("Movie.csv")
Problem statement
I want the Year column in the songs dataframe. To get it, we have to merge Songs_DF with Movie_DF using the common column Movie.
Merge_DF <- merge(Songs_DF, Movie_DF, by = "Movie", all.x = TRUE)
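This statement returns a new dataframe, Merge_DF, that contains every row of Songs_DF with the remaining columns of Movie_DF appended at the end. Songs_DF itself is not modified. Because all.x = TRUE, songs whose Movie has no match in Movie_DF are kept, with NA in the added columns.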
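To see what this kind of merge produces without downloading anything, here is a minimal sketch on two toy dataframes. The names songs, movies and merged, and all the values in them, are made up for illustration; they only mimic the shape we assume Songs_DF and Movie_DF have (a shared Movie column, and Year present only in the movie table).

```r
# Toy stand-ins for Songs_DF and Movie_DF (made-up values, not the real data)
songs <- data.frame(
  Song  = c("Song A", "Song B", "Song C"),
  Movie = c("Movie 1", "Movie 2", "Movie 3"),
  stringsAsFactors = FALSE
)
movies <- data.frame(
  Movie = c("Movie 1", "Movie 2"),
  Year  = c(2001, 2005),
  stringsAsFactors = FALSE
)

# Keep every row of songs and add Year wherever Movie matches
merged <- merge(songs, movies, by = "Movie", all.x = TRUE)

print(merged)        # the key column Movie comes first, then Song, then Year
print(names(songs))  # songs is unchanged: merge returns a new dataframe
```

"Movie 3" has no entry in movies, so its row is kept and its Year is NA.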
The merge function accepts several arguments; the four kinds used most often are:
- Primary dataframe (x)
- Secondary dataframe (y)
- Specification of the columns used for merging: by, by.x, by.y
- Logical options that specify the type of merge: all, all.x, all.y
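The arguments listed above can be tried out on a small example. The sketch below uses two hypothetical dataframes, left_df and right_df, whose key columns have different names (id and key), so by.x and by.y are used instead of by. The data is invented purely to show how all, all.x and all.y change which rows are kept.

```r
# Hypothetical dataframes whose key columns have different names
left_df  <- data.frame(id  = c(1, 2, 3), x_val = c("a", "b", "c"),
                       stringsAsFactors = FALSE)
right_df <- data.frame(key = c(2, 3, 4), y_val = c("B", "C", "D"),
                       stringsAsFactors = FALSE)

# Default (all = FALSE): keep only keys present in both dataframes
inner <- merge(left_df, right_df, by.x = "id", by.y = "key")

# all.x = TRUE: keep every row of the primary dataframe
left <- merge(left_df, right_df, by.x = "id", by.y = "key", all.x = TRUE)

# all.y = TRUE: keep every row of the secondary dataframe
right <- merge(left_df, right_df, by.x = "id", by.y = "key", all.y = TRUE)

# all = TRUE: keep every row of both dataframes
full <- merge(left_df, right_df, by.x = "id", by.y = "key", all = TRUE)

print(nrow(inner))  # rows for ids 2 and 3
print(nrow(full))   # rows for ids 1, 2, 3 and 4
```

When by.x and by.y differ, the key column in the result takes its name from the primary dataframe (id here). Rows without a match on the other side are filled with NA.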