Be Kind Rewind is a YouTube channel mainly about past best actress winners, women in film, and film and Hollywood history. The videos delve into the career of these accomplished women and the circumstances that lead to their win. The videos are well-researched and very insightful. I love history, I love film, and I love women in film so I truly enjoy every video from this channel. And so I wanted to see what other people thought about this channel and its videos.
In this project, I apply natural language processing (NLP) to determine the general sentiments on this channel. The results of this project will provide the creator behind this YouTube channel a good idea of how viewers perceive the content produced.
The goals of this project are:
The data was scraped using Google Youtube Data API version 3.0. There are two datasets: one for the details for each of the 67 videos on the channel and one for the comments posted for all of the videos on the channel. The first dataset was used to analyze the overall performance of the channel. The second dataset was used to get the sentiments from viewers.
After importing the libraries and extracting the data of the channel from YouTube, we prepared the data for analysis by doing the following:
The exploratory data analysis was performed using SQL to retrieve information from the datasets. Python was also used for the analysis to visualize the analysis. Some of the notable insights gained from this analysis are as follows:
The most viewed videos are those that discuss movies and personalities who are familiar to today’s audiences. The channel goes over the story of past best actress winners all the way from the infancy of the Oscars. Average viewers may not be able to easily recognize and relate to the subject of some of the videos especially those from the Golden Age of Hollywood. The top 4 most viewed videos cover movies that were recently made (or remade) or actresses that are currently active who won their Oscar within the last 2 decades. The top 5 and 6 most watched video also has this characteristic in a way. They delve into the story of 2 actresses from the Golden Age of Hollywood who were the subjects of a recent popular TV series.
The average views of channel per video is 352,354.
Videos with the most comments are also those that cover actresses and movies that are more recent as they would be more familiar to today’s viewers.
The average number of comments on the channel’s videos is 1,246.
Video comments are correlated to views since the more views a video has, the more chance the video will have of getting comments. Thus, to be able to better compare the comment count of videos, we need to consider them in the context of their view counts. The comment count was divided by the total views of each video and then multiplied by 1000. Using this ratio, we can see that the most commented video per 1000 views is the channel’s Q&A video. Perhaps the reason behind this is that this video is a way for the creator of the channel to engage with the channel’s audience and this, in turn, resulted to more comments.
Comments have an average length of 33 words.
There is a high correlation between viewCount, likeCount, and commentCount. The more a video is viewed, the higher the chance of it getting likes and comments from the viewers.
As seen in the correlation matrix, there is a positive correlation between commentCount and viewCount as well as likeCount and viewCount. Videos with higher viewCounts have higher like counts and comment counts. There is no distinguishable relationship between durationSecs and viewCount as well as tagCount and viewCount.
TextBlob was used to analyze each comment and determine their subjectivity and polarity. Polarity defines orientation of a statement whether it expresses a positive, neutral, or negative sentiment. It expressed as a score from -1 to +1 where -1 means a negative statement and +1 means a positive statement. Subjectivity measures the amount of personal opinion or factual information contained in a text. It’s value is from 0 to 1 with 1 meaning it expresses personal opinion.
Based on the polarity score, each comment was tagged as ‘Positive’ if the score > 0, ‘Neutral’ if the score = 0, and ‘Negative’ if the score < 0.
The channel has more positive comments than neutral or negative comments. With 63.53% of the comments being positive, viewers generally like the content available on the channel Be Kind Rewind.
The videos have an average of 64.88% of their comments being positive. Negative comments are only 11.63% on average for the videos.
Based on the sentiments of comments on the channel’s videos, we can conclude that the channel is generally well received by viewers. Majority of the comments (63.53%) are positive. The videos have an average of 64.88% of their comments being positive. Some of the most used words in the positive comments are love, great, think, time, good, and woman.
The project can be accessed on GitHub through this link with the codes and the full documentation.