
📅  Last modified: 2023-12-03 15:06:06.575000             🧑  Author: Mango

Scraping YouTube Comments with R and TypeScript

Introduction

In this guide, we will discuss how to scrape YouTube comments using the R programming language and TypeScript. In both languages, we will use the YouTube Data API to retrieve the comments of a specific video and then parse the response to extract the author, text, and publication date of each comment.

Prerequisites
  • Basic knowledge of the R programming language.
  • A Google account with access to the Google Cloud Console, where you enable the YouTube Data API and create OAuth credentials.
  • Basic knowledge of TypeScript, as we will build a Node.js program with it.

Scrape YouTube Comments using R

We will use the tuber package for R, which is a wrapper for the YouTube Data API. First, we load the package and authenticate with its yt_oauth function, which runs the OAuth flow through the httr package and stores the resulting token for later calls.

library(tuber)

# Authentication: opens a browser window on first use and caches the token
yt_oauth(app_id = "CLIENT_ID", app_secret = "CLIENT_SECRET")

Note: We need to generate the CLIENT_ID and CLIENT_SECRET by following the instructions in this guide.

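Next, we can use the get_comment_threads function to retrieve the comments for a specific video. We need to provide the video ID, which is the value of the v parameter in the video URL (for example, https://www.youtube.com/watch?v=VIDEO_ID), and the part parameter, which is set to snippet. Setting simplify = FALSE keeps the raw API response, whose items element holds the comment threads.

comments <- get_comment_threads(filter = c(video_id = "VIDEO_ID"), part = "snippet", simplify = FALSE)$items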
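
The video ID can also be pulled out of a URL programmatically. The following is a minimal sketch in TypeScript, the language used for the Node.js program later in this guide. extractVideoId is a hypothetical helper introduced here, not part of tuber or googleapis. It assumes that video IDs are 11 characters long and only handles the common watch?v= and youtu.be/ URL shapes. The ID abcDEF12345 is made up for illustration.

```typescript
// Hypothetical helper: returns the video ID from a YouTube URL,
// or null when the URL does not match one of the supported shapes.
function extractVideoId(url: string): string | null {
    // Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
    const patterns: RegExp[] = [
        // Standard links: https://www.youtube.com/watch?v=<id>
        /^https?:\/\/(?:www\.|m\.)?youtube\.com\/watch\?(?:.*&)?v=([A-Za-z0-9_-]{11})(?:[&#]|$)/,
        // Short links: https://youtu.be/<id>
        /^https?:\/\/youtu\.be\/([A-Za-z0-9_-]{11})(?:[?#/]|$)/,
    ];

    for (const pattern of patterns) {
        const match = pattern.exec(url);
        if (match) {
            return match[1] ?? null;
        }
    }
    return null;
}

// Example usage:
// extractVideoId('https://www.youtube.com/watch?v=abcDEF12345') returns 'abcDEF12345'
// extractVideoId('https://example.com/watch?v=abcDEF12345') returns null
```

The returned string can then be used wherever this guide writes VIDEO_ID.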

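The comments object is a list of comment threads, with each element corresponding to one top-level comment and its metadata. A single request returns at most 100 threads; tuber's get_all_comments function can be used when a video has more. We can extract the relevant details by looping through the elements of the list.

for (i in seq_along(comments)) {
    # The top-level comment is nested under snippet$topLevelComment
    top_comment <- comments[[i]]$snippet$topLevelComment$snippet
    comment_author <- top_comment$authorDisplayName
    comment_text <- top_comment$textDisplay
    comment_date <- top_comment$publishedAt
    cat(comment_date, comment_author, comment_text, "\n")
}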

Building a TypeScript program

We can use TypeScript to build a Node.js program that retrieves YouTube comments. We will use the google-auth-library package to authenticate and the googleapis package to access the YouTube Data API. The googleapis client makes the HTTP requests itself, so no separate fetch library is needed.

import { OAuth2Client } from 'google-auth-library';
import { google } from 'googleapis';

async function main(): Promise<void> {
    const videoId = 'VIDEO_ID';

    // OAuth 2.0 client; replace the placeholders with your own credentials.
    const auth = new OAuth2Client({
        clientId: 'CLIENT_ID',
        clientSecret: 'CLIENT_SECRET',
    });
    auth.setCredentials({
        access_token: 'ACCESS_TOKEN',
        refresh_token: 'REFRESH_TOKEN',
    });

    const youtube = google.youtube({
        version: 'v3',
        auth,
    });

    // Returns one page of at most 100 comment threads (the API's per-page maximum).
    const comments = await youtube.commentThreads.list({
        part: ['snippet'],
        videoId,
        maxResults: 100,
    });

    for (const thread of comments.data.items ?? []) {
        // The top-level comment is nested under snippet.topLevelComment.
        const snippet = thread.snippet?.topLevelComment?.snippet;
        if (!snippet) continue;
        const commentAuthor = snippet.authorDisplayName;
        const commentText = snippet.textDisplay;
        const commentDate = snippet.publishedAt;
        console.log(commentDate, commentAuthor, commentText);
    }
}

// Report failures (for example, invalid credentials) instead of leaving the promise unhandled.
main().catch((error) => {
    console.error(error);
    process.exitCode = 1;
});

Note: We need to generate the ACCESS_TOKEN, REFRESH_TOKEN, CLIENT_ID, and CLIENT_SECRET by following the instructions in this guide.
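
The commentThreads.list call in the program above returns a single page of at most 100 threads. When a video has more, the response carries a nextPageToken that can be passed back as pageToken to fetch the next page. The following is a minimal sketch of that loop and is not part of the original program: collectAllComments, CommentRecord, FetchPage, and the maxPages cap are hypothetical names introduced here, and the page fetcher is passed in as a function so that the logic can be tried without network access.

```typescript
// Minimal shape of the fields this sketch reads from one commentThreads.list page.
interface CommentThreadPage {
    items?: Array<{
        snippet?: {
            topLevelComment?: {
                snippet?: {
                    authorDisplayName?: string | null;
                    textDisplay?: string | null;
                    publishedAt?: string | null;
                };
            };
        };
    }>;
    nextPageToken?: string | null;
}

// Flat record for one top-level comment (hypothetical type for this example).
interface CommentRecord {
    author: string;
    text: string;
    publishedAt: string;
}

// Fetches one page; pageToken is undefined for the first request.
type FetchPage = (pageToken?: string) => Promise<CommentThreadPage>;

// Follows nextPageToken until the API stops returning one or maxPages is reached.
async function collectAllComments(
    fetchPage: FetchPage,
    maxPages = 50, // assumed safety cap so a misbehaving fetcher cannot loop forever
): Promise<CommentRecord[]> {
    const records: CommentRecord[] = [];
    let pageToken: string | undefined = undefined;

    for (let page = 0; page < maxPages; page++) {
        const data: CommentThreadPage = await fetchPage(pageToken);

        for (const thread of data.items ?? []) {
            const snippet = thread.snippet?.topLevelComment?.snippet;
            if (!snippet) continue; // skip threads without a top-level snippet
            records.push({
                author: snippet.authorDisplayName ?? '',
                text: snippet.textDisplay ?? '',
                publishedAt: snippet.publishedAt ?? '',
            });
        }

        if (!data.nextPageToken) break; // last page reached
        pageToken = data.nextPageToken;
    }
    return records;
}
```

With the youtube client from the program above, fetchPage could wrap youtube.commentThreads.list({ part: ['snippet'], videoId, maxResults: 100, pageToken }) and return the data property of the response. Each page counts against the API quota, which is one reason to keep a cap such as maxPages.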

Conclusion

In this guide, we discussed how to retrieve YouTube comments using the R programming language and TypeScript. In both cases, we used the YouTube Data API to fetch the comment threads of a specific video and then extracted the author, text, and publication date of each comment.