#ted-talks-analysis
Updates: I have started working on this analysis. Here's my notebook -
What I have done till now -
* Some basic exploration taking help from @Erik Kristofer Anderson’s Hello World
* Converted the duration of the talks into seconds
* Figured out the sex of each speaker by analysing his/her profile
What do you think?
Also, it's a public notebook. So, you are welcome to fork it, download it and use it for your own analysis
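For anyone who wants to try the duration step without opening the notebook, here is a minimal sketch. It assumes the durations are stored as "MM:SS" or "H:MM:SS" strings, which may not match the real dataset; `duration_to_seconds` is a made-up helper name, not something from the notebook.

```python
def duration_to_seconds(text):
    """Convert a 'MM:SS' or 'H:MM:SS' string into a number of seconds.

    Assumption: the duration is colon-separated; the real dataset may
    use a different format.
    """
    seconds = 0
    # Each colon-separated part is worth 60 times the part after it.
    for part in text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Made-up sample values, not taken from the dataset.
durations = ["18:02", "1:03:15", "5:47"]
seconds = [duration_to_seconds(d) for d in durations]
print(seconds)
```

If the column turns out to hold something else (for example plain minutes), only the body of the helper would need to change.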
Things to try:
• Topic modelling
Refer -
• Comparing word usage - between TED/TEDplus, men/women, etc
• Changes in word use
Refer - ,
• Words at the beginning and end
• Sentiment analysis
Refer - ,
• Gender and verbs analysis
Refer - , ,
I have gone through the detailed description of the project provided by the authors and jotted down the exact paragraphs where the authors suggested things to find using this dataset. I am opening the edits to this post. So, please add your own questions/ideas here.
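To make the "comparing word usage" idea above a bit more concrete, here is a rough sketch of one possible approach: count words in two groups of transcripts and rank them by the difference in relative frequency. The two groups below are toy sentences, not real transcripts, and the function names are hypothetical. This is a simple frequency gap, not any method the dataset authors describe.

```python
import re
from collections import Counter


def word_counts(texts):
    """Count lowercase word tokens across a list of transcripts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def relative_freq(counts):
    """Turn raw counts into shares of all tokens in the group."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def usage_gap(texts_a, texts_b):
    """Return (gap, word) pairs, most typical of group A first."""
    freq_a = relative_freq(word_counts(texts_a))
    freq_b = relative_freq(word_counts(texts_b))
    words = set(freq_a) | set(freq_b)
    gaps = [(freq_a.get(w, 0) - freq_b.get(w, 0), w) for w in words]
    return sorted(gaps, reverse=True)


# Toy stand-ins for two groups of transcripts.
group_a = ["We build robots and we test robots"]
group_b = ["We tell stories and we share stories"]

gaps = usage_gap(group_a, group_b)
print("Most typical of group A:", gaps[0][1])
print("Most typical of group B:", gaps[-1][1])
```

With real transcripts it would probably help to drop very rare words first, since they can dominate a raw frequency gap.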
Great initiative, indeed @Erik Kristofer Anderson!
Thanks!
Good news: I built a hello world Python program for the TED talk data.
I did the following: signed up for , made a repl (it's what they call the place where you keep files and data and can run python files in their environment). It was very user friendly, although at times I stumbled against some unfamiliar features of it.
Anyway, I wrote a program which downloads the full data set then prints some information about it, including the full text of the first talk.
I hope this helps people get started playing with this dataset!
Here's the link. I believe you can open it, run it, and tinker with it and you don't even need to sign up to do that. Although if you'd like to save your changes you'd probably need to sign up.
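For readers without access to the repl, a hello-world along these lines might look roughly like the sketch below. It is not the actual program: the download step is replaced by a tiny in-memory CSV, and the column names (`title`, `speaker`, `text`) are assumptions that may differ from the real dataset.

```python
import csv
import io

# Stand-in for the downloaded file; the rows and columns are invented.
SAMPLE_CSV = """title,speaker,text
Talk one,Speaker A,"Hello, this is the first transcript."
Talk two,Speaker B,Second transcript here.
"""


def load_talks(source):
    """Read talks from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def summarize(talks):
    """Print basic information and the full text of the first talk."""
    print(f"Number of talks: {len(talks)}")
    print(f"Columns: {list(talks[0].keys())}")
    print("First talk transcript:")
    print(talks[0]["text"])


talks = load_talks(io.StringIO(SAMPLE_CSV))
summarize(talks)
```

To run it against the real data, `io.StringIO(SAMPLE_CSV)` could be swapped for an opened copy of the downloaded file, adjusting the column names as needed.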
@NITISH SARIN I'm also new here, and fairly new to data analysis stuff. I guess a place to start would be brainstorming questions to ask the data. (If I may anthropomorphize data.) Nityesh mentions on the dev post that the suggestion is to "Analyze these transcripts to reveal some intricacies about our culture"
So let's start there. I'll be back in a bit once I've thought of a few.
@Nityesh Agarwal, I agree. This week I'll be traveling for work, but I'll try to analyze and propose ideas as soon as I can
So I mentioned a few interesting project ideas in my article -
. The TED talks dataset is one of them. Here, in this channel, we are going forward with this idea and trying to do some analysis using that dataset.
Awesome! Great to have you @NITISH SARIN
Hi. I am fairly new to this Data Analysis stuff.
Any leads on what we are trying to do?
I am from a Java development background. But up for anything new.
Probably a gist of what we are trying to achieve would be helpful. 🙂
BTW, here's the Data Is Plural entry on this dataset:
TED talks. Katherine M. Kinnaird and John Laudun — professors whose research includes cultural analytics and computational folklore studies — have created a dataset of 2,656 TED talks, with metadata and transcripts, and have published a detailed description of the project.
IMHO we should first try to explore the dataset and everything related to it with the goal of finding questions that we might answer using the data. What do you think?
Hi @Luiz Oliveira @NITISH SARIN 🙂