1 Hundred: An AI assisted analysis of Cybercrimeology
Description
Summary:
The main points of this episode are:
- Celebrating the 100th episode of Cybercrimeology and reflecting on the podcast's journey over the past three years.
- Discussing the use of new technologies, such as AI, for analyzing and understanding the podcast's content.
- Analyzing the podcast's content using natural language processing and summarization techniques to identify recurring themes and research topics.
- Identifying common themes in the podcast, including abuse in relationships, privacy invasion, law enforcement in cybercrime, social engineering, and age-related factors in cybercrime.
- Discussing various research methodologies covered in the podcast, such as technographs, online experiments, and survey research.
- Highlighting the dedication of guests who share their time and research without any financial incentives.
- Answering questions about the process of creating each episode, including research, interviews, editing, and production.
- Discussing the volume of work represented by 99 episodes totaling over 5 hours of content and involving 96 guests.
- Reflecting on the impact of the podcast and its growth over the past three years, including achieving 100,000 downloads.
- Looking forward to the future of the podcast and the potential for new technologies to enhance its content and reach.
About our guests:
Alloy:
https://platform.openai.com/docs/guides/text-to-speech
voicing generations from ChatGPT:
https://openai.com/blog/chatgpt
Papers or resources mentioned in this episode:
The BART model:
https://huggingface.co/docs/transformers/model_doc/bart
The DistilBERT model:
https://huggingface.co/docs/transformers/model_doc/distilbert
Results:
Which terms were spoken about the most, and what was the sentiment around them?
Noun | Occurrences | FilesOccurredIn | SentimentScoreSum |
people | 2529 | 94 | 92.60830581188202 |
time | 1133 | 83 | 79.5210649 |
research | 1396 | 80 | 79.49750900268553 |
way | 1005 | 74 | 73.79837167263031 |
things | 1238 | 73 | 72.45885318517685 |
lot | 1117 | 71 | 70.87118428945543 |
data | 903 | 46 | 44.24124717712402 |
kind | 667 | 44 | 43.9891608 |
crime | 885 | 43 | 42.725725710392005 |
cyber | 805 | 41 | 39.68457114696503 |
cybercrime | 481 | 38 | 36.90566980838775 |
thing | 393 | 36 | 35.59294366836548 |
security | 527 | 31 | 30.89444762468338 |
information | 467 | 29 | 28.87013864517212 |
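The columns of this table can be produced with a simple aggregation. The following is a minimal sketch, not the pipeline actually used for the episode. It assumes each transcript has already been reduced to a list of nouns (for example by a part-of-speech tagger) and one sentiment score per file (for example from a DistilBERT classifier), and that SentimentScoreSum adds up the scores of the files in which a noun occurs. The function name `summarise_nouns` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def summarise_nouns(files):
    """Aggregate noun statistics across transcripts.

    files: dict mapping file name -> (list of nouns, sentiment score).
    Returns rows of (noun, occurrences, files_occurred_in, sentiment_score_sum),
    sorted by the number of files a noun occurs in, largest first.
    """
    occurrences = Counter()
    files_with_noun = defaultdict(set)
    sentiment_sum = defaultdict(float)

    for name, (nouns, score) in files.items():
        occurrences.update(nouns)          # every mention counts
        for noun in set(nouns):            # each file counts once per noun
            files_with_noun[noun].add(name)
            sentiment_sum[noun] += score   # assumption: one score per file

    rows = [
        (noun, count, len(files_with_noun[noun]), sentiment_sum[noun])
        for noun, count in occurrences.items()
    ]
    rows.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rows


# Hypothetical input: two tiny "transcripts" with made-up sentiment scores.
episodes = {
    "ep001.txt": (["people", "research", "people"], 0.75),
    "ep002.txt": (["people", "data"], 0.5),
}

for noun, count, n_files, score_sum in summarise_nouns(episodes):
    print(f"{noun} | {count} | {n_files} | {score_sum} |")
```

Sorting by the number of files rather than raw occurrences matches the order of the table above, where "time" (83 files) ranks ahead of "research" (80 files) despite fewer mentions.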
Was there a change in the sentiment of the podcast after the end of pandemic conditions, assuming that the pandemic ended at the end of Q3 2021?
The model is given by:
$y_i \sim \mathrm{Normal}(\mu_i, \sigma)$
where
$\mu_i = \beta_0 + \beta_{\text{after\_event}} \cdot x_i$
Here, the parameters are defined as follows:
- $\beta_0$: Intercept, with a Student's t-distribution prior with 3 degrees of freedom, a location parameter of 0.8, and a scale parameter of 2.5.
- $\beta_{\text{after\_event}}$: Coefficient for the predictor variable (after_event), with a flat prior.
- $\sigma$: Standard deviation of the response variable, with a Student's t-distribution prior with 3 degrees of freedom, a location parameter of 0, and a scale parameter of 2.5.
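To make the structure of this model concrete, here is a minimal sketch in Python. It is not the Bayesian fit reported in these notes: it assumes the predictor is a 0/1 indicator (0 before the event, 1 after), ignores the priors, and recovers the parameters from group means, which is the least-squares analogue for a model with a single binary predictor. The function names and the "true" values used for the simulation are hypothetical.

```python
import random
import statistics


def simulate(beta0, beta_after, sigma, n_before, n_after, seed=0):
    """Draw y_i ~ Normal(beta0 + beta_after * x_i, sigma) for a 0/1 predictor."""
    rng = random.Random(seed)
    x = [0] * n_before + [1] * n_after
    y = [rng.gauss(beta0 + beta_after * xi, sigma) for xi in x]
    return x, y


def fit_two_groups(x, y):
    """Least-squares estimates for a single binary predictor.

    The intercept is the mean of the x == 0 group, and the coefficient is
    the difference between the two group means.
    """
    before = [yi for xi, yi in zip(x, y) if xi == 0]
    after = [yi for xi, yi in zip(x, y) if xi == 1]
    beta0 = statistics.fmean(before)
    beta_after = statistics.fmean(after) - beta0
    residuals = [yi - (beta0 + beta_after * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with two estimated mean parameters.
    sigma = (sum(r * r for r in residuals) / (len(y) - 2)) ** 0.5
    return beta0, beta_after, sigma


# Hypothetical "true" values, chosen only for illustration.
x, y = simulate(beta0=0.4, beta_after=0.3, sigma=0.4,
                n_before=5000, n_after=5000, seed=1)
print(fit_two_groups(x, y))
```

With a flat prior on the coefficient and weak priors elsewhere, a Bayesian fit would be expected to land close to these estimates when there is enough data, but it also returns full posterior distributions rather than single numbers.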
This gave the following results:
Population-Level Effects:
Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
Intercept | 0.37 | 0.06 | 0.26 | 0.48 | 1.00 | 3884 | 2917 |
after_event | 0.39 | 0.08 | 0.23 | 0.54 | 1.00 | 3561 | 2976 |
Family Specific Parameters:
Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
sigma | 0.38 | 0.03 | 0.33 | 0.44 | 1.00 | 3608 | 2817 |
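As a rough reading of the point estimates, and assuming after_event is coded 0 for episodes released before the end of Q3 2021 and 1 for episodes released after it, the expected sentiment in each period works out as:

```latex
\begin{align*}
\mu_{\text{before}} &= \beta_0 \approx 0.37 \\
\mu_{\text{after}}  &= \beta_0 + \beta_{\text{after\_event}} \approx 0.37 + 0.39 = 0.76
\end{align*}
```

The 95% interval for after_event (0.23 to 0.54) excludes zero, which is consistent with higher sentiment after the event under this model.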
Other:
The model overlooked Mike Levi's contribution to the History series. That is a bit unfair.
Where an episode had multiple guests, I did not include them all in the database, hence "no specific guest listed".