In this walkthrough, we explore MongoDB Charts, focusing on its natural language charts feature. About six months have passed since the feature's initial release in summer 2024, which makes this a good time to look at what it can do.
Step 1: Data Overview
We will use a dataset of approximately 6,000 tweets related to the U.S. elections, collected just before Election Day on November 5, 2024. This dataset includes metrics such as the number of likes, replies, retweets, mentioned users, hashtags, and sentiment analysis results generated by a large language model.
To learn more about collecting Twitter data, please check out our online course, Introduction to Database and Data Collection.
Step 2: Accessing MongoDB Charts
Log into your MongoDB account and navigate to the Charts section.
Create a new dashboard, optionally adding a description.
Step 3: Creating Charts
Creating a Simple Count Chart:
Select your tweet collection.
Drag the Tweet ID field into the aggregation pipeline to count the total tweets.
Name this chart "Number of Tweets" and save it.
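A count chart like this boils down to a single counting aggregation. The sketch below is a minimal illustration, not necessarily the pipeline Charts generates: the `tweet_id` field name and the sample documents are assumptions. The pipeline constant shows what one could pass to PyMongo's `collection.aggregate(...)`, and the helper mirrors it in plain Python so the example runs without a database.

```python
# Hypothetical sample documents; the field names are assumptions for illustration.
sample_tweets = [
    {"tweet_id": "1", "text": "Go vote!"},
    {"tweet_id": "2", "text": "Polls open at 7."},
    {"tweet_id": "3", "text": "Election Day is close."},
]

# Aggregation pipeline that counts every document in a collection.
# With PyMongo this could be run as: collection.aggregate(COUNT_PIPELINE)
COUNT_PIPELINE = [{"$count": "number_of_tweets"}]


def count_tweets(tweets):
    """Plain-Python counterpart of COUNT_PIPELINE for a non-empty collection."""
    return {"number_of_tweets": len(tweets)}


result = count_tweets(sample_tweets)
print(result)
```

The chart itself displays only the single number held in `number_of_tweets`.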
Creating a Stacked Area Chart:
Create a new chart to visualize tweet counts over time.
Convert the tweet creation date to the appropriate format and set it as a category.
Drag the Tweet ID again to aggregate the counts.
Add filters for sentiment analysis (e.g., anger) and customize the chart's colors.
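The stacked area chart is, in effect, a count grouped by date and split by sentiment. Here is a rough sketch of that computation, assuming hypothetical `created_at` (ISO 8601 string) and `sentiment` fields. The pipeline constant is one way to express it in MongoDB's aggregation language once `created_at` is stored as a date; the helper does the same thing in plain Python on sample data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample documents; field names and values are assumptions.
sample_tweets = [
    {"tweet_id": "1", "created_at": "2024-11-03T09:15:00", "sentiment": "anger"},
    {"tweet_id": "2", "created_at": "2024-11-03T18:40:00", "sentiment": "joy"},
    {"tweet_id": "3", "created_at": "2024-11-04T07:05:00", "sentiment": "anger"},
    {"tweet_id": "4", "created_at": "2024-11-04T12:30:00", "sentiment": "anger"},
]

# One possible aggregation: bucket by calendar day, split by sentiment, count.
# Assumes created_at is stored as a BSON date rather than a string.
AREA_CHART_PIPELINE = [
    {"$match": {"sentiment": "anger"}},  # optional sentiment filter
    {"$group": {
        "_id": {
            "day": {"$dateToString": {"format": "%Y-%m-%d", "date": "$created_at"}},
            "sentiment": "$sentiment",
        },
        "count": {"$sum": 1},
    }},
    {"$sort": {"_id.day": 1}},
]


def counts_by_day_and_sentiment(tweets, sentiments=None):
    """Count tweets per (day, sentiment); optionally keep only some sentiments."""
    counts = Counter()
    for tweet in tweets:
        if sentiments is not None and tweet["sentiment"] not in sentiments:
            continue
        # Convert the timestamp string to a calendar day, e.g. "2024-11-03".
        day = datetime.fromisoformat(tweet["created_at"]).date().isoformat()
        counts[(day, tweet["sentiment"])] += 1
    return dict(sorted(counts.items()))


all_series = counts_by_day_and_sentiment(sample_tweets)
anger_only = counts_by_day_and_sentiment(sample_tweets, sentiments={"anger"})
```

Each distinct sentiment becomes one stacked series, and each day becomes one point on the x-axis.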
Word Cloud for Annotations:
Use the same dataset to create a word cloud that visualizes popular annotations from tweets.
Note: Word clouds are not recommended for accurate quantitative analysis.
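When exact numbers matter, the frequencies behind the word cloud can be computed directly. This sketch assumes each tweet carries a hypothetical `annotations` array. The pipeline constant is one plausible aggregation for it, and the helper reproduces the same tally in plain Python.

```python
from collections import Counter

# Hypothetical sample documents; the annotations field is an assumption.
sample_tweets = [
    {"tweet_id": "1", "annotations": ["Election Day", "Pennsylvania"]},
    {"tweet_id": "2", "annotations": ["Election Day"]},
    {"tweet_id": "3", "annotations": []},
    {"tweet_id": "4", "annotations": ["Pennsylvania", "Election Day"]},
]

# One possible aggregation: one row per annotation, then count and sort.
WORD_CLOUD_PIPELINE = [
    {"$unwind": "$annotations"},
    {"$group": {"_id": "$annotations", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]


def annotation_frequencies(tweets):
    """Return (annotation, count) pairs, most frequent first."""
    counts = Counter()
    for tweet in tweets:
        # Tweets without annotations contribute nothing to the tally.
        counts.update(tweet.get("annotations", []))
    return counts.most_common()


frequencies = annotation_frequencies(sample_tweets)
```

A word cloud maps these counts to font sizes, which is why small differences between terms are hard to read from the picture alone.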
Geographic Distribution of Tweets:
Use a map chart to show tweet origins based on user locations.
Customize the map to display data specifically from the U.S. for better clarity.
Heat Map for Entities:
Use a heat map to show the relationships between the individuals and organizations mentioned in the tweets, based on the sentiment analysis results.
Filter out empty values for clarity and focus on the top 10 entities.
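The heat map's cells can be thought of as co-occurrence counts between two entity fields, computed after dropping empty values and trimming to the most frequent entities. The following is a minimal sketch under stated assumptions: the `person` and `organization` fields are hypothetical, and "top entities" is interpreted here as the most frequently mentioned people.

```python
from collections import Counter

# Hypothetical sample documents; field names and values are assumptions.
sample_tweets = [
    {"person": "Candidate A", "organization": "Party X"},
    {"person": "Candidate A", "organization": "Party X"},
    {"person": "Candidate A", "organization": "Network Z"},
    {"person": "Candidate B", "organization": "Party Y"},
    {"person": "", "organization": "Party Y"},        # empty person: skipped
    {"person": "Candidate B", "organization": None},  # missing organization: skipped
]


def heatmap_cells(tweets, top_n=10):
    """Count (person, organization) pairs, keeping only the top_n people."""
    pair_counts = Counter()
    person_totals = Counter()
    for tweet in tweets:
        person = tweet.get("person")
        organization = tweet.get("organization")
        # Filter out empty or missing values so they do not form their own cell.
        if not person or not organization:
            continue
        pair_counts[(person, organization)] += 1
        person_totals[person] += 1
    top_people = {name for name, _ in person_totals.most_common(top_n)}
    return {pair: n for pair, n in pair_counts.items() if pair[0] in top_people}


cells = heatmap_cells(sample_tweets)
top_one = heatmap_cells(sample_tweets, top_n=1)
```

Each key in the result corresponds to one cell of the heat map, and the count determines the cell's color intensity.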
Step 4: Utilizing Natural Language Queries
Switch to the natural language query function.
Type in questions like, “Show the top 10 popular Twitter users,” and see how the system generates charts based on your queries.
Test various prompts, adjusting your wording for better results. For instance, asking about the average likes by tweet sentiment can yield insightful outputs.
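To judge whether a generated chart is right, it helps to know what the prompt should compute. A prompt about average likes by tweet sentiment corresponds roughly to a group-and-average aggregation. The sketch below assumes hypothetical `likes` and `sentiment` fields and is not a record of what Charts actually generates. The helper repeats the calculation in plain Python so the chart can be checked against a small sample.

```python
from collections import defaultdict

# Hypothetical sample documents; field names and values are assumptions.
sample_tweets = [
    {"tweet_id": "1", "sentiment": "anger", "likes": 10},
    {"tweet_id": "2", "sentiment": "anger", "likes": 30},
    {"tweet_id": "3", "sentiment": "joy", "likes": 5},
]

# One possible aggregation for "average likes by tweet sentiment".
AVG_LIKES_PIPELINE = [
    {"$group": {"_id": "$sentiment", "avg_likes": {"$avg": "$likes"}}},
    {"$sort": {"avg_likes": -1}},
]


def average_likes_by_sentiment(tweets):
    """Return {sentiment: mean likes}, ordered from highest to lowest mean."""
    totals = defaultdict(lambda: [0, 0])  # sentiment -> [sum of likes, tweet count]
    for tweet in tweets:
        bucket = totals[tweet["sentiment"]]
        bucket[0] += tweet["likes"]
        bucket[1] += 1
    averages = {sentiment: total / n for sentiment, (total, n) in totals.items()}
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))


averages = average_likes_by_sentiment(sample_tweets)
```

Comparing a hand-computed result like this with the generated chart is a quick way to spot prompts that the feature misinterpreted.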
Results Evaluation
While the natural language query has significantly improved, the natural language chart feature still needs refinement. However, you can modify and customize charts manually to better suit your analysis needs.
We explored MongoDB Charts' capabilities, highlighting the natural language query feature while analyzing election-related tweets. The new functionalities open exciting avenues for data exploration, though some areas require further enhancement.