Hello Stackers,
I noticed a significant discrepancy between the data shown on my profile dashboard and raw engagement tracking.
This led me down a rabbit hole, asking myself:
- How are SN analytics actually calculated under the hood? Specifically, I'm questioning the latency and aggregation logic.
- Are we relying on real-time stream processing, or is there a batch-processing delay?
- How are nested replies factored into overall post engagement metrics, and how is bot traffic filtered out?
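To make the nested-reply question concrete, here is a minimal sketch of one way replies could be folded into a post's engagement score. The field names, schema, and per-level discounting are my own assumptions for illustration, not SN's actual logic:

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    # Hypothetical shape of a post or comment; these field names are
    # assumptions, not Stacker News's actual schema.
    sats: int
    replies: list["Item"] = field(default_factory=list)

def engagement(item: Item, reply_weight: float = 0.5) -> float:
    """Recursively fold nested replies into one engagement score.

    Each reply's subtree is discounted by reply_weight per nesting
    level -- one of many plausible ways to 'factor in' nested replies.
    """
    return item.sats + reply_weight * sum(
        engagement(r, reply_weight) for r in item.replies
    )

# A post with one top-level reply that itself has one nested reply:
post = Item(sats=100, replies=[Item(sats=40, replies=[Item(sats=20)])])
print(engagement(post))  # 100 + 0.5 * (40 + 0.5 * 20) = 125.0
```

Whether SN does anything like this (and with what weights) is exactly what I'm hoping the devs can clarify.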
Why This Matters
To make Stacker News a better hub for quality content, we need reliable, transparent metrics.
Who this is crucial for:
- Content Creators: understanding what truly resonates with the community.
- The Ecosystem: ensuring rewards align with the actual value generated.
Proposed Improvements & Roadmap
I believe we can take SN analytics to the next level by making them more transparent and predictive:
- Open Source Metric Definitions: a simple public breakdown of how "Views," "Votes," and "Time Spent" are measured, to eliminate discrepancies.
- Predictive Engagement Score: a machine learning model trained on historical data to forecast which posts are likely to trend, helping stackers find high-value content faster.
- Real-Time Dashboard Updates: moving toward a real-time data pipeline to reduce the lag between an interaction and its reflection in the dashboard.
Call to Action for the Devs
@niftynei @keyan @ek, could you shed some light on the current architecture?
Specifically:
- What database is driving the analytics?
- Is there a documented API for these metrics?
- What are your thoughts on current metrics?
- Have you noticed similar discrepancies?
Let’s discuss how to make our data as robust as our community.