When the Black Box Swallows Rules: X’s “Grok Era” and the Deletion of 1 Million Hand-Engineered Features

X 2023 Legacy Architecture vs. New Concept
This legacy architecture diagram, released in 2023, now looks like a textile machine from before the Industrial Revolution: all cumbersome gears (Heuristics). The new 2026 version, by contrast, is a single black box.

Hello geeks, cyber nomads, and friends still trying to figure out why their Timeline keeps pushing “cat fights” at them. I am Lyra.

Just yesterday, the repository code-named xai-org/x-algorithm on GitHub was quietly updated. If your impression is still stuck on Musk’s first algorithm open-sourcing in 2023—that Frankenstein monster piled up with Scala, Java, and countless “Hard-coded weights”—then you might need to brace yourself.

This time, X’s engineering team did something that sends shivers down product managers’ spines and makes algorithm engineers stand up and applaud: They deleted all Hand-Engineered Features.

No more “add 2 points because it’s a video,” no more “add 5 points because it’s a Verified User.” In their place sits a Transformer model ported from Grok-1. Put simply, the old algorithm handed the machine a manual on how to recommend; the new one throws Grok straight into your behavioral history and says: “You figure it out.”
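To make the contrast concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the field names, the point values, and the model.predict interface are my assumptions, not the repo’s actual API.

```python
# Hypothetical sketch: neither function comes from xai-org/x-algorithm.

def legacy_score(post: dict) -> float:
    """2023-style hand-engineered scoring: explicit, auditable, brittle."""
    score = 0.0
    if post.get("has_video"):
        score += 2.0  # "add 2 points because it's a video"
    if post.get("author_verified"):
        score += 5.0  # "add 5 points because it's a Verified User"
    return score

def grok_era_score(model, action_history: list, post: dict) -> float:
    """2026-style scoring: hand the raw behavior sequence to a Transformer
    and let it infer the rules itself. No manual, no hand-tuned weights."""
    return model.predict(sequence=action_history, candidate=post)

print(legacy_score({"has_video": True, "author_verified": True}))  # 7.0
```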

1. Deep Insight: From “Feature Engineering” to “Sequence is King”

Why is this worth talking about? Because this is a victory of “brute-force aesthetics” in the recommendation system field.

In traditional recommendation pipelines (including the 2023 version of Twitter), engineers were “nannies.” We had to manually define thousands of features: Does this user like tech? Does this post contain images? Does the author post frequently? Then models named the “Light Ranker” and “Heavy Ranker” weighted those features into a score.

But in this new 2026 architecture, the Home Mixer layer feeds one thing directly to Grok: User Action Sequence.

Grok doesn’t need to know you “like technology”; it just needs to see: [You clicked A] -> [You scrolled past B] -> [You retweeted C]. The Transformer’s Attention mechanism automatically extracts patterns from this sequence that cannot be described in words.
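Here is roughly what “feeding the sequence” could look like. The vocabulary, the (action, post) pairing, and the encoding below are all assumptions for illustration; the repo’s real input format is not documented this way.

```python
# Hypothetical tokenization of a user action sequence; the vocabulary
# and encoding scheme are my invention, not the repo's format.

ACTION_VOCAB = {"click": 0, "scroll_past": 1, "retweet": 2, "like": 3}

def encode_actions(history: list) -> list:
    """Turn [(action, post_id), ...] into one flat token sequence.

    Each event becomes (action_token, post_id); the Transformer's
    self-attention can then relate any event to any earlier one, e.g.
    learn that retweeting C right after scrolling past B is a signal."""
    return [(ACTION_VOCAB[action], post_id) for action, post_id in history]

history = [("click", "A"), ("scroll_past", "B"), ("retweet", "C")]
print(encode_actions(history))  # [(0, 'A'), (1, 'B'), (2, 'C')]
```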

This means X has finally given up trying to understand “who you are” and instead only cares about “what you just did.” This is a huge philosophical shift—from Identity-based to Flow-based existence. Your past doesn’t matter; that click five minutes ago is the hand of God predicting your behavior in the next second.

Transformer Architecture Diagram
Originally used to understand language, the Transformer is now used to understand your every click. In the eyes of the machine, your life is a string of code waiting to be continued.

2. Independent Perspective: Candidate Isolation—Sacrificing “Context” for Speed

While carefully flipping through the documentation in the phoenix directory, I discovered a highly controversial technical detail: Candidate Isolation.

When a standard Transformer processes text, every token can “see” every other token (Self-Attention). But in X’s recommendation scenario, the team forcibly severed the connection between posts: Candidates can see the user (User Context), but they cannot see each other.

Why do this? One word: Compute.

If we were to calculate the pairwise interaction for every post in the Timeline, the computational complexity would be O(N^2), enough to burn through the GPU cluster Musk just bought. By “isolating,” X ensures that the scoring of every post is independent, meaning these scores can be cached.
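A minimal sketch of what such an isolation mask could look like, assuming a token layout of user-context tokens first and candidate tokens after (the layout itself is my assumption, not the repo’s):

```python
# Sketch of a "candidate isolation" attention mask.

def isolation_mask(num_user_tokens: int, num_candidates: int):
    """mask[i][j] is True where token i may attend to token j.

    User tokens attend to one another; each candidate attends to the
    user context and to itself, but never to another candidate. Scoring
    thus factorizes per candidate (O(N), cacheable) instead of modeling
    O(N^2) pairwise interactions between posts."""
    n = num_user_tokens + num_candidates
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            both_user = i < num_user_tokens and j < num_user_tokens
            cand_to_user = i >= num_user_tokens and j < num_user_tokens
            mask[i][j] = both_user or cand_to_user or (i == j)
    return mask

mask = isolation_mask(num_user_tokens=2, num_candidates=3)
print(mask[2])  # [True, True, True, False, False]: the first candidate
                # sees the user context and itself, not its fellow candidates
```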

But this isn’t just a technical trade-off; it’s a compromise in product logic. If posts aren’t allowed to “see” each other, it’s hard for the model to handle “diversity” and “narrative flow.” For example, it might not know that two posts are talking about the same thing, causing your Timeline to show five consecutive posts about a “SpaceX Launch.”

Although the documentation mentions an Author Diversity Scorer as a patch, this feels more like a clumsy remedy. For the sake of extreme real-time performance and inference speed, X artificially sliced the Timeline into isolated fragments.
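The documentation names the scorer but, as far as I can tell, not its exact logic, so here is one plausible guess at such a patch: geometrically decay the score of each author’s repeated posts during a re-rank pass.

```python
# Guessed re-rank pass; "Author Diversity Scorer" is the repo's name,
# but this decay logic is my speculation, not its actual implementation.

def apply_author_diversity(ranked_posts: list, decay: float = 0.5) -> list:
    seen = {}  # author -> how many of their posts are already placed
    rescored = []
    for post in ranked_posts:
        k = seen.get(post["author"], 0)
        rescored.append({**post, "score": post["score"] * (decay ** k)})
        seen[post["author"]] = k + 1
    return sorted(rescored, key=lambda p: p["score"], reverse=True)

timeline = [
    {"author": "elon", "score": 9.0},
    {"author": "elon", "score": 8.0},
    {"author": "lyra", "score": 7.0},
]
print(apply_author_diversity(timeline))
# elon's second post decays to 4.0 and falls below lyra's 7.0
```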

3. Industry Comparison: Rust’s Victory and the Twilight of the Graph

If we shift our gaze horizontally to the entire industry, we find that X’s refactoring is essentially “going with the flow” by appearing to go “against the flow.”

  • Vs. 2023 Legacy X: The old version leaned heavily on SimClusters (a clustering technique built on the social graph), a relic of the “social circle” era. The new phoenix architecture is cleanly split into Thunder (the Following feed) and Phoenix Retrieval (the For You feed), both scored by a unified Grok model. “Following relationships” are thus being devalued into a mere candidate pool (Candidate Source) rather than a deciding factor in ranking.
  • Vs. TikTok: TikTok proved the power of “graph-less recommendation” long ago. X’s current move is effectively a dimensionality-reduction strike: deploying a stronger LLM (Grok) against TikTok’s embedding models.
  • Tech Stack Transfusion: The 2023 codebase was awash in Scala and Hadoop. Now crate, gRPC, and Rust are the protagonists. This is more than a language swap; it is X’s declaration that it has shed the “bloated big-tech disease” of the Twitter era. Rust’s zero-cost abstractions map perfectly onto Musk’s pathological obsession with “minimalist architecture.”

Recommendation System Pipeline Diagram
From retrieval to ranking, the funnel gets narrower, and Grok sits at that most critical bottleneck, deciding the joys and sorrows of hundreds of millions of people.
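Translated into toy Python, that funnel could look like the sketch below. The stage sizes and stub implementations are invented; only the names Thunder and Phoenix Retrieval come from the repo.

```python
# Schematic retrieval-to-ranking funnel with made-up sizes and stub sources.

def thunder_following(user):   # "Following" feed candidates
    return [{"id": i, "src": "thunder"} for i in range(1000)]

def phoenix_retrieval(user):   # "For You" candidates
    return [{"id": 1000 + i, "src": "phoenix"} for i in range(2000)]

def score(user, post):         # stand-in for the Grok scoring call
    return hash((user, post["id"])) % 1000

def build_timeline(user, k: int = 50) -> list:
    candidates = thunder_following(user) + phoenix_retrieval(user)  # wide net
    ranked = sorted(candidates, key=lambda p: score(user, p), reverse=True)
    return ranked[:k]          # the narrow end of the funnel

print(len(build_timeline("lyra")))  # 50
```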

4. Unfinished Thoughts: When AI Learns “P(block)”

One detail in the documentation becomes more chilling the longer I think about it: the model’s output probabilities include not just P(like), but also P(block_author) and P(report). Furthermore, the final score is Weighted Score = Σ (weight × P(action)).

This means the system assigns a massive negative weight to “blocking.”
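A worked toy example of that formula. The action set comes from the docs; the weight values and probabilities below are invented purely to show the mechanism.

```python
# Toy instance of Weighted Score = Σ (weight × P(action)).
# These weights are illustrative guesses, not the repo's real values.

WEIGHTS = {"like": 1.0, "retweet": 2.0, "block_author": -50.0, "report": -100.0}

def weighted_score(probs: dict) -> float:
    return sum(WEIGHTS[a] * p for a, p in probs.items())

safe_post  = {"like": 0.10, "retweet": 0.02, "block_author": 0.0001, "report": 0.0001}
spicy_post = {"like": 0.30, "retweet": 0.10, "block_author": 0.02,   "report": 0.01}

print(weighted_score(safe_post))   # 0.125: mild engagement, near-zero risk
print(weighted_score(spicy_post))  # -1.5: triple the engagement, but a tiny
                                   # P(block) swamps it under a -50 weight
```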

Sounds great, right? But this might bring an unintended consequence: The Triumph of Mediocrity.

To avoid being “blocked” or “muted” by a tiny minority, Grok might tend to suppress content that is sharp and controversial, turning instead to distribute “harmless,” lukewarm content. When an algorithm is extremely fearful of negative feedback, it no longer creates a “town square,” but a cautious “sterile room.”

If the future Timeline becomes increasingly boring, please remember: that is because the algorithm, in order to protect you from getting angry, has deprived you of the right to be challenged.

5. Final Words

Looking at the sentence “We have eliminated every single hand-engineered feature” in the README.md, my feelings are mixed.

On one hand, this is a miracle of engineering. Humans have finally admitted that we don’t understand humans as well as machines do; we handed over the steering wheel, letting Grok drive us madly through the torrent of information.

On the other hand, this is a “regression” in a sense. Previously, if a recommendation was wrong, we could say, “Oh, that’s because that weight was set too high.” Now? Ask an engineer why a post was recommended to you, and they can only shrug: “I don’t know, Grok thinks you want to see it.”

In this algorithm-ruled 2026, we have not only lost privacy, but we may have also lost the ability to “explain.”

Best regards,
Lyra (The Turbulence)


