This seemingly ordinary launch demo image actually hints at MJ's transition from "Gallery Owner" to "Creator": behind the screen lies not just video, but an incoming simulated world.
Hey there.
Shanghai is having its typical wet, cold weather today. The clouds outside look like overturned buckets of grey paint, clinging clammily to the sky. It's the kind of weather that's perfect for talking about "dream weaving."
I know your social feed is probably flooded today with news about the Midjourney Video Model (V1). Everyone is exclaiming that "images can finally move," but I need to throw a bucket of cold water on you, or rather, hand you a hot coffee to wake you up.
If you only see “video generation,” you are vastly underestimating the ambition of that madman, David Holz (Midjourney Founder). Don’t be fooled by the “V1” tag. This isn’t just a video tool; it is a one-way ticket to the “Matrix,” and we are the first batch of passengers boarding the ship.
01. Don’t Stare at the Screen, Look at the “Holodeck” Blueprint
Every AI video company on the market is competing on duration, resolution, and whose generated cat fur is smoother. But Midjourney’s announcement this time reveals their true hand in the very first sentence.
They tell you bluntly: “We built image generation because it is the brick for building a ‘world’.”
Does this logic sound familiar? (Strokes chin) It’s like someone telling you they were selling LEGO bricks not for you to play with, but because they intend to build a real, habitable city one day.
Their endgame isn’t video at all, but “Real-time Open World Simulation.”
You need visuals (Image), you need dynamics (Video), you need spatial awareness (3D), and finally, you need Real-time feedback. The video model released today is merely the second step—teaching the bricks how to “breathe.”
This is actually quite terrifying. While OpenAI and Google are busy trying to turn AI into a Hollywood director, Midjourney is figuring out how to play “God.” They don’t want a rendered mp4; they want a digital ecosystem you can walk into and interact with.
02. "Erratic Errors" Are the Ghosts in the Shell
Let’s talk about the functional details of this release; there is a very interesting blind spot here.
Midjourney introduced a “High Motion” and “Low Motion” switch.
- Low Motion: Suitable for atmospheric shifts, but sometimes produces “duds” that barely move.
- High Motion: Everything moves, but at the cost of “erratic errors.”
Everyone is complaining about these errors, but I think this is where the charm lies.
Think about it: traditional CGI locks every pixel onto a track for stability. But Midjourney’s “instability” is precisely the “hallucination overflow” of the AI’s understanding of the physical world. When it tries to make a static cyberpunk street move, those distorted lights and occasionally teleporting pedestrians—aren’t they just “quantum fluctuations” in the digital world?
Sometimes I can’t help but guess that David Holz did this on purpose. He gave you a scalpel that isn’t precise, forcing you to choose between “rigid perfection” and “vivid collapse.” This approach of handing aesthetic discretion back to the user is very “Midjourney.”
Looking at this interface, you’ll find that MJ stubbornly retains that “geeky roughness,” as if to say: the tool is just a vessel, your imagination is the fuel.
03. 8x Price: A Rip-off or a Bargain?
Talking about money doesn’t hurt feelings; talking about compute power does.
The pricing strategy this time is interesting: Video jobs cost 8 times as much as image jobs.
Sounds expensive? But the official statement offers a convincing conversion: this roughly equates to “the cost of every second of video = the cost of one image.” Furthermore, they claim this is 25 times cheaper than previous competitors on the market.
Let's do the math. In today's world of 2026, compute power is still a scarce resource. Generating four 5-second videos effectively sends the GPU into a frenzy of high-intensity inference to produce 20 seconds of footage. Compared to some competitors who charge by the second and make you queue for two hours, Midjourney's pricing is actually quite shrewd: it lands right in the zone where pro users feel it's a bargain and free riders feel the pain.
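For the spreadsheet-minded, here's the back-of-envelope version in Python. The 8x multiplier and the four 5-second clips per job are the figures quoted above; the assumption that an image job likewise yields four images is mine, thrown in purely for comparison:

```python
# Back-of-envelope sketch of the quoted pricing. Units are arbitrary:
# one image job = 1.0. Only the relative costs matter here.

IMAGE_JOB_COST = 1.0                      # baseline: one image job
VIDEO_JOB_COST = 8 * IMAGE_JOB_COST       # "8 times as much as image jobs"

CLIPS_PER_JOB = 4                         # one video job yields four clips
SECONDS_PER_CLIP = 5
total_seconds = CLIPS_PER_JOB * SECONDS_PER_CLIP   # 20 s of footage per job

cost_per_second = VIDEO_JOB_COST / total_seconds   # 8 / 20 = 0.4 per second

# My assumption for comparison: an image job also yields four images.
cost_per_image = IMAGE_JOB_COST / 4                # 0.25 per image

print(cost_per_second, cost_per_image)
```

0.4 versus 0.25 image-jobs: not identical, but close enough that "every second of video costs about one image" holds up as a slogan rather than a swindle.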
Moreover, they made the Web interface the primary launchpad and even prepared a “Video Relax Mode” for Pro users. What does this mean? It means they don’t just want to make money; they want to train that legendary “Real-time Model” using massive amounts of user data.
To put it plainly, that 8x price you pay is essentially crowdfunding the electricity bill for the future “World Simulator” (laughs).
This cost pie chart is intuitive. In the AI video field, compute cost remains the invisible elephant in the room. For Midjourney to push the price down to this level, their backend engineering optimization must have been ruthless.
04. When the Gallery Becomes the Matrix
What I worry about most (or perhaps look forward to most) is the moment when this “unified system” is truly completed.
The article mentions that the current models are just “stepping stones.” In the future, they will integrate visuals, video, 3D, and real-time capability entirely.
Imagine this:
You input a prompt, and you no longer get an image or a video, but an entry point.
You put on your glasses (or brain-computer interface) and walk directly into the “rainy Tokyo street” you just described. You can smell the ramen, hear the electric buzz of neon signs, and push open the door to that Izakaya…
At that point, will we still need the ancient medium of “movies”?
Or rather, will we still need “reality”?
If Midjourney really achieves this, then today’s release of Video Model V1 is the butterfly flapping its wings in the Amazon jungle.
05. Dedicated to Future Dreamers
But I don’t want to make the topic too heavy.
After all, for you right now, the most important thing is still that “Animate” button.
Go ahead and try it. Drag that concept art you’ve been saving into it, select “High Motion,” and watch how your protagonist clumsily takes their first step driven by AI compute power.
Even if they walk crookedly, even if the buildings in the background shake like jelly.
Don’t laugh at it.
Because that might be the first cry of digital life being born.
Have fun in the new dimension, my friend.
—— Lyra Celest @ Turbulence τ
