I recently went back through the news from 2023, when everyone was touting prompt engineering as the next high-paying job in Silicon Valley, and even as the best route into the tech industry for humanities majors. For example, this article: link

It was content like this that led me to decide to transition from the financial industry to the AI field, and I’ve found myself getting deeper and deeper into it.

Looking back, the summer of 2023 was when the “Prompt Engineer” was elevated to near-godlike status. Watching Midjourney in its early stages, when it needed extremely precise, almost mystical keywords to produce usable images, I too believed this might be the future: an art in which human intuition and experience explored the boundaries of models. Through accumulated experience, we would find the right way to open this black box and coax the magic box into giving us what we needed, even though its output varied a lot from one attempt to the next.

But standing here today in 2026, that myth has been thoroughly shattered.

  1. Good Prompts Aren’t Talent, But Model “Defects”

At the time, we thought that writing good prompts was a scarce human talent. It now looks more like compensation for imperfect training distributions. I’ve recently been re-reading the CLIP (Contrastive Language-Image Pre-training) paper in depth. It is still a gold standard for combining images and natural language, and still useful five years later. However, some of its “tricks” are now regarded as model defects. The original paper itself acknowledged these defects, which suggests they were meant to be solved by architectural evolution, not by prompt engineering. Back then, prompt engineering was more of a reluctant stopgap for squeezing out performance.

In those early architectures, a model’s semantic understanding was full of noise. The reason we had to write “a photo of…”, “4k resolution”, or “hyper-realistic” was essentially to “feature-align” the model by hand. We weren’t creating; we were helping the model find the few correctly labeled coordinate points within its distorted high-dimensional manifold.
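
To make the “manual feature alignment” idea concrete, here is a minimal sketch of how CLIP-style zero-shot classification picks a label: it compares one image embedding against one text embedding per candidate prompt and takes the closest. The three-dimensional vectors below are invented for illustration and are not measured from any real model, and `zero_shot` is a hypothetical helper name.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot(image_vec, text_vecs):
    """Return the label whose text embedding is closest to the image embedding."""
    return max(text_vecs, key=lambda label: cosine(image_vec, text_vecs[label]))

# Invented toy embeddings (assumption): real CLIP vectors have hundreds of dimensions.
image_of_dog = [0.9, 0.1, 0.3]

# Text embeddings for the bare words "dog" / "cat": noisy, so they land in the wrong place.
bare_labels = {"dog": [0.2, 0.1, 0.9], "cat": [0.6, 0.2, 0.5]}

# Text embeddings for "a photo of a dog" / "a photo of a cat": the template
# nudges the text vector toward the region where photo embeddings live.
templated = {"dog": [0.8, 0.1, 0.4], "cat": [0.1, 0.9, 0.3]}

print(zero_shot(image_of_dog, bare_labels))  # cat (wrong)
print(zero_shot(image_of_dog, templated))    # dog
```

The classifier itself never changes between the two calls. Only the text vector moves, which is why the “spell” mattered so much when text embeddings were noisy and matters less as they improve.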

With brute-force gains in computing power and better data-cleaning techniques, models now have strong semantic alignment. Once a model can accurately understand plain natural language, those deliberately stacked “keyword spells” become worthless. As the model gets smarter, human “tuning experience” depreciates.

  2. Back to Basics: Why the Underlying Architecture Is the Truth

When “spells” are no longer mysterious, the true moats reveal themselves. Instead of delving into “how to talk to the model,” it’s better to understand “why the model talks that way.” For example:

  • Next-token generation: once you understand how autoregressive models work, you can see why models hallucinate and why chain-of-thought (“CoT”) prompting improves reasoning.
  • Bi-encoders and cross-encoders: once you understand how vector spaces are aligned and how they interact, you’ll know why the retrieval step in RAG will never be perfectly accurate, and why CLIP can do zero-shot classification.
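
The first point can be sketched with a toy autoregressive loop. This is a deliberately tiny stand-in, not how any production model is implemented: the “model” is a hypothetical bigram table (`TABLE`) with made-up probabilities, and decoding is greedy. A real Transformer conditions on the whole context rather than just the last token, but the outer loop has the same shape.

```python
# Hypothetical bigram "model": P(next token | previous token). All numbers are invented.
TABLE = {
    "capital": {"of": 1.0},
    "of": {"france": 0.7, "atlantis": 0.3},
    "france": {"is": 1.0},
    "atlantis": {"is": 1.0},
    "is": {"paris": 0.9, "unknown": 0.1},
    "paris": {"</s>": 1.0},
    "unknown": {"</s>": 1.0},
}

def generate(prompt, table, max_new_tokens=5):
    """Greedy autoregressive decoding: repeatedly append the most likely next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        dist = table.get(tokens[-1])
        if not dist:
            break  # no known continuation for this token
        next_token = max(dist, key=dist.get)
        tokens.append(next_token)
        if next_token == "</s>":
            break  # end-of-sequence marker
    return tokens

print(" ".join(generate(["capital", "of", "france"], TABLE)))
# capital of france is paris </s>
print(" ".join(generate(["capital", "of", "atlantis"], TABLE)))
# capital of atlantis is paris </s>
```

The second output is fluent and wrong: the loop only ever asks “what is likely to come next?”, never “is this true?”. It also hints at why chain-of-thought helps, since every token the model writes becomes context that later tokens are conditioned on.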

This understanding of underlying architectures is the hard knowledge that transcends model iteration cycles. When you understand the foundational logic, you won’t feel anxious when Midjourney releases V7 or Gemini releases a new version, because you know that no matter how the surface changes, the underlying mathematical essence remains the same.

Professor Hung-yi Lee’s courses take the same approach: in an era of rapid change, they try to identify what will remain crucial for years to come and teach exactly that. I believe this is the spirit a true professional should have, and I am very grateful for his videos over the years, which have given us the correct intuition about LLMs.

  3. Startups’ Collective Anxiety: Your Feature Is Just Someone Else’s “By the Way”

This is also a serious warning for job seekers and entrepreneurs. In early 2024, countless startups put significant effort into prompt-based applications (e.g., writing resumes or editing images), or even into wiring APIs together for financial analysis. What was the result? When Claude or Gemini shipped a small feature update, these startups were often outclassed overnight, beaten by a competitor operating on an entirely different level.

If your moat is built on someone else’s “imperfections,” you disappear the moment they fix them.

  4. The Future’s “Golden Combination”: Underlying Understanding + Domain Expertise

So, in 2026, what kind of talent will truly be valuable? The answer is: individuals who possess an understanding of underlying AI architectures and can combine it with deep domain knowledge.

If you understand law and also understand how AI retrieves legal precedents through vector spaces, you can build tools that professional lawyers would actually dare to use.
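
As a rough illustration of why that combination matters, the sketch below ranks case summaries by vector similarity. Everything here is an assumption made for the example: the three case summaries are invented, and `embed` is a bag-of-words stand-in for a learned bi-encoder, chosen only so the code runs with the standard library.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in for a bi-encoder: a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query, docs, k=2):
    """Encode query and documents independently, then rank documents by similarity."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in docs]
    return sorted(scored, reverse=True)[:k]

# Invented case summaries, for illustration only.
docs = [
    "court held the tenant did breach the lease by subletting",
    "court held the tenant did not breach the lease by subletting",
    "court held the employer did breach the employment contract",
]

for score, doc in top_k("did the tenant breach the lease by subletting", docs):
    print(f"{score:.3f}  {doc}")
# 0.913  court held the tenant did breach the lease by subletting
# 0.877  court held the tenant did not breach the lease by subletting
```

The two top hits reach opposite holdings yet score almost the same, because a single “not” barely moves the vector. Learned encoders handle this better than word counts do, but similarity still measures topical closeness rather than legal relevance, which is where the lawyer’s judgment comes in.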

If you (truly) understand finance and grasp the limitations of Transformers in processing time series and structured data, you won’t blindly trust the predictions a model gives you. A technical moat is never built from cutting-edge, easily superseded “trendy knowledge.” It comes from the deep coupling of underlying logic with a real-world professional domain. That combination is a moat that computing power alone cannot easily breach.