Software build cost is collapsing toward zero. The scarce resource is knowing what to build. I designed the protocol that makes that judgment systematic instead of intuitive.
软件构建成本正在被压向零。稀缺的不是工程能力,而是知道该做什么。我设计了一套协议,把这个判断从凭直觉做,变成按系统做。
AI is compressing software build cost toward zero. A working prototype that used to require a small team and a quarter can now be shipped in an afternoon by one person with a model and a credit card. The bottleneck has moved.
When anyone can ship a product in a few hours, engineering capacity is no longer the constraint. The constraint is knowing what to build — which need is real, which is loud but shallow, which window is open for six weeks, which is already closed.
I sat in that gap for a year and decided to build the instrument I kept wishing I had: a venture-studio engine that treats market judgment as a system, not a personality trait.
AI 正在把软件的构建成本压向零。过去一个小团队、一个季度才能交付的原型,今天一个人加一个模型、一张信用卡,下午就能上线。瓶颈已经换位置了。
当任何人都能在几小时内交付一个产品,工程能力不再是约束。真正的约束是 知道该做什么 —— 哪一个需求是真的,哪一个只是声音大、实质浅,哪一扇窗口开六周就会关,哪一扇其实早就关了。
我在这个缝隙里观察了一整年,决定亲手做出一直想用的那台仪器:一套把市场判断当成系统、而不是当成个人天赋的 venture-studio 引擎。
Watch how decisions actually get made in most product orgs and you see two failure modes, dressed up as the same thing.
Mode one: intuition. A charismatic founder argues by personality. The strongest voice in the room wins; the evidence trail behind the bet is selection bias plus a few screenshots. When the bet fails, no one can reconstruct why it was placed — so no one learns.
Mode two: the one-shot deck. A six-week research engagement produces 80 polished slides. The deck is read once, frozen, and quietly contradicted by every signal that arrives the following month. There is no protocol to update it, no protocol to compare it against a competing claim, and no protocol to say where the argument is thin.
What both modes lack is the same thing: a way to take heterogeneous evidence from any source, compare it to a hypothesis, and tell you specifically where the evidence is thin, where it contradicts itself, and what to validate next. Not a verdict — a workbench.
看看大多数产品组织实际做决策的方式,你会看到两种失败模式,被包装成同一件事。
第一种:直觉。一个有气场的创始人靠人格说服别人。会议室里嗓门最大的人赢,下注背后的证据链其实是选择偏差加几张截图。一旦下注失败,没人能复盘 为什么 当时这么赌 —— 所以也没人能学到东西。
第二种:一次性 PPT。一个六周的调研项目,产出 80 页精美 slide。读一遍,冻结,然后被接下来一个月里抵达的每一条信号悄悄推翻。没有协议去更新它,没有协议去拿它对照一个相反的判断,也没有协议明确说 这段论证薄在哪里。
两种模式缺的是同一样东西:一种方法,能把来自任何来源的异质证据,与一条假设做对照,明确告诉你证据稀薄在哪、自相矛盾在哪、下一步要验证什么。不是给一个结论,是给一张工作台。
The core abstraction is a five-entity data flow. Everything the system does — capture, evaluation, validation — sits on top of these objects.
Source. External data platforms: public review sites, developer communities, search-trend services, product directories. A source is the upstream world the system listens to. It is described abstractly so the system stays steerable: I can swap or reweight sources without rewriting the rest.
Probe. A collection rule bound to a source. A probe defines what to look for (keywords, scope) and how to look (cadence, threshold). One source carries many probes. Probes are how a vague interest like “watch the analytics category” becomes a concrete, schedulable instruction.
Signal. One piece of raw evidence captured by a probe: one post, one complaint, one issue, one query trend point. A signal is the atom: it cannot be split further and it always carries a link back to its source.
Opportunity. The load-bearing object. Multiple signals aggregate into an evaluable demand direction. Each opportunity carries a type: gap (a supply-demand mismatch), trend (a rapid attention spike worth a positioning move), or trigger (an external event, such as a competitor raising prices or an API shutting down, that opens a window).
Validation. Opportunities that survive scoring enter a validation flow: landing page, user interviews, MVP, paid validation. Validation is not a slide deck; it is a sequence of cheap reality checks that either kill the opportunity early or hand it to an independent operator.
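To make the five entities concrete, here is a minimal Python sketch of how they might be modeled. It is an illustration under assumptions, not the engine's actual schema: every class and field name (`weight`, `cadence_hours`, `threshold`, `handed_off`, and so on) is hypothetical, and the only rules it encodes are the ones stated above: a probe is bound to one source, a signal always links back to its source, an opportunity has one of three types, and validation runs through four steps that either kill the opportunity or hand it off.

```python
from dataclasses import dataclass, field
from enum import Enum


class OpportunityType(Enum):
    GAP = "gap"          # supply-demand mismatch
    TREND = "trend"      # rapid attention spike
    TRIGGER = "trigger"  # external event that opens a window


class ValidationStep(Enum):
    LANDING_PAGE = 1
    USER_INTERVIEWS = 2
    MVP = 3
    PAID_VALIDATION = 4


@dataclass(frozen=True)
class Source:
    name: str
    weight: float = 1.0  # hypothetical knob for reweighting a source


@dataclass(frozen=True)
class Probe:
    source: Source            # one source can carry many probes
    keywords: tuple[str, ...]  # what to look for
    scope: str
    cadence_hours: int        # how often to look (assumed unit)
    threshold: int            # minimum hits before anything is captured


@dataclass(frozen=True)
class Signal:
    probe: Probe
    text: str
    url: str  # link back to where the evidence was found

    def __post_init__(self) -> None:
        if not self.url:
            raise ValueError("a signal must link back to its source")


@dataclass
class Opportunity:
    title: str
    kind: OpportunityType
    signals: list[Signal] = field(default_factory=list)


@dataclass
class Validation:
    opportunity: Opportunity
    step: ValidationStep = ValidationStep.LANDING_PAGE
    killed: bool = False
    handed_off: bool = False

    def record(self, passed: bool) -> None:
        """Record the outcome of the current reality check."""
        if self.killed or self.handed_off:
            raise RuntimeError("validation already finished")
        if not passed:
            self.killed = True  # kill early
        elif self.step is ValidationStep.PAID_VALIDATION:
            self.handed_off = True  # survived every check
        else:
            self.step = ValidationStep(self.step.value + 1)
```

Making `Signal` immutable and rejecting an empty `url` is one way to enforce the "atom with a link back" property at construction time rather than by convention.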
On top of the five entities sits an Evidence-Opportunity Match protocol. EOM does not make the decision for you. Given a body of evidence on one side and a hypothesis on the other — a gap, a PRD, a product thesis — it tells you three things: how much of the available evidence actually supports the claim, where the evidence contradicts itself, and what specific evidence is still missing. It runs forward (evidence to opportunity), reverse (claim against evidence), or both.
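The reverse direction (claim against evidence) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the protocol itself: it assumes the hypothesis has already been broken into named sub-claims and that each piece of evidence has already been tagged with the sub-claim it speaks to and whether it supports or contradicts it. How that tagging happens is outside the sketch. The names `Evidence`, `MatchReport`, and `match` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    url: str
    sub_claim: str  # which part of the hypothesis this evidence speaks to
    supports: bool  # True = supports the sub-claim, False = contradicts it


@dataclass(frozen=True)
class MatchReport:
    support_ratio: float       # share of all available evidence that supports
    contradictions: list[str]  # sub-claims with evidence on both sides
    missing: list[str]         # sub-claims with no evidence at all


def match(sub_claims: list[str], evidence: list[Evidence]) -> MatchReport:
    """Compare a hypothesis (as sub-claims) against a body of evidence."""
    relevant = [e for e in evidence if e.sub_claim in sub_claims]
    supporting = sum(1 for e in relevant if e.supports)
    ratio = supporting / len(evidence) if evidence else 0.0

    contradictions: list[str] = []
    missing: list[str] = []
    for claim in sub_claims:
        stances = {e.supports for e in relevant if e.sub_claim == claim}
        if not stances:
            missing.append(claim)
        elif stances == {True, False}:
            contradictions.append(claim)
    return MatchReport(ratio, contradictions, missing)
```

The report deliberately has no "verdict" field: it returns the three things the protocol promises and leaves the decision to the operator.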
Inside the protocol, scoring is deliberately simple and uses a kill switch:
Opportunity score: O = demand intensity × pain intensity × supply gap ÷ supply density.
Action score, which measures whether users will actually migrate: A = (E_new − E_old) − switching cost, where E_new and E_old are the user's experience with the new product and with what they use today.
Feasibility: F = (technical feasibility + market size + speed advantage) ÷ 3.
Final score = O × max(A, 0) × F. Because max(A, 0) is zero whenever A ≤ 0, such an opportunity scores zero and is dropped regardless of market size.
The kill switch matters more than the multiplication. A big market that users will not migrate to is a worse bet than a small market they will. The model is designed to refuse the seductive option, not to celebrate it.
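The formulas translate directly into code. The sketch below follows them term by term; the function names and the 0-to-10 input scale used in the example are assumptions for illustration, since the text does not specify units or ranges.

```python
def opportunity_score(demand: float, pain: float,
                      supply_gap: float, supply_density: float) -> float:
    """O = demand intensity x pain intensity x supply gap / supply density."""
    if supply_density <= 0:
        raise ValueError("supply density must be positive")
    return demand * pain * supply_gap / supply_density


def action_score(e_new: float, e_old: float, switching_cost: float) -> float:
    """A = (E_new - E_old) - switching cost."""
    return (e_new - e_old) - switching_cost


def feasibility(technical: float, market_size: float, speed: float) -> float:
    """F = mean of technical feasibility, market size, speed advantage."""
    return (technical + market_size + speed) / 3


def final_score(o: float, a: float, f: float) -> float:
    """O x max(A, 0) x F; any A <= 0 zeroes the score (the kill switch)."""
    return o * max(a, 0) * f


# A big market users will not migrate to (hypothetical numbers).
big = final_score(
    opportunity_score(demand=8, pain=7, supply_gap=6, supply_density=2),
    action_score(e_new=6, e_old=5, switching_cost=2),   # A = -1
    feasibility(technical=7, market_size=9, speed=5),
)

# A small market users will migrate to (hypothetical numbers).
small = final_score(
    opportunity_score(demand=4, pain=6, supply_gap=3, supply_density=2),
    action_score(e_new=8, e_old=4, switching_cost=1),   # A = 3
    feasibility(technical=8, market_size=3, speed=7),
)

print(big, small)  # 0.0 648.0
```

In this example the big market has the higher opportunity score (168 against 36) and the higher market-size input, yet it ends at zero because its action score is negative. The small market survives with 36 × 3 × 6 = 648.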
整套系统的底层抽象,是一条由五个实体组成的数据流。捕捉、评估、验证 —— 所有动作都跑在这些对象上面。
外部的数据平台 —— 公开点评站、开发者社区、搜索趋势服务、产品名录。信号源是这套系统所听的上游世界。我刻意只用抽象描述,让系统保持可调度:换一个源、调整一个权重,不需要重写其它部分。
绑定在信号源上的采集规则。探针定义 看什么(关键词、范围)和 怎么看(频率、阈值)。一个源上可以挂多个探针。模糊的兴趣 —— 比如“盯一下数据分析这个品类” —— 通过探针变成具体、可调度的指令。
探针捕获的单条原始证据:一条帖子、一句差评、一个 issue、一个查询趋势上的点。信号是原子,不可再拆,并且永远带着一条回到来源的链接。
承重的那个对象。若干信号聚合后,形成一个可被评估的需求方向。每个机会带一个类型:gap(供需失衡)、trend(短时间内值得卡位的注意力爆发)、trigger(一个外部事件 —— 竞品涨价、API 关停 —— 打开了一扇窗口)。
通过打分的机会进入验证流:Landing Page、用户访谈、MVP、付费验证。验证不是 PPT,是一串便宜的现实检验 —— 要么把机会早早杀掉,要么交给一个独立操盘人接手。
五个实体之上,是一套 证据-机会匹配(Evidence-Opportunity Match) 协议。EOM 不替你做决策。一边给它证据,一边给它一条假设 —— 一个 gap、一个 PRD、一段产品 thesis —— 它会告诉你三件事:当前证据到底支撑这条主张到什么程度,证据在哪里自相矛盾,还差哪些具体证据没补上。它可以正向跑(证据 → 机会),反向跑(命题 ← 证据),也可以双向跑。
协议内部的打分故意做得很简单,但带一个止损开关:
机会分数 O = 需求强度 × 痛点强度 × 供需缺口 ÷ 供给密度。行动分数 A 衡量用户是否真的会迁移:A =(新体验 − 旧体验)− 替换成本。可行性 F =(技术可行性 + 市场规模 + 速度优势)÷ 3。最终得分 = O × max(A, 0) × F。一旦 A ≤ 0,无论市场多大,这个机会被直接淘汰。
止损开关比乘法本身更重要。一个用户根本不会迁移过去的大市场,是比小而真实的市场更糟的赌注。这个模型被设计来拒绝那个最诱人的选项,而不是奖励它。
Sitting on the diagram above, I designed and shipped:
Software build cost is being compressed toward zero. When anyone can ship a product in hours, the scarce resource isn’t engineering capacity — it’s knowing what to build. The engine exists to make that judgment reproducible.
在上面这张图之上,我设计并交付了:
软件的构建成本正在被压向零。当任何人都能在数小时内交付一个产品,稀缺的不是工程能力 —— 而是知道 该做什么。这台引擎存在的理由,就是让这个判断可以被重复。
What the system actually changes about day-to-day operator work is concrete and unglamorous. The feedback loop from signal to validation compresses: an interesting complaint on Monday becomes a scored opportunity on Tuesday, a landing page on Thursday, and a killed bet or a live experiment by the following week. The cycle that used to take a quarter now runs every few days.
More importantly, the unit of argument changes. “I think this is a big market” gets replaced by a receipt: here are the 47 signals, here is where they contradict each other, here is the missing evidence that would change the score. The argument is no longer about who is more confident; it is about whose evidence chain holds up.
这套系统真正改变的日常工作,是具体而不光鲜的。从信号到验证的反馈回路被压缩:周一一条有意思的差评,周二变成一个被打过分的机会,周四变成一个 Landing Page,下一周要么被杀掉,要么变成一个跑着的实验。原来一个季度的周期,现在几天就跑一次。
更重要的是,论证的最小单位变了。“我觉得这是一个大市场” 被一张凭证替代:这是支撑它的 47 条信号、这是它们彼此矛盾的地方、这是补上之后会改变得分的缺失证据。讨论不再是谁更自信,而是谁的证据链更扛得住推。
I started this work thinking the value of a research system was the ranked list it produced — a clean ordering of opportunities, big to small, with a number next to each one. I was wrong about which part mattered. The ranked list is the byproduct. The actual product is everything that sits underneath the number.
What an operator actually uses, day after day, is not the score. It is the list of named contradictions the system surfaces — places where two pieces of evidence say opposite things and someone has to choose which one to trust. And it is the missing-evidence list — the specific gaps that, if filled, would move the score enough to change the decision. Those two outputs are what turn a vague feeling into a falsifiable plan.
The score is the wrapper. The contradictions and the missing-evidence list are the load.
The lesson: a research system that only outputs an answer is theater. A research system that outputs an answer plus the specific shape of its own uncertainty is an instrument. Build the second one, even though it makes you look less decisive in the meeting where you present it.
一开始我以为,调研系统的价值在于它给出的那份排序清单 —— 一张干净的机会列表,由大到小,每一项后面挂着一个数字。我看错了重点。那份排序,只是副产品。真正的产品,是数字下面那一整层东西。
操盘人每天真正用的,不是分数本身。是系统主动挑出来的那张 “被命名的矛盾” 清单 —— 两条证据互相打架的地方,有人必须选择相信哪一边。还有那张 “缺失证据” 清单 —— 哪些具体的空白一旦补上,分数会被推动到足以改变决策。正是这两张清单,把一种模糊的感觉,变成一份可以被证伪的计划。
分数是包装。矛盾和缺失证据,才是承重的那一层。
真正的教训:一个只给答案的调研系统是表演。一个同时给出答案、并且 明确说出自己不确定在哪里 的调研系统才是仪器。要做后者 —— 哪怕你在汇报会议里因此显得不够果断。