WZ — Case · 02
Filed · 2026 / Q2
SF Bay Area · 37.77°N
v.01 · current
Pillar 01 — Market judgment

I designed a venture-studio engine that turns scattered market signals into ranked, auditable opportunities.

Software build cost is collapsing toward zero. The scarce resource is knowing what to build. I designed the protocol that makes that judgment systematic instead of intuitive.

01 CONTEXT

When build cost collapses, judgment becomes the scarce resource.

AI is compressing software build cost toward zero. A working prototype that used to require a small team and a quarter can now be shipped in an afternoon by one person with a model and a credit card. The bottleneck has moved.

When anyone can ship a product in a few hours, engineering capacity is no longer the constraint. The constraint is knowing what to build — which need is real, which is loud but shallow, which window is open for six weeks, which is already closed.

I sat in that gap for a year and decided to build the instrument I kept wishing I had: a venture-studio engine that treats market judgment as a system, not a personality trait.

02 PROBLEM

Most market research is either intuition or a one-shot deck.

Watch how decisions actually get made in most product orgs and you see two failure modes, dressed up as the same thing.

Mode one: intuition. A charismatic founder argues by personality. The strongest voice in the room wins; the evidence trail behind the bet is selection bias plus a few screenshots. When the bet fails, no one can reconstruct why it was placed — so no one learns.

Mode two: the one-shot deck. A six-week research engagement produces 80 polished slides. The deck is read once, frozen, and quietly contradicted by every signal that arrives the following month. There is no protocol to update it, no protocol to compare it against a competing claim, no protocol to say where this argument is thin.

What both modes lack is the same thing: a way to take heterogeneous evidence from any source, compare it to a hypothesis, and tell you specifically where the evidence is thin, where it contradicts itself, and what to validate next. Not a verdict — a workbench.

03 APPROACH

Five entities, one matching protocol, one scoring rule with a kill switch.

The core abstraction is a five-entity data flow. Everything the system does — capture, evaluation, validation — sits on top of these objects.

Source

External data platforms — public review sites, developer communities, search-trend services, product directories. A source is the upstream world the system listens to. It is described abstractly so the system stays steerable: I can swap or weight sources without rewriting the rest.

Probe

A collection rule bound to a source. A probe defines what to look for (keywords, scope) and how to look (cadence, threshold). One source carries many probes. Probes are how a vague interest like “watch the analytics category” becomes a concrete, schedulable instruction.

Signal

One piece of raw evidence captured by a probe — one post, one complaint, one issue, one query trend point. A signal is the atom: it cannot be split further and it always carries a link back to its source.

Opportunity

The load-bearing object. Multiple signals aggregate into an evaluable demand direction. Each opportunity carries a type: gap (supply-demand mismatch), trend (rapid attention spike worth a positioning move), or trigger (an external event — a competitor raising prices, an API shutting down — opening a window).

Validation

Opportunities that survive scoring enter a validation flow: landing page, user interviews, MVP, paid validation. Validation is not a slide deck; it is a sequence of cheap reality checks that either kill the opportunity early or hand it to an independent operator.
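The five entities above can be sketched as plain data types. This is a minimal illustration and not the engine's actual schema: every field name, the `weight` knob on `Source`, and the example URL are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class OpportunityType(Enum):
    GAP = "gap"          # supply-demand mismatch
    TREND = "trend"      # rapid attention spike
    TRIGGER = "trigger"  # external event opening a window


@dataclass(frozen=True)
class Source:
    name: str            # e.g. a public review site
    weight: float = 1.0  # hypothetical knob for re-weighting a source


@dataclass(frozen=True)
class Probe:
    source: Source
    keywords: tuple[str, ...]  # what to look for
    cadence_hours: int         # how often to look
    threshold: int             # minimum activity before a hit counts


@dataclass(frozen=True)
class Signal:
    probe: Probe
    text: str
    url: str  # every signal keeps a link back to where it came from


@dataclass
class Opportunity:
    title: str
    kind: OpportunityType
    signals: list[Signal] = field(default_factory=list)

    def sources(self) -> set[str]:
        """Names of the sources that contributed evidence."""
        return {s.probe.source.name for s in self.signals}


@dataclass
class Validation:
    opportunity: Opportunity
    steps: tuple[str, ...] = ("landing page", "user interviews", "MVP", "paid validation")
    killed_at: Optional[str] = None  # step at which the bet was dropped, if any


# Usage: one source, one probe, one signal, aggregated into a gap opportunity.
reviews = Source("public review site")
probe = Probe(reviews, keywords=("export", "csv"), cadence_hours=24, threshold=3)
sig = Signal(probe, "Export to CSV keeps timing out", "https://example.com/review/1")
opp = Opportunity("Reliable analytics export", OpportunityType.GAP, [sig])
check = Validation(opp)
```

Keeping `Source`, `Probe`, and `Signal` immutable is one way to make the link from an opportunity back to its raw evidence hard to break by accident.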

The protocol on top — EOM

On top of the five entities sits an Evidence-Opportunity Match protocol. EOM does not make the decision for you. Given a body of evidence on one side and a hypothesis on the other — a gap, a PRD, a product thesis — it tells you three things: how much of the available evidence actually supports the claim, where the evidence contradicts itself, and what specific evidence is still missing. It runs forward (evidence to opportunity), reverse (claim against evidence), or both.
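A toy version of the reverse direction (claim against evidence) might look like the sketch below. It assumes each piece of evidence has already been labeled with the aspect of the claim it addresses and whether it supports or contradicts it; that labeling step, and all names here (`Evidence`, `MatchReport`, `match`), are hypothetical and not taken from the real protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    url: str
    aspect: str     # which part of the claim this evidence speaks to
    supports: bool  # True = supports the claim, False = contradicts it


@dataclass(frozen=True)
class MatchReport:
    support_level: float                   # share of relevant evidence that supports the claim
    contradictions: dict[str, list[str]]   # aspect -> URLs of the evidence in conflict
    missing: list[str]                     # aspects with no evidence at all


def match(required_aspects: list[str], evidence: list[Evidence]) -> MatchReport:
    """Stateless match: the report depends only on the arguments."""
    relevant = [e for e in evidence if e.aspect in required_aspects]
    supporting = [e for e in relevant if e.supports]
    support_level = len(supporting) / len(relevant) if relevant else 0.0

    contradictions: dict[str, list[str]] = {}
    for aspect in required_aspects:
        on_aspect = [e for e in relevant if e.aspect == aspect]
        # A contradiction means both stances are present for the same aspect.
        if {e.supports for e in on_aspect} == {True, False}:
            contradictions[aspect] = [e.url for e in on_aspect]

    covered = {e.aspect for e in relevant}
    missing = [a for a in required_aspects if a not in covered]
    return MatchReport(support_level, contradictions, missing)


# Usage: a claim that needs evidence on three aspects.
evidence = [
    Evidence("https://example.com/a", "pain", True),
    Evidence("https://example.com/b", "pain", False),
    Evidence("https://example.com/c", "demand", True),
]
report = match(["demand", "pain", "willingness to pay"], evidence)
print(report.contradictions)  # the two "pain" items disagree
print(report.missing)         # nothing yet speaks to willingness to pay
```

The report deliberately contains no verdict field: it returns the three outputs the protocol promises and leaves the decision to the operator.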

The scoring rule

Inside the protocol, scoring is deliberately simple and uses a kill switch:

  • Opportunity score: O = demand intensity × pain intensity × supply gap ÷ supply density.
  • Action score: A = (E_new − E_old) − switching cost, where E_new and E_old are the user's experience with the new and the old solution. A measures whether users will actually migrate.
  • Feasibility: F = (technical feasibility + market size + speed advantage) ÷ 3.
  • Final score = O × max(A, 0) × F. If A ≤ 0, the opportunity is dropped regardless of market size.

The kill switch matters more than the multiplication. A big market that users will not migrate to is a worse bet than a small market they will. The model is designed to refuse the seductive option, not to celebrate it.
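The scoring rule translates almost directly into code. The sketch below follows the formulas as stated; the input scales (roughly 1 to 10) and the two example opportunities are invented for illustration, since the source does not specify how each factor is measured.

```python
from typing import Optional


def opportunity_score(demand: float, pain: float, supply_gap: float, supply_density: float) -> float:
    """O = demand intensity x pain intensity x supply gap / supply density."""
    if supply_density <= 0:
        raise ValueError("supply density must be positive")
    return demand * pain * supply_gap / supply_density


def action_score(e_new: float, e_old: float, switching_cost: float) -> float:
    """A = (E_new - E_old) - switching cost."""
    return (e_new - e_old) - switching_cost


def feasibility(technical: float, market_size: float, speed: float) -> float:
    """F = mean of technical feasibility, market size, and speed advantage."""
    return (technical + market_size + speed) / 3


def final_score(o: float, a: float, f: float) -> Optional[float]:
    """Return O x max(A, 0) x F, or None when the kill switch fires (A <= 0)."""
    if a <= 0:
        return None  # dropped regardless of market size
    return o * a * f  # max(a, 0) equals a on this branch


# A large market that users would not migrate to (illustrative numbers).
big = final_score(
    opportunity_score(demand=8, pain=7, supply_gap=6, supply_density=4),
    action_score(e_new=6, e_old=7, switching_cost=2),
    feasibility(technical=7, market_size=10, speed=4),
)

# A smaller market with a clear migration incentive (illustrative numbers).
small = final_score(
    opportunity_score(demand=4, pain=6, supply_gap=5, supply_density=2),
    action_score(e_new=9, e_old=4, switching_cost=2),
    feasibility(technical=8, market_size=3, speed=7),
)

print(big)    # None
print(small)  # 1080.0
```

Returning `None` instead of `0.0` for a dropped opportunity is a design choice of this sketch: it keeps "killed" distinguishable from "scored very low" in any ranking built on top.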

04 WHAT I BUILT

A signal-to-validation pipeline with auditable scoring at every step.

[Figure: five-entity data flow. Source (external data platforms) → Probe (collection rule) → Signal (single piece of evidence) → Opportunity (gap · trend · trigger; scored · ranked) → Validation (landing · interview · MVP · paid). Accent = load-bearing object.]
FIG. 01 · core data model

On top of the data model above, I designed and shipped:

  • Five-entity data model — source / probe / signal / opportunity / validation. One contract, used by every other component.
  • EOM evidence-opportunity match protocol — stateless matching kernel; outputs support level, named contradictions, and a missing-evidence list.
  • Three-stage pipeline — signal capture, automated evaluation, validation handoff. Each stage has an explicit input contract and an explicit kill condition.
  • Auditable evidence chain — every score links back to the source signals that produced it and to the named contradictions that fight it. No score floats free of its receipts.
  • Agent-first control plane — capabilities exposed as service, API, and CLI before any UI. The human interface is a projection of the same surface an agent uses, not a parallel codebase.

Software build cost is being compressed toward zero. When anyone can ship a product in hours, the scarce resource isn’t engineering capacity — it’s knowing what to build. The engine exists to make that judgment reproducible.
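The three-stage pipeline and its audit trail from the list above can be sketched as a chain of stages, each of which either passes a candidate on or kills it with a stated reason. The stage bodies, the two-signal threshold, and all names below are assumptions for illustration; the real input contracts and kill conditions are not described in this write-up.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Candidate:
    title: str
    signal_urls: list[str]   # receipts: links back to the source signals
    action_score: float      # A from the scoring rule
    audit: list[str] = field(default_factory=list)


# A stage returns None to pass the candidate on, or a reason string to kill it.
Stage = Callable[[Candidate], Optional[str]]


def capture(c: Candidate) -> Optional[str]:
    # Hypothetical kill condition: not enough independent signals.
    return None if len(c.signal_urls) >= 2 else "too few signals"


def evaluate(c: Candidate) -> Optional[str]:
    # Kill switch from the scoring rule: A <= 0 drops the opportunity.
    return None if c.action_score > 0 else "A <= 0: users will not migrate"


def validate(c: Candidate) -> Optional[str]:
    # Placeholder: a real handoff would run landing page, interviews, MVP, paid tests.
    return None


PIPELINE: list[tuple[str, Stage]] = [
    ("capture", capture),
    ("evaluate", evaluate),
    ("validate", validate),
]


def run(c: Candidate) -> bool:
    """Run every stage in order; record each outcome so the result is auditable."""
    for name, stage in PIPELINE:
        reason = stage(c)
        if reason is not None:
            c.audit.append(f"{name}: killed ({reason})")
            return False
        c.audit.append(f"{name}: passed on {len(c.signal_urls)} signals")
    return True


# Usage: a candidate with enough signals but no migration incentive.
bet = Candidate("Cheaper dashboards", ["https://example.com/1", "https://example.com/2"], -1.0)
survived = run(bet)
print(survived)   # False
print(bet.audit)  # one line per stage that ran, ending with the kill reason
```

Because every stage writes to the same `audit` list, a killed bet carries the reason it died, which is the property that lets a failed decision be reconstructed later.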

05 OUTCOME

A working judgment protocol, not a one-shot deck.

5 Entities in the core data model
3 Pipeline stages · capture / evaluate / validate
4 Opportunity scoring dimensions · O × A × F + drop rule
1 Auditable judgment protocol

What the system actually changes about day-to-day operator work is concrete and unglamorous. The feedback loop from signal to validation compresses: an interesting complaint on Monday becomes a scored opportunity on Tuesday, a landing page on Thursday, and a killed bet or a live experiment by the following week. The cycle that used to take a quarter now runs every few days.

More importantly, the unit of argument changes. “I think this is a big market” gets replaced by a receipt: here are the 47 signals, here is where they contradict each other, here is the missing evidence that would change the score. The argument is no longer about who is more confident; it is about whose evidence chain holds up.

06 WHAT IT TAUGHT ME

The score is not the product. The contradictions are.

I started this work thinking the value of a research system was the ranked list it produced — a clean ordering of opportunities, big to small, with a number next to each one. I was wrong about which part mattered. The ranked list is the byproduct. The actual product is everything that sits underneath the number.

What an operator actually uses, day after day, is not the score. It is the list of named contradictions the system surfaces — places where two pieces of evidence say opposite things and someone has to choose which one to trust. And it is the missing-evidence list — the specific gaps that, if filled, would move the score enough to change the decision. Those two outputs are what turn a vague feeling into a falsifiable plan.

The score is the wrapper. The contradictions and the missing-evidence list are the load.

The lesson: a research system that only outputs an answer is theater. A research system that outputs an answer plus the specific shape of its own uncertainty is an instrument. Build the second one, even though it makes you look less decisive in the meeting where you present it.
