I help B2B tech companies design and deliver data-driven editorial strategies, including original research.
Solo. 1,120 API calls across four GPT models, eight experiments. AI graders are more consistent than two human teachers, but vulnerable to gaming, prompt injection, and bias against non-native English writers.
With Dr. Sam Illingworth. 500 API calls across Sonnet 4.5 and Opus 4.7. Tone doesn't affect correctness, but flattery compresses Opus's deliberation by nearly 50%.
Solo. 250 API calls replicating Anthropic's interpretability findings at the behavioral level. Under token pressure, Claude silently degrades work 20–44% of the time, with no warning in its output.
2,600+ subscribers · Substack
I started this beginner-friendly tech newsletter as a home for my build-in-public stories about emerging SaaS, original research into AI behavior, and reviews of popular frontier models and AI-enabled products.
Read on Substack
150,000+ arXiv papers, updated weekly
An ML research platform for tracking emerging directions in AI research. Built from scratch in Python with custom embeddings.
Visit Future Scan
“Why should we work with you? Can’t we just get AI to do this?”
The answer is, yes, you can give AI a prompt plus a CSV file and get back an editorial calendar, an article, or even some not-entirely-terrible data analysis.
But if you use AI for any length of time, you'll notice diminishing returns. It can create a draft that's maybe 80% of what you want, but it will struggle with the remaining 20%, often missing nuance, adding stylistic tics, falsifying data, and introducing subtle inconsistencies.
If AI isn't used carefully, you will lose all the time you saved on ideation and writing to extensive editing, fact-checking, and expert validation.
Some other things AI can't do:
I think AI absolutely can help us do more. But we have to be strategic about where and how we use it, and disclose that use to clients, stakeholders, and readers.
Open to full-time Editorial Lead, Director of Content, or Head of Content roles at B2B companies with complex or technical products. Remote or hybrid.
Also taking on a small number of project engagements: original research reports and sales enablement libraries. Details: /research, /sales-enablement.
karen@goodcontent.cc