SVP Technology at Fiserv; large scale system architecture/infrastructure, tech geek, reading, learning, hiking, GeoCaching, ham radio, married, kids

CTOs Agree: Cognitive Debt Is the New Technical Debt

1 Share
JayM
44 minutes ago
Atlanta, GA

NSA director: Mythos "broke into almost all of our classified systems in hours"

1 Share
JayM
45 minutes ago
Atlanta, GA

Google Hits 50% IPv6

1 Share
JayM
45 minutes ago
Atlanta, GA

Stop overloading your skills

2 Shares

You built a skill for your technology. API references, authentication flows, SDK patterns, error handling, version info, all packed into one skill. The agent calls it, gets all that context, and generates code. The kicker? You’ve just wasted a lot of tokens.

It already knows

Models have ingested your documentation, your Stack Overflow answers, your GitHub repos, your blog posts. The default imports, the standard auth flow, the common CRUD operations: the model already has all of that baked in.

When your skill repeats what the model already knows, you’re not helping; you’re adding weight. Every token your skill returns occupies space in a finite context window, and those tokens aren’t neutral. They push out the things the model doesn’t know: workspace files, conversation history, or output from other tools.

Most skills suffer from the same problem: they’re stuffed with thousands of tokens of documentation, the agent calls them, the payload comes back, and outcomes don’t improve. Sometimes outcomes get worse, because the skill is doing work the model didn’t need help with.

How do you know what the model knows?

You don’t, unless you measure. And most folks skip this step entirely. They go straight from “we have a technology” to “we need a skill” without checking what the model does on its own.

What if the model already generates correct code for your API 90% of the time? Then you need a lightweight skill that covers the remaining 10%: the auth quirk that trips people up, the breaking change that’s too recent for training data, the configuration pattern that looks like nothing else on the web. You can’t know which 10% to target if you haven’t measured the baseline.

Measure first, build second

Start by running your scenarios without the skill. Same model, same harness, same prompts. See what the agent gets right and what it gets wrong, because that’s your baseline.
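As a rough illustration of that baseline run, here is a minimal sketch. Everything in it is hypothetical: the `Agent` interface, the `stub_agent` stand-in for a real model call, and the two scenarios are assumptions made for the example, not part of any real harness.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical agent interface: (prompt, optional skill text) -> generated output.
Agent = Callable[[str, Optional[str]], str]
# Each scenario pairs a prompt with a check that scores the output.
Scenario = Tuple[str, Callable[[str], bool]]


def run_suite(agent: Agent, scenarios: Dict[str, Scenario],
              skill_text: Optional[str] = None) -> Dict[str, bool]:
    """Run every scenario through the same agent and score each output."""
    return {
        name: check(agent(prompt, skill_text))
        for name, (prompt, check) in scenarios.items()
    }


def stub_agent(prompt: str, skill_text: Optional[str]) -> str:
    """Stand-in for a real model call; it only gets token refresh right with the skill."""
    if "refresh" in prompt:
        if skill_text and "refresh_token" in skill_text:
            return "client.refresh_token()"
        return "client.login()"  # plausible but wrong without help
    return "client.create(item)"


scenarios: Dict[str, Scenario] = {
    "crud_create": ("create an item", lambda out: "create" in out),
    "token_refresh": ("refresh an expired token", lambda out: "refresh_token" in out),
}

# Same agent, same prompts; the only variable is whether the skill is present.
baseline = run_suite(stub_agent, scenarios)
with_skill = run_suite(stub_agent, scenarios, "Call client.refresh_token() when a token expires.")
print(baseline)
print(with_skill)
```

In a real setup the stub would be replaced by calls to your actual model and harness; the point of the sketch is only that both runs share everything except the skill.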

If the model handles CRUD correctly, don’t put CRUD examples in your skill. If auth flows work out of the box, don’t include your auth guide. If it picks the right SDK version, don’t waste tokens telling it which version to use.

What’s left after you subtract the baseline? The patterns the model gets wrong or doesn’t know about at all. That’s your skill’s scope. Nothing more.
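One way to picture the subtraction, assuming a skill is organized as named sections that each map to a measured scenario (the section names and the `prune_sections` helper are made up for illustration):

```python
from typing import Dict


def prune_sections(sections: Dict[str, str], baseline: Dict[str, bool]) -> Dict[str, str]:
    """Keep only the sections whose scenario failed at baseline.

    Assumption: a topic with no baseline result is kept until it is measured.
    """
    return {
        topic: text
        for topic, text in sections.items()
        if not baseline.get(topic, False)
    }


# Hypothetical skill content, keyed by the scenario each section supports.
sections = {
    "crud": "How to create, read, update, and delete records.",
    "auth": "The standard auth flow.",
    "token_refresh": "The refresh quirk the model keeps missing.",
}
# Hypothetical baseline scores: True means the model passed without the skill.
baseline = {"crud": True, "auth": True, "token_refresh": False}

lean = prune_sections(sections, baseline)
print(list(lean))
```

Keeping unmeasured topics is a design choice in this sketch; dropping them until a scenario exists for them would be equally defensible.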

Every unnecessary token is drag

Context windows have a fixed budget. A skill that returns 3,000 tokens of documentation the model already knows is burning context that could hold the developer’s workspace files, conversation thread, or output from another tool the model needs.

It gets worse when skills compose. Your developers have other skills installed, and each one claims tokens just by being present. Your oversized skill isn’t just dragging down its own scenarios; it’s eating into the budget other skills need. You’re not just hurting your own outcomes; you’re contributing to everyone else’s drag.
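The budget arithmetic is simple enough to sketch. The window size and the token counts below are invented numbers for illustration, not measurements of any real model or skill:

```python
from typing import Iterable


def remaining_context(window_tokens: int, skill_tokens: Iterable[int]) -> int:
    """Tokens left for workspace files, conversation history, and tool output."""
    return max(window_tokens - sum(skill_tokens), 0)


WINDOW = 32_000            # hypothetical context window size
OTHERS = [2_500, 4_000]    # hypothetical payloads from other installed skills

bloated = remaining_context(WINDOW, [3_000, *OTHERS])  # your skill returns 3,000 tokens
lean = remaining_context(WINDOW, [400, *OTHERS])       # your skill trimmed to 400 tokens
print(bloated, lean, lean - bloated)
```

With these assumed numbers, trimming the skill hands 2,600 tokens back to everything else that competes for the same window.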

The lean skill

  1. Define scenarios: the tasks developers actually ask agents to do with your technology.
  2. Run them without your skill, and score the outcomes.
  3. Identify where the model fails: those failures are your scope.
  4. Build a skill that addresses only the gaps.
  5. Measure again. Confirm you’re producing lift, not drag. Also pay attention to the token count: lift at 3x the tokens is a net loss.
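Steps 2 and 5 come down to comparing pass rates and weighing the gain against the tokens spent. The sketch below uses lift per 1,000 tokens as the yardstick; that metric, the outcome lists, and the token counts are assumptions chosen for the example, not a formula from this post.

```python
from typing import List


def pass_rate(outcomes: List[bool]) -> float:
    """Fraction of scenarios that passed; 0.0 for an empty run."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def lift(baseline: List[bool], with_skill: List[bool]) -> float:
    """Pass-rate gain from adding the skill; negative values mean drag."""
    return pass_rate(with_skill) - pass_rate(baseline)


def lift_per_1k_tokens(lift_value: float, skill_tokens: int) -> float:
    """Normalize lift by skill size so lean and bloated skills compare fairly."""
    if skill_tokens <= 0:
        raise ValueError("skill_tokens must be positive")
    return lift_value / (skill_tokens / 1000)


# Hypothetical scores over ten scenarios.
baseline_outcomes = [True] * 6 + [False] * 4   # 60% without the skill
skilled_outcomes = [True] * 9 + [False] * 1    # 90% with the skill

gain = lift(baseline_outcomes, skilled_outcomes)
lean_score = lift_per_1k_tokens(gain, 500)       # lean skill: 500 tokens
bloated_score = lift_per_1k_tokens(gain, 1_500)  # same lift at 3x the tokens
print(round(gain, 2), round(lean_score, 2), round(bloated_score, 2))
```

Under this yardstick, the same lift at three times the tokens scores a third as well, which is one way to make the token count visible next to the pass rate.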

Do this and you’ll end up with skills a fraction of their original size that produce measurably better results. In many cases, models don’t need a textbook. They need a cheat sheet.

The post Stop overloading your skills appeared first on Microsoft for Developers.

JayM
22 hours ago
Atlanta, GA

In Code They Think; In Proof We Trust

1 Share
JayM
22 hours ago
Atlanta, GA

Building an HTTP Server on a Thread-per-Core Framework, without Async/Await

1 Comment

How to build a production-grade HTTP server without async/await or coroutines

Editor’s note: This is a guest post by Peter Mbanugo (originally published on Peter’s blog). Peter will be speaking at P99 CONF 2026. His topic: Multi-Core Without the Trilemma: Escaping Async/Await, Mutexes, and GC. Register now; the conference is free and virtual.

In previous articles, we explored why async/

JayM
22 hours ago
Nice.
Atlanta, GA