"Continual learning" is considered one of the "blockers" for LLMs: they can't learn on the job, don't improve over time, etc. In particular, Dwarkesh Patel describes it as a number of problem which has to be solved to get to AGI.

Many academic articles propose some kind of memory system for LLMs, which might be considered a form of "continual learning". But most evals focus on memorizing facts, which is just not very useful (it's better to fetch facts via tool use than to store them in neural memory), and these proposals might not fit well into common LLM API usage patterns.

In this article I'm proposing a "new" method called "skill capsules", which is highly pragmatic, easy to understand and evaluate, and might integrate well into existing tooling.

A skill capsule is a concrete object - basically, a bunch of vectors. You can insert it somewhere in the middle of an LLM's context, and it improves performance on a particular skill, e.g. making tool calls more reliable, or following a particular writing or coding style. In theory, it can be used to patch any LLM inadequacy. A capsule can also include knowledge (e.g. how to call a particular API or write code involving a particular library).
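To make "a bunch of vectors inserted into the middle of the context" concrete, here is a minimal PyTorch sketch. It uses a tiny `nn.TransformerEncoder` as a toy stand-in for an LLM (all dimensions and the random capsule are made up for illustration); a real setup would splice the capsule into the embedded token sequence of an actual model in the same way.

```python
import torch
import torch.nn as nn

# Toy stand-in for an LLM. All sizes here are hypothetical.
D_MODEL, VOCAB = 64, 100
embed = nn.Embedding(VOCAB, D_MODEL)
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True),
    num_layers=2,
)

# A "skill capsule": just a sequence of vectors (here 8 of them,
# random for demonstration; in practice they would be constructed
# from a demonstration or learned).
capsule = torch.randn(1, 8, D_MODEL)

# Embed an ordinary token context.
tokens = torch.randint(0, VOCAB, (1, 20))
x = embed(tokens)  # shape (1, 20, D_MODEL)

# Splice the capsule into the middle of the context: the model now sees
# [first half of prompt] [capsule vectors] [second half of prompt].
x_with_capsule = torch.cat([x[:, :10], capsule, x[:, 10:]], dim=1)
out = body(x_with_capsule)  # shape (1, 28, D_MODEL)
```

The key point is that nothing in the model changes: the capsule occupies ordinary positions in the sequence, so it fits any API that accepts precomputed embeddings (e.g. an `inputs_embeds`-style entry point).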

A skill capsule can be produced with a single forward pass from a _single example_ - no gradients or "fine-tuning" are required. So it might allow an LLM to "learn on the job": a single demonstration of how to perform something correctly can be used to create a capsule.
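As a sketch of what "one forward pass, no gradients" could look like: run the demonstration through the model once and compress its hidden states into a fixed number of vectors. The pooling scheme below (mean over groups of positions) is purely illustrative, as is the use of a bare embedding layer in place of a real LLM encoder; the author's actual construction may differ.

```python
import torch
import torch.nn as nn

# Hypothetical setup: an embedding layer standing in for the LLM's
# own representation of the demonstration. Sizes are made up.
D_MODEL, VOCAB, CAPSULE_LEN = 64, 100, 8
embed = nn.Embedding(VOCAB, D_MODEL)

# A single demonstration (e.g. one corrected tool call), tokenized.
demo = torch.randint(0, VOCAB, (1, 32))

# One forward pass under no_grad - no fine-tuning, no optimizer.
# Compress 32 hidden states into 8 capsule vectors by mean-pooling
# contiguous groups of 4 positions.
with torch.no_grad():
    h = embed(demo)  # shape (1, 32, D_MODEL)
    capsule = h.reshape(1, CAPSULE_LEN, -1, D_MODEL).mean(dim=2)

print(capsule.shape)  # torch.Size([1, 8, 64])
```

The resulting `capsule` can then be spliced into future contexts, which is what makes this cheap enough to run per-demonstration, at inference time.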

You might ask: why is this a "Show HN" and not an academic article? Because researchers already know the method - it's known as "soft prompts", "hypernetworks", "steering vectors", prefix tuning, etc. All these terms are horrible and do not convey the possibilities of this method. I just want more people to know that LLMs can be improved on the fly, and a better term - "skill capsules" - might help people think about how to apply these techniques (I hope).

Other reasons it's a "Show HN":

  * it shows one can do a kinda cool ML experiment in 
    a few days using Claude Code and a few dollars to pay for GPUs
  * there's a somewhat-interesting story of how I got there
