The paper argues that we have been wasting a lot of expensive GPU cycles by forcing transformers to relearn static things like names or common phrases through deep computation. Standard models have no way to just look something up, so they end up simulating memory by pushing tokens through layer after layer of feed-forward networks. DeepSeek's answer is a module called Engram, which adds a dedicated lookup step for local N-gram patterns. It acts as a new axis for scaling a model, separate from the usual compute-heavy Mixture-of-Experts approach.
The architecture uses multi-head hashing to fetch static embeddings for specific token sequences, which are then filtered through a context-aware gate so they only get mixed in when they actually fit the current context. They found a U-shaped scaling law: the best performance comes from splitting the parameter budget between neural computation and this static memory, rather than putting it all into either one. By letting the memory handle the simple local associations, the model effectively behaves as if it were deeper, because the early layers are no longer bogged down reconstructing basic patterns.
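For my own understanding, here's roughly what I imagine that lookup path looks like as a PyTorch toy. To be clear, this is my sketch of the general idea, not DeepSeek's code: the hash scheme, table sizes, N-gram length, and the shape of the gate are all placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEngram(nn.Module):
    """Toy N-gram memory: multi-head hashed embedding tables plus a
    context-aware gate. Sizes and hash functions are illustrative only."""

    def __init__(self, d_model=256, n_heads=4, table_size=2**16, ngram=2):
        super().__init__()
        self.ngram = ngram
        self.table_size = table_size
        # One hashed embedding table per head; each head fills a slice of d_model.
        self.tables = nn.ModuleList(
            nn.Embedding(table_size, d_model // n_heads) for _ in range(n_heads)
        )
        # Random odd multipliers serve as cheap per-head hash functions.
        self.register_buffer(
            "hash_mults", torch.randint(1, 2**31 - 1, (n_heads, ngram)) * 2 + 1
        )
        # Gate decides, per position, how much of the retrieved memory to keep.
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, token_ids, hidden):
        # token_ids: (batch, seq) ints; hidden: (batch, seq, d_model) activations.
        B, T = token_ids.shape
        # Build the trailing N-gram at each position from left-shifted copies.
        grams = torch.stack(
            [F.pad(token_ids, (i, 0))[:, :T] for i in range(self.ngram)], dim=-1
        )  # (B, T, ngram)
        retrieved = []
        for h, table in enumerate(self.tables):
            # Hash the N-gram into a table index; deterministic given the tokens.
            idx = (grams * self.hash_mults[h]).sum(-1) % self.table_size
            retrieved.append(table(idx))
        mem = torch.cat(retrieved, dim=-1)  # (B, T, d_model)
        # Context-aware gate: suppress the memory when it doesn't fit the context.
        g = torch.sigmoid(self.gate(torch.cat([hidden, mem], dim=-1)))
        return hidden + g * mem
```

The key property is that the index computation only needs the raw token ids, which is what makes the prefetching trick in the next paragraph possible at all.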
One of the best bits is how they handle hardware constraints: the massive lookup tables are offloaded to host RAM. Since the lookups are deterministic given the input tokens, the system can prefetch the data from CPU memory before the GPU even needs it. That means you can scale to tens of billions of extra parameters with almost zero impact on speed, because the retrieval overlaps with the earlier layers that are still computing.
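The overlap is essentially a standard async host-to-device copy pattern. A minimal sketch of the idea (again my simplification, assuming a pinned host-memory table and a separate CUDA stream; the table size here is tiny compared with the tens of billions of parameters the paper talks about):

```python
import torch

# The big embedding table lives in pinned host RAM, not in GPU memory.
table = torch.randn(1_000_000, 64).pin_memory()
copy_stream = torch.cuda.Stream()

def prefetch(indices):
    """Indices are known from the input tokens alone, so the host-to-device
    copy can start early, on a side stream, while the GPU runs other layers."""
    rows = table.index_select(0, indices.cpu())          # gather rows on the CPU
    with torch.cuda.stream(copy_stream):
        return rows.to("cuda", non_blocking=True)        # async copy from pinned memory

def main_compute(x):
    # Stand-in for the transformer layers that run before the lookup is consumed.
    for _ in range(8):
        x = torch.relu(x @ x.T)
    return x

x = torch.randn(4096, 4096, device="cuda")
idx = torch.randint(0, table.size(0), (8192,))
gpu_rows = prefetch(idx)                                 # kick off the copy early
y = main_compute(x)                                      # GPU stays busy meanwhile
torch.cuda.current_stream().wait_stream(copy_stream)     # sync before using the rows
print(gpu_rows.shape, y.shape)
```

As long as the copy finishes before the layer that consumes the retrieved rows, the table effectively costs nothing at inference time, which is the whole point of keeping it off the GPU.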
The benchmarks show this pays off across the board, especially on long-context tasks, where the model can keep its attention focused on global structure instead of local phrases. Even math and coding get a boost, because the model is no longer spending its internal reasoning depth on things that should just live in a lookup table. Going forward, this kind of conditional memory could become a standard part of sparse models, since it sidesteps the physical memory limits of current hardware.



Hardware embargo seems to be working great!
It’s really spurring Chinese companies to make LLMs that don’t need a lake of water to tell you how many r’s there are in strawberry. 🤣