☆ Yσɠƚԋσʂ ☆

  • 10.6K Posts
  • 11.1K Comments
Joined 6 years ago
Cake day: January 18th, 2020



  • They’re absolutely not useless. The trick is to figure out how to use them effectively. For a concrete example, here’s a project I made to implement an idea I read in a paper recently: using a REPL as the context. The premise is that even small models are fairly competent at writing individual functions and working with a small context of a few lines. So, instead of feeding these models large documents, which they can’t handle, you give them an API to interrogate the document by writing code. And sure enough, the idea works. I managed to get qwen2.5-coder:7b, which is a tiny model, to reliably search through a large document that it would have no hope of figuring out on its own. Here’s what a run of it looks like:

    npx tsx src/index.ts \
      "use javascript to write code to find the total sales amount across all regions?" \
      ./test-fixtures/scattered-data.txt \
      --verbose
    
      Configuration:
      Provider: ollama
      Model: qwen2.5-coder:7b
      Max turns: 10
      Timeout: 30000ms
    
    [RLM] Loaded document: 4,706 characters
    [RLM] Sandbox created (maxSubCalls: 10, timeout: 30000ms)
    
    ──────────────────────────────────────────────────
    [Turn 1/10] Querying LLM...
    [Turn 1] Executing code:
    ```javascript
    // Step 1: Search for data related to sales
    const hits = grep("sales");  // Returns array of {match, line, lineNum}
    console.log(JSON.stringify(hits, null, 2));
    
    // Step 2: Process results - extract sales amounts from each matching line
    let totalSales = 0;
    for (const hit of hits) {
        const saleAmount = parseFloat(hit.line.match(/\d+(\.\d{1,2})?/)[0]);
        if (!isNaN(saleAmount)) {
            totalSales += saleAmount;
        }
    }
    
    // Step 3: Store the result in memory
    memory.push({ key: "total_sales", value: totalSales });
    
    // Final output
    console.log("Total sales across all regions:", totalSales);
    ```
    [Turn 1] Console output:
      [
      ...
      {
        "match": "Sales",
        "line": "- Sales Data: Revenue in USD before taxes",
        "lineNum": 122,
        "index": 4357,
        "groups": []
      }
    ]
    [Turn 1] Error: TypeError: Cannot read properties of null (reading '0')
    
    ──────────────────────────────────────────────────
    [Turn 2/10] Querying LLM...
    [Turn 2] Executing code:
    ```javascript
    // Extract sales data from the logs
    const salesData = [
        { region: "NORTH", amount: 2340000 },
        { region: "SOUTH", amount: 3120000 },
        { region: "EAST", amount: 2890000 },
        { region: "WEST", amount: 2670000 },
        { region: "CENTRAL", amount: 1980000 }
    ];
    
    // Calculate total sales
    const totalSales = salesData.reduce((acc, curr) => acc + curr.amount, 0);
    
    // Output the total sales
    console.log("Total Sales:", totalSales);
    ```
    [Turn 2] Console output:
      Total Sales: 13000000
    
    ──────────────────────────────────────────────────
    [Turn 3/10] Querying LLM...
    [Turn 3] Executing code:
    ```javascript
    // Output the total sales
    console.log("Total Sales:", totalSales);
    ```
    [Turn 3] Console output:
      Total Sales: 13000000
    [Turn 3] Final answer found after code execution
    
    [RLM] Sandbox disposed
    The total sales are 13000000.
    

    So in just 3 calls, each with a very small context, it managed to find the correct answer, and it does so reliably.
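
    The API the model gets to call is deliberately tiny. Here’s a rough sketch of the shape of it (an illustration of the idea, not the actual project code; `grep`, `memory`, and the captured console output match what shows up in the log above, the rest is assumed):

    ```typescript
    // Sketch only: the document lives on the host side; the model's generated
    // code only ever sees this small API.
    type GrepHit = { match: string; line: string; lineNum: number; index: number };

    function makeSandbox(document: string) {
      const lines = document.split("\n");

      // grep(pattern): regex search over the whole document, returning structured
      // hits like the {match, line, lineNum, index} objects in the log above.
      function grep(pattern: string | RegExp): GrepHit[] {
        const re = typeof pattern === "string" ? new RegExp(pattern, "i") : pattern;
        const hits: GrepHit[] = [];
        let offset = 0;
        for (const [i, line] of lines.entries()) {
          const m = line.match(re);
          if (m) hits.push({ match: m[0], line, lineNum: i + 1, index: offset + (m.index ?? 0) });
          offset += line.length + 1;
        }
        return hits;
      }

      // memory: scratch space the model can push results into between turns.
      const memory: Array<{ key: string; value: unknown }> = [];

      // Run one turn of model-generated code, capturing console output (or the
      // error) so it can be fed back to the model as the next prompt.
      function runTurn(code: string): string {
        const logs: string[] = [];
        const fakeConsole = { log: (...args: unknown[]) => logs.push(args.map(String).join(" ")) };
        try {
          // A real sandbox would isolate this properly; Function() is just for illustration.
          new Function("grep", "memory", "console", code)(grep, memory, fakeConsole);
        } catch (err) {
          logs.push(`Error: ${err}`);
        }
        return logs.join("\n");
      }

      return { grep, memory, runTurn };
    }
    ```

    The outer loop just alternates between asking the model for code and feeding the captured output (or error) back to it until it declares a final answer, the way the three turns above do.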

    I’m playing around with integrating some code synthesis ideas from Barliman right now to make this even more robust. The model then only has to give general direction and ask basic questions, while most of the code can be synthesized at runtime. The way we use models today is really naive, and there’s a lot more possible once you start combining them with other techniques.