Reinforcement Learning to LLM API Interface

Welcome to the RL-LLM API Interface, a programming paradigm in which developers describe a program in reinforcement learning terms (state, actions, a reward function, and a policy) and the interface translates that description into Large Language Model (LLM) API calls that carry it out.

Key Concepts

Basic Syntax

program MyProgram {
  state {
    // Define program state variables
  }
  
  action ActionName(parameters) {
    // Define an action
  }
  
  reward {
    // Define reward function
  }
  
  policy {
    // Define action selection policy
  }
}
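
One way to picture the four blocks is as fields of a plain object, with a small driver that applies the action chosen by the policy and then scores the resulting state. The following is a minimal sketch in JavaScript, not the interface's actual runtime: makeProgram, step, and the counter example are hypothetical names chosen for illustration.

```javascript
// Hypothetical in-memory representation of a program: the four blocks
// (state, actions, reward, policy) become fields of a plain object.
function makeProgram({ state, actions, reward, policy }) {
  return { state: { ...state }, actions, reward, policy };
}

// Run one decision step: ask the policy for an action, apply it to get
// the next state, then score that state with the reward function.
function step(program) {
  const { name, args } = program.policy(program.state);
  const action = program.actions[name];
  if (!action) throw new Error(`Unknown action: ${name}`);
  program.state = action(program.state, ...args);
  return { action: name, reward: program.reward(program.state) };
}

// Toy example: increment a counter until it reaches a target.
const counter = makeProgram({
  state: { value: 0, target: 3 },
  actions: {
    Increment: (s, by) => ({ ...s, value: s.value + by }),
    Stop: (s) => s,
  },
  // Reward is 1 once the target is reached, 0 before that.
  reward: (s) => (s.value >= s.target ? 1 : 0),
  policy: (s) =>
    s.value < s.target
      ? { name: "Increment", args: [1] }
      : { name: "Stop", args: [] },
});
```

Calling step(counter) repeatedly increments the counter until the policy switches to Stop. Keeping actions as pure functions from one state to the next is a design choice of this sketch that makes each step easy to inspect.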
  

Example: Text Summarization

program TextSummarizer {
  state {
    text: string
    summary: string
    quality: float
    length: int  // target summary length in words; read by the policy below
  }
  
  action Summarize(length: int) {
    // Summarize the text to the specified length
  }
  
  action Refine() {
    // Refine the existing summary
  }
  
  reward {
    return quality
  }
  
  policy {
    if (quality < 0.8) {
      return Refine()
    } else {
      return Summarize(length / 2)
    }
  }
}
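
To make the control flow of the policy concrete, the sketch below restates it as a plain JavaScript function. It assumes the state carries a length field holding the target word count, since the policy refers to length; the function name choose and the returned { action, args } shape are illustrative and not part of the interface.

```javascript
// Threshold taken from the policy block of TextSummarizer.
const QUALITY_THRESHOLD = 0.8;

// Mirror of the TextSummarizer policy: refine while quality is below the
// threshold, otherwise request a summary of half the current length.
function choose(state) {
  if (state.quality < QUALITY_THRESHOLD) {
    return { action: "Refine", args: [] };
  }
  // Math.floor stands in for integer division on an int-typed length.
  return { action: "Summarize", args: [Math.floor(state.length / 2)] };
}

console.log(choose({ quality: 0.5, length: 100 }).action); // Refine
console.log(choose({ quality: 0.9, length: 100 }).action); // Summarize
```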
  

Translation to LLM API Calls

The RL-LLM Interface automatically translates the reinforcement learning concepts into appropriate LLM API calls. Here's how it might look for the TextSummarizer example:

// Action: Summarize
llm_api.complete({
  prompt: `Summarize the following text in ${length} words:\n${state.text}`,
  max_tokens: length * 2
})

// Action: Refine
llm_api.complete({
  prompt: `Refine the following summary to improve its quality:\n${state.summary}\nOriginal text:\n${state.text}`,
  max_tokens: state.summary.split(' ').length * 2
})

// Reward calculation
llm_api.complete({
  prompt: `Rate the quality of this summary from 0 to 1:\nSummary: ${state.summary}\nOriginal text: ${state.text}`,
  max_tokens: 10
})
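
The reward call returns text rather than a number, so something has to turn the completion into a float before the policy can compare it with 0.8. A minimal sketch of that step follows, with a stub standing in for llm_api so it runs offline. The names parseReward and computeReward and the stub's canned reply are assumptions for illustration, and a real client would likely be asynchronous.

```javascript
// Extract the first number from the model's reply and clamp it to [0, 1].
// A reply with no number yields 0, so malformed output counts as low quality.
function parseReward(reply) {
  const match = String(reply).match(/-?\d*\.?\d+/);
  if (!match) return 0;
  const value = Number.parseFloat(match[0]);
  return Math.min(1, Math.max(0, value));
}

// Stub client with the same call shape as the snippets above.
// It ignores its arguments and always returns a fixed reply.
const llm_api = {
  complete({ prompt, max_tokens }) {
    return "Quality: 0.85";
  },
};

// Build the rating prompt from the state and parse the reply into a float.
function computeReward(state, client) {
  const reply = client.complete({
    prompt: `Rate the quality of this summary from 0 to 1:\nSummary: ${state.summary}\nOriginal text: ${state.text}`,
    max_tokens: 10,
  });
  return parseReward(reply);
}

const quality = computeReward({ text: "long text", summary: "short" }, llm_api);
console.log(quality); // 0.85 with the stub above
```

Clamping means an out-of-range reply such as "7" is read as 1; a stricter version might reject such replies and ask the model again.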
  

Benefits

The interface connects reinforcement learning concepts to a practical implementation on LLMs: a program is written once as state, actions, a reward, and a policy, and each action or reward evaluation is then carried out by an LLM API call.

Explore more examples and start building with RL-LLM Interface today!
