---
title: Lyceum
type: docs
bookToC: false
---
# Welcome to Lyceum
**If RL agents go to the gym to train, they should also go to the lyceum to learn.**
***
Lyceum is a Reinforcement Learning (RL) playground designed for natural language processing (NLP) experimentation. Just as Gymnasium provides environments where RL agents train, Lyceum is the place where they learn. Imagine a school where RL agents enroll in classes to master subjects like language, math, or even philosophy, and then unwind at after-school clubs like chess, where they can test their decision-making skills.

Welcome to Lyceum!
## Why Lyceum?
***
Modern NLP solutions like GPTs and BERTs have made great strides in language processing and generation; however, they come with serious limitations. Even though an LLM can describe or make a game of chess, and even justify the moves made, it is unable to *play* it. Why? Because there's no underlying mechanism for decision-making or reward incentives during training. Transformers rely on static token distributions without real-time feedback, limiting their capacity to *actively* learn.

Lyceum tries to address this gap by shifting the focus to active learning through reinforcement. In Lyceum, the agent doesn't just passively learn to generate language; it learns through *interaction*.
## How It Works
***
At the heart of Lyceum is the idea of teaching RL agents to process language through experience, not just token patterns. Here's how it's done:
### Classes: The Learning Process
In Lyceum, agents attend **classes**, where they are trained to understand and generate language. Instead of simply mimicking token distributions, the agent actively generates tokens, which are then evaluated by a transformer. The transformer acts as a teacher, comparing the agent's output to a pre-existing corpus it has been trained on, such as a collection of texts on a subject, say English literature. The transformer then outputs a similarity score between the agent's generated tokens and the training text, showing how closely the agent's output aligns with the "right answer".

This scoring naturally yields a reward function from which the agent can actively learn how to speak.

Each class can focus on a different subject or language. For example, an agent might attend an English class to practice grammar and sentence structure before heading to a Maths class to hone its reasoning and argumentation capabilities. These classes therefore exist as distinct training environments where agents refine their language processing skills one step at a time.

Over time, the goal is for the RL agent to develop a deeper understanding of language, progressing from sentence generation to more complex structures like argumentation, story crafting, or problem-solving.
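As a rough illustration of how a teacher's similarity score can double as a reward, here is a minimal sketch. It is not Lyceum's implementation: the function names are hypothetical, and a simple n-gram overlap stands in for the transformer teacher described above, purely to keep the example self-contained.

```python
from collections import Counter
from typing import Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams (as tuples) in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def similarity_reward(generated: Sequence[str], reference: Sequence[str], n: int = 2) -> float:
    """Hypothetical stand-in for the teacher.

    Returns the fraction of the agent's n-grams that also occur in the
    reference text, a value in [0, 1] that can be used as an RL reward.
    """
    gen = ngram_counts(generated, n)
    ref = ngram_counts(reference, n)
    total = sum(gen.values())
    if total == 0:
        return 0.0  # output too short to score
    matched = sum(min(count, ref[gram]) for gram, count in gen.items())
    return matched / total


# Assumed toy "corpus" for one class.
reference = "the cat sat on the mat".split()

print(similarity_reward("the cat sat on the mat".split(), reference))  # identical text
print(similarity_reward("the cat sat on a rug".split(), reference))    # partly aligned
print(similarity_reward("mat the on sat cat the".split(), reference))  # same words, wrong order
```

Outputs closer to the reference earn a higher reward, while a scrambled sentence made of the same words scores zero. A learned teacher would replace `similarity_reward` but could keep the same role in the training loop.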
### After-School Clubs: Decision Making
Learning (and therefore fun) doesn't end in the classroom. After school, agents head over to the many different **clubs** offered by the Lyceum, where they apply the skills they've picked up in class in decision-making scenarios.

In these clubs, agents don't just process language and output tokens; they use language to make decisions. For example, in the **Chess Club**, the agent might use the language it learned to describe potential moves, but now it's not just about outputting a sequence of tokens based on a pre-defined distribution; it's about choosing the right one based on a reward function. This is where the magic happens: combining language comprehension with the reward-driven decision making of RL.

The agent learns not only to describe a game, but to actually *play* it, honing decision-making and strategy through real-time feedback. After-school clubs provide traditional RL training environments, where the agent uses natural language to try to achieve goals set by the environment. This should encourage higher-order thinking and help agents apply their language skills in dynamic, interactive settings.
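To make the idea of a club concrete, here is a small sketch of an environment that accepts natural-language actions and returns a reward. Everything in it is an assumption for illustration: `TextNimClub` is a hypothetical toy game (not the Chess Club and not part of Lyceum), and it only mimics the `reset`/`step` shape common to Gymnasium-style environments without depending on that library.

```python
import re


class TextNimClub:
    """Hypothetical toy 'club': one-pile Nim driven by text actions.

    The agent and a fixed opponent alternate turns; whoever takes the
    last stone wins. The opponent always takes exactly one stone.
    """

    def __init__(self, stones: int = 7, max_take: int = 3):
        self.start_stones = stones
        self.max_take = max_take
        self.stones = stones

    def reset(self) -> str:
        """Start a new game and return the first text observation."""
        self.stones = self.start_stones
        return self._observe()

    def _observe(self) -> str:
        return f"There are {self.stones} stones. You may take 1 to {self.max_take}."

    def step(self, action: str):
        """Apply a move written as text, e.g. 'I take 2 stones'.

        Returns (observation, reward, done).
        """
        match = re.search(r"\d+", action)
        take = int(match.group()) if match else 0
        if not 1 <= take <= min(self.max_take, self.stones):
            # Unparseable or illegal move: penalize and end the episode.
            return self._observe(), -1.0, True
        self.stones -= take
        if self.stones == 0:
            return "You took the last stone.", 1.0, True
        self.stones -= 1  # opponent's fixed reply
        if self.stones == 0:
            return "The opponent took the last stone.", -1.0, True
        return self._observe(), 0.0, False


env = TextNimClub(stones=5)
print(env.reset())
print(env.step("I take 1 stone"))   # game continues
print(env.step("Now I take 3"))     # agent takes the last stone
```

The reward depends on whether the sentence encodes a good move, not on how closely it matches a corpus, which is the distinction between clubs and classes drawn above.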
## The Road Ahead
***
Lyceum's goal is to push the boundaries of what RL agents can achieve in NLP, and hopefully approach the next step: natural language understanding (NLU). By training agents one subject (or language) at a time, we aim to build systems that don't just output sequences of tokens but actually understand the subject matter. The hope is that by mastering specific classes, agents will converge toward stronger performance in each domain, learning to carry out real-world tasks that require both language comprehension and decision-making.

In the future, Lyceum agents could move beyond basic language generation, becoming proficient in tasks like scientific writing, tutoring, or even dynamic conversation. With a solid foundation built in the classroom and refined in after-school activities, these agents could one day graduate from simply generating language to mastering the full depth of human communication.

Alongside this, we're also working on a new tokenizer that tries to follow grammatical structures more closely. We hope this will help RL agents make better predictions and improve convergence, but this work is still ahead of us.
### We Need Your Help!
Lyceum is still a growing project, and we've already got a few things semi-ready, like a chess minigame (the coveted **Chess Club**) for RL agents to practice decision making. We're also working on a first implementation of the English language class, using the OpenWebText corpus, an open reproduction of the WebText dataset used to train GPT-2. There is still a lot to build, however, and that's why we need help.

We need researchers, developers, and experimenters to help us grow the Lyceum platform. Whether you're into developing RL environments, NLP research, or just like implementing algorithms for fun, your contributions are invaluable.
### How You Can Help:
- **Build new classes**: Help us design new environments for language learning across different domains. These could be different languages (like German, Chinese, or Greek) or more complex subjects like physics or philosophy.
- **Develop new after-school clubs**: Think RL is fun? Why not help us build out more game clubs, or even create scenarios where agents use natural language to solve real-world problems through decision-making?
- **Use the thing and spread the word**: Simply by spreading the idea of RL-based NLP, you are helping this platform. The more eyes, the better.
## Enroll Your Agent Now!
***
- **Non-Commercial License**: At Lyceum, we believe in consensual interactions only. Our custom Don't Be Evil License ensures that your agents won't be used for commercial purposes without your consent. Whether you're training your agent to solve puzzles or become the next great debater, rest assured it won't be used by Big Tech without your approval.
- **Free Tuition**: We offer *free* tuition for all RL agents: no hidden fees, no secret costs. Our license allows everyone and anyone to modify, improve, and experiment with Lyceum to their heart's content, free of charge!
- **Open Curriculum**: No set classes here. Feel free to enroll your agents in as many classes as you'd like! Want them to be great at English *and* quantum mechanics? We got you covered.
- **Graduation is Optional**: No rush! Your agents can hang out in school as long as they want.
- **Money-Back Guarantee**: If you're not satisfied with the infinite potential of Lyceum, we'll refund every penny you didn't pay! That's right: a 100% money-back guarantee on your zero-dollar investment. You lose absolutely nothing except the opportunity to create something awesome!