Fixed wording

parent c9a82e3c27 · commit c6ec8aca36 · 1 changed file with 0 additions and 3 deletions
---
title: Lyceum
type: docs
bookToC: false
---

# Welcome to Lyceum
Lyceum is a Reinforcement Learning (RL) playground designed for natural language processing (NLP) experimentation. Just as Gymnasium provides environments in which RL agents train, Lyceum is the place where RL agents learn. Imagine a school where RL agents enroll in classes to master subjects like language, math, and even philosophy, and then unwind at after-school clubs like chess, where they can test their decision-making skills.
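The Gymnasium analogy above can be made concrete with the standard `reset`/`step` interaction loop that Gymnasium-style environments follow. The `EchoEnv` class below is a hypothetical toy example for illustration only, not part of Lyceum's actual API:

```python
import random

class EchoEnv:
    """Toy text environment: the agent is rewarded for echoing the prompt.

    Mirrors the Gymnasium interface (reset/step) without depending on the
    gymnasium package. This class is illustrative, not a Lyceum API.
    """

    def reset(self, seed=None):
        random.seed(seed)
        self.prompt = random.choice(["hello", "world"])
        return self.prompt, {}  # (observation, info), per Gymnasium convention

    def step(self, action):
        # Reward the agent only if it reproduced the prompt exactly.
        reward = 1.0 if action == self.prompt else 0.0
        terminated = True  # one-shot episode
        # (observation, reward, terminated, truncated, info)
        return self.prompt, reward, terminated, False, {}

env = EchoEnv()
obs, info = env.reset(seed=0)
action = obs  # a perfect "agent" simply echoes the observation
obs, reward, terminated, truncated, info = env.step(action)
print(reward)  # 1.0
```

Real Lyceum classes would swap the toy echo task for language, math, or chess objectives, but the interaction loop stays the same.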
Welcome to Lyceum!
## Why Lyceum?
***
Modern NLP solutions like GPTs and BERTs have made great strides in language processing and generation; however, they come with serious limitations. Even though an LLM can describe a game of chess, and even justify the moves made, it is unable to *play* it. Why? Because there is no underlying mechanism for decision-making or reward incentives during training. Transformers rely on static token distributions without real-time feedback, limiting their capacity to *actively* learn.