Wednesday, January 17, 2024

The Path to Evaluation

Merging Fun!

Having explored various techniques for creating new Large Language Models (LLMs), I found great enjoyment in both common merging and the more adventurous "franken-merging." Merging involves combining two LLMs, aiming to retain the strengths of both models. However, this process feels somewhat like a gamble: success is not guaranteed.
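To make the idea concrete, here is a minimal sketch of one of the simplest merge strategies, a weighted average of two models' parameters. It is an illustration only: the post does not say which merge method was used, and the function name, the `alpha` parameter, and the toy "models" (plain dictionaries of float lists) are all assumptions made for this example.

```python
def linear_merge(weights_a, weights_b, alpha=0.5):
    """Blend two parameter sets as alpha * A + (1 - alpha) * B.

    Both arguments are hypothetical stand-ins for model checkpoints:
    dicts mapping a parameter name to a flat list of floats.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name, values_a in weights_a.items():
        values_b = weights_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            alpha * a + (1 - alpha) * b for a, b in zip(values_a, values_b)
        ]
    return merged


# Toy "models": real checkpoints hold billions of values, not three.
model_a = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
model_b = {"layer0.weight": [3.0, 4.0], "layer0.bias": [1.0]}
merged = linear_merge(model_a, model_b, alpha=0.5)
```

Nothing here involves training data or gradients, which is why a merge of this kind needs only enough memory to hold the weights.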

Hugging Face provides a platform where different authors take intuitive chances, merging various models to climb the leaderboard. Surprisingly, it occasionally pays off! Franken-merging takes it a step further: it dissects models and creatively reassembles their components.
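The "dissect and reassemble" idea can be sketched as stacking slices of layers taken from different models. This is a toy illustration, not the tooling used for the models in this post: the `frankenmerge` function, the plan format, and the string placeholders standing in for real layers are all hypothetical.

```python
def frankenmerge(sources, plan):
    """Assemble a new layer stack from slices of existing models.

    sources: dict mapping a model name to its ordered list of layers.
    plan: list of (model_name, start, end) slices, with end exclusive.
    """
    stacked = []
    for model_name, start, end in plan:
        layers = sources[model_name]
        if not 0 <= start < end <= len(layers):
            raise ValueError(f"bad slice {start}:{end} for {model_name}")
        stacked.extend(layers[start:end])
    return stacked


# Placeholder layers: strings stand in for real transformer blocks.
sources = {
    "model_a": [f"a{i}" for i in range(4)],
    "model_b": [f"b{i}" for i in range(4)],
}
# Keep the first three layers of model_a, then the last three of model_b.
plan = [("model_a", 0, 3), ("model_b", 1, 4)]
new_stack = frankenmerge(sources, plan)
```

Note that the result has six layers although each source has only four, so a franken-merged model can end up deeper than either parent.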

In simpler terms, with a bit of luck and no deep expertise, an enthusiastic hobbyist can outperform seasoned data scientists!

Apart from the thrill, these methods offer cost-effective solutions. Building a potent LLM from scratch is a costly endeavor reserved for companies. In contrast, merging only requires affordable RAM, is fast, and can be accomplished within minutes. Additionally, there's no need to construct a dataset, a lengthy and complex task.

The Truth

Speaking of datasets, a post I read (apologies for forgetting the author) asserted that "all merges were memes!" This sentiment is understandable, considering the significant role luck plays in the process. The lack of control over the dataset means a lack of control over the model's exact capabilities and possibilities.

While building a genuine dataset remains crucial in LLM development (and is my ultimate goal), merging has established itself as a dark art in crafting AI, offering an easy performance boost.

Suddenly Facing the Obvious

Motivated by this, I embarked on building several models through merging and franken-merging. It seemed straightforward: edit a config file, run a script, and voilà! However, as my "models" folder expanded, I realized that judging their performance in role-playing scenarios by hand was the wrong approach.

I needed a tool for evaluation. Although such tools already existed, I wasn't satisfied with them and opted to create my own for two reasons:

  1. Contamination: Existing tests are known and sometimes used to train LLMs, compromising their integrity. "Decontaminating" LLMs has become necessary for some scientists.
  2. Specificity: Most tests are too general for my needs. I aim to create a narrator and a game-master, requiring specific skill assessments.

AutoRP-Eval.py

Creating AutoRP-Eval.py wasn't overly challenging, especially with AI assistance. While it needs some refinement (e.g., the model selection is still hard-coded), it effectively evaluates models. It includes a folder named "items" where I can place text files, each containing a question and an evaluation criterion. The script compels the evaluated AI to answer each question, and the responses are automatically assessed by another LLM.
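The flow described above can be sketched roughly as follows. This is not the actual AutoRP-Eval.py: the item file layout (question, a `---` line, then the criterion), the function names, and the `answer_fn` / `grade_fn` callables standing in for the evaluated model and the grading model are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

SEPARATOR = "---"  # assumed delimiter between question and criterion


def load_items(folder):
    """Read every .txt file in the folder as one question/criterion pair."""
    items = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        question, sep, criterion = text.partition(SEPARATOR)
        if not sep:
            raise ValueError(f"{path.name}: missing '{SEPARATOR}' separator")
        items.append({
            "name": path.stem,
            "question": question.strip(),
            "criterion": criterion.strip(),
        })
    return items


def evaluate(items, answer_fn, grade_fn):
    """Ask the evaluated model each question and let the grader judge it."""
    results = {}
    for item in items:
        answer = answer_fn(item["question"])
        passed = grade_fn(item["question"], answer, item["criterion"])
        results[item["name"]] = bool(passed)
    score = sum(results.values()) / len(results) if results else 0.0
    return results, score


# Demo with stub functions in place of real LLM calls.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "greeting.txt").write_text(
        "Greet the player in character.\n---\nThe answer mentions a tavern.",
        encoding="utf-8",
    )
    items = load_items(tmp)

results, score = evaluate(
    items,
    answer_fn=lambda question: "Welcome to the tavern, traveler!",
    grade_fn=lambda question, answer, criterion: "tavern" in answer.lower(),
)
```

Passing the two models in as plain callables keeps the loop independent of any particular backend, which would also be one way to avoid hard-coding the model selection.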

More Tools

Ensuring accurate and consistent grading posed a challenge and required many rounds of prompt engineering. A dedicated tool, rubric-eval.py, built with Gradio, lets me generate or write answers and verify whether they align with the rubric, using a specifically designed LLM named Proctora.
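One simple way to put a number on grading consistency is to grade the same answer several times and look at how often the verdicts agree. The sketch below is not taken from rubric-eval.py; `grade_with_agreement` and the scripted stub grader are hypothetical, and a real setup would call the grading LLM instead.

```python
from collections import Counter


def grade_with_agreement(grade_fn, answer, rubric, runs=5):
    """Grade the same answer several times.

    Returns the majority verdict and the share of runs that agreed with it.
    Use an odd number of runs to avoid ties.
    """
    verdicts = [bool(grade_fn(answer, rubric)) for _ in range(runs)]
    majority, count = Counter(verdicts).most_common(1)[0]
    return majority, count / runs


# Stub grader that replays a fixed sequence of verdicts, imitating a
# grading model that disagrees with itself once in five runs.
scripted = iter([True, True, False, True, True])
verdict, agreement = grade_with_agreement(
    lambda answer, rubric: next(scripted),
    "The innkeeper greets you warmly.",
    "The answer mentions an innkeeper.",
    runs=5,
)
```

A low agreement value on a given rubric would suggest that the grading prompt or the rubric wording needs more work.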

Almost Done

While the script needs some final touches, most of my effort will now go into writing test questions in the areas that matter for my purposes and completing the test suite. After this, the true craft of AI creation will commence.


Sunday, January 14, 2024

Unveiling the Realm: Journey into the Fusion of AI and RPG with Karkomagor

 

Hello, I am Karkomagor – a teacher by profession and an IT enthusiast by experience. I am also a passionate explorer of the realms of Role-Playing Games (RPGs) and Artificial Intelligence (AI). Today, I invite you on a fascinating journey into the crossroads of these two seemingly disparate worlds, where my passions for storytelling and technology converge into an exciting frontier.

The Beginnings

My journey has been as diverse as the characters that inhabit the worlds of RPGs. Having worked in IT in the past, I've always found myself drawn to the intricate tapestry of code and the endless possibilities it holds. Yet, my heart beats for a different kind of creation – the creation of worlds, stories, and characters that breathe life into the imagination.

A Teacher's Odyssey

As a teacher, I've always seen education as a gateway to new dimensions. I realized that the power of AI could be harnessed not only for solving complex real-world problems but also for crafting immersive and dynamic RPG experiences. Thus began my journey into the fusion of these two passions.

Mastering the Craft

Diving headfirst into this uncharted territory, I continue to dedicate my time to mastering the new skills emerging in the art of AI, from fine-tuning to the intricacies of merging and "frankenmerging"! Most recently, I learned quantization, which trades some model performance for computational efficiency.

The Love for RPGs

Why AI and RPG, you might wonder? For me, RPGs are not just games – they are art forms that allow us to step into alternate universes, to become storytellers and creators. It's a journey into the unknown, guided by the choices we make and the narratives we weave. Incorporating AI into this mix is not just a technical challenge; it's a pursuit of understanding the very essence of storytelling.

A Real Challenge

RPGs with AI present a real challenge: it's about crafting narratives that respond dynamically to the actions of players, creating a living and breathing world within the digital domain. If we can solve the challenge of merging AI seamlessly into RPG sessions, we're not just enhancing fictional realities – we're on the cusp of solving real-world challenges with unparalleled creativity.

The Potential Beyond

Imagine a world where AI doesn't just power games but becomes a tool for problem-solving, creativity, and innovation. The line between fiction and reality is thinner than we think, and mastering the complexities of RPGs with AI is a key to unlocking unprecedented potential.

So, join me on this odyssey into the unexplored territories of AI-enhanced RPGs. Together, let's unravel the mysteries, confront the challenges, and witness the birth of a new era where storytelling and technology merge seamlessly. The adventure has just begun, and the possibilities are as limitless as the imagination itself.

Are you ready to embark on this journey? The dice are cast, and the story awaits...

The State of AutoRP-Eval.py

 Over the past few weeks, I have dedicated my efforts to enhancing my Role-Playing (RP) evaluation suite, focusing on refining the main scri...