A jam submission

Model Hubris: On the Presumptuousness of Large Language Models

Probing the boundary between logical inference, common sense, and nonsense reasoning
Submitted by Giles — 15 minutes, 18 seconds before the deadline

Results

Criteria         Rank  Score*  Raw Score
Reproducibility  #4    2.887   5.000
Generality       #6    2.309   4.000
Novelty          #7    2.309   4.000
Benchmark        #7    2.309   4.000
Safety           #8    2.887   5.000

Ranked from 3 ratings. Score is adjusted from raw score by the median number of ratings per game in the jam.

Judge feedback

Judge feedback is anonymous.

  • This is a very interesting project that investigates some cool failure cases. A natural way to build on it would be to turn the tasks into benchmarks that we can use to test models.

Where did you participate?
Online

What are the full names of the participants?
Giles Edkins, Anna Swanson

What is your team name?
The Presumers
