This site is a heavy work in progress, so contributions on GitHub are much appreciated! You've most likely been pointed to this site to look up a concept, or something. Either way, take at least some of the info on this page with a grain of salt, and don't expect much, since the content is very incomplete.

Gary

Gary (aka Gaming Gary™️) is a Neuro simulator written in Python. Gary allows you to use models downloaded onto your computer for testing Neuro Game API integrations.

Gary is maintained by Govorunb and can be found at https://github.com/Govorunb/gary.

  1. Clone Gary into a folder:

     ```sh
     git clone https://github.com/Govorunb/gary
     ```

  2. `cd` into it and sync the uv lockfile:

     ```sh
     cd gary
     uv sync
     ```

  3. Run the uv command:

     ```sh
     uv run gary
     ```

     It won't fully start, since some configuration is still required; this step just verifies that everything works and no unexpected errors pop up.

Gary does not come with any models by default. As such, you’ll need to download a model yourself to use.

Once you have your model downloaded, place it in the repository folder and rename it to start with an _ (so it gets ignored by Git). You should also copy config.yaml to a new config file (whose name also starts with _), then customise it to point to your model and set other params. Finally, start Gary with this command:

```sh
uv run gary --config <YOUR_CONFIG>.yaml # optional: --preset <PRESET_NAME>
```

or configure using a .env file:

```ini
GARY_CONFIG_FILE=_your_config.yaml
GARY_CONFIG_PRESET=randy
```

And it should now launch successfully.

Gary’s web panel should be accessible at http://localhost:8001 (or, if you changed the port in the config file, that port plus 1). You should see Gary’s configuration panel.

With this web panel, you can see the packets going between the game and Gary, as well as the list of registered actions and a config panel on the right-hand side.
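These packets follow the Neuro Game API's JSON envelope. As a rough sketch (shapes per the public SDK spec — double-check against the spec itself; the game name and action here are made up):

```python
import json

# Game -> Gary: sent once on connect, before anything else.
startup = {"command": "startup", "game": "Drink Mixer"}  # hypothetical game name

# Game -> Gary: register an action the model may use.
register = {
    "command": "actions/register",
    "game": "Drink Mixer",
    "data": {
        "actions": [
            {
                "name": "pick_up",
                "description": "Pick up an item from the bar.",
                "schema": {
                    "type": "object",
                    "properties": {"item": {"enum": ["vodka", "gin"]}},
                    "required": ["item"],
                },
            }
        ]
    },
}

# Everything on the wire is one JSON object per websocket message.
wire = json.dumps(register)
```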

Toggling “Tony Mode” stops using the model and instead behaves similarly to Tony, allowing you to send actions on behalf of your model.
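An action sent via Tony Mode reaches the game in the same shape as a real model decision, and the integration replies with a result. A hedged sketch (again per the public Neuro API spec; the IDs and names here are made up):

```python
import json

# Gary -> game: the action triggered on the model's behalf.
# Note: the nested "data" field is a JSON-encoded *string*, not an object.
outgoing = {
    "command": "action",
    "data": {"id": "a1b2", "name": "pick_up", "data": json.dumps({"item": "gin"})},
}

# Game -> Gary: the result the integration must reply with,
# echoing the same action id.
result = {
    "command": "action/result",
    "game": "Drink Mixer",  # hypothetical game name
    "data": {"id": "a1b2", "success": True, "message": "Picked up the gin."},
}
```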

Taken (more or less) verbatim from the repository README.

Trimming context (for continuous generation) only works with the llama_cpp engine. Other engines will instead fully truncate context, and may rarely fail due to overrunning the context window.
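One way to picture the difference (a toy sketch, not Gary's actual code): trimming discards only the oldest tokens to stay under the window, while a full truncation drops the whole history once it overflows.

```python
def trim(tokens, window):
    """Toy sketch: llama_cpp-style trimming keeps only the newest tokens."""
    return tokens[-window:]

def hard_truncate(tokens, window):
    """Toy sketch: other engines drop the whole history once it overflows."""
    return [] if len(tokens) > window else list(tokens)

history = list(range(10))  # ten tokens, window of four
```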

There’s a quirk with the way guidance enforces grammar that can sometimes negatively affect chosen actions.

Basically, if the model wants something invalid, it will pick a similar or seemingly arbitrary valid option. For example:

  • The game is about serving drinks at a bar, with valid items to pick up/serve being "vodka", "gin", etc
  • The model gets a bit too immersed and hallucinates about pouring drinks into a glass (which is not an action)
  • When asked what to do next, the model wants to call e.g. pick up on "glass of wine"
  • Since this is not a valid option, guidance picks "gin" instead, for the reason explained below

For nerds: guidance uses the model to generate token probabilities only while multiple valid options remain, and fast-forwards the rest as soon as the choice is fully disambiguated.

In this case, "g" has the highest likelihood of all valid tokens, so it gets picked; then "in" is auto-completed, because "gin" is the only remaining option (of all valid items) that starts with "g".


In a case like this, it would have been better to just let it fail and retry - oh well, at least it’s fast.
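The mechanism above can be mimicked with a character-level toy (purely illustrative — real guidance works on tokens and model probabilities, not characters and alphabetical tie-breaks):

```python
def constrained_decode(model_wants, options):
    """Character-level toy of grammar-constrained decoding.

    The model proposes characters from its (hallucinated) answer; the
    grammar only allows characters that keep some valid option reachable.
    When the model's proposal is masked out, another allowed character
    wins; once a single option remains, it is auto-completed.
    """
    out = ""
    candidates = list(options)
    i = 0
    while candidates and out not in candidates:
        allowed = {c[len(out)] for c in candidates}
        want = model_wants[i] if i < len(model_wants) else None
        # stand-in for "highest-probability allowed token"
        ch = want if want in allowed else sorted(allowed)[0]
        out += ch
        candidates = [c for c in candidates if c.startswith(out)]
        i += 1
    return out

# The model hallucinated "glass of wine"; only "vodka" and "gin" are valid.
# "g" matches, then the grammar forces "in" -> "gin".
print(constrained_decode("glass of wine", ["vodka", "gin"]))  # gin
```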

Not all JSON schema keywords are supported in Guidance. You can find an up-to-date list here.

Unsupported keywords will produce a warning and be excluded from the grammar.

Following the Neuro API spec is generally safe. If you find an action schema is getting complex or full of obscure keywords, consider logically restructuring it or breaking it up into multiple actions.
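As a hypothetical illustration of such a split (the action names and schemas here are made up): instead of one action whose schema leans on combinator keywords, register one flat action per case.

```python
# One action with a keyword-heavy schema (harder on the grammar engine):
use_item = {
    "name": "use_item",
    "schema": {
        "oneOf": [
            {"properties": {"kind": {"const": "drink"},
                            "drink": {"enum": ["vodka", "gin"]}}},
            {"properties": {"kind": {"const": "food"},
                            "food": {"enum": ["peanuts"]}}},
        ]
    },
}

# The same surface area as two flat actions (simpler to enforce):
serve_drink = {
    "name": "serve_drink",
    "schema": {
        "type": "object",
        "properties": {"drink": {"enum": ["vodka", "gin"]}},
        "required": ["drink"],
    },
}
serve_food = {
    "name": "serve_food",
    "schema": {
        "type": "object",
        "properties": {"food": {"enum": ["peanuts"]}},
        "required": ["food"],
    },
}
```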

  • The web interface can be a bit flaky - keep an eye out for any exceptions in the terminal window and, when in doubt, refresh the page

There may be cases where other backends (including Neuro) may behave differently.

Differences marked with 🚧 will be resolved or obsoleted by v2 of the API.

  • Gary will always be different from Neuro in some aspects, specifically:
    • Processing other sources of information like vision/audio/chat (for obvious reasons)
    • Gary is not real and will never message you on Discord at 3 AM to tell you he’s lonely 😔
    • Myriad other things like response timings, text filters, allowed JSON schema keywords, long-term memories, etc
  • 🚧 Registering an action with an existing name will replace the old one (by default, configurable through gary.existing_action_policy)
  • Only one active websocket connection is allowed per game; when another tries to connect, either the old or the new connection will be closed (configurable through gary.existing_connection_policy)
  • 🚧 Gary sends actions/reregister_all on every connect (instead of just reconnects, as in the spec)
  • etc etc, just search for “IMPL” in the code

Remote services? (OpenAI, Anthropic, Google, Azure)


Only local models are supported. Guidance does allow using remote services, but it cannot enforce grammar/structured outputs if it can’t hook itself into the inference process, so it’s more than likely it’ll just throw exceptions because of invalid output instead.

*Log excerpt showing remote generation failing after exceeding the limit of 10 attempts.*

Therefore, they are not exposed as an option at all. You should use Jippity instead anyway.

For more info, check the guidance README or this issue comment.