Foyle: You Build It; AI Should Run It

As infrastructure complexity grows with cloud services, Kubernetes clusters, networking setups, and security policies, even seasoned developers grapple with deploying and operating their applications. I created Foyle to address these challenges.

Foyle is an AI that helps developers operate their software by using LLMs to translate intent into action.

Why Foyle?

As developers today, we often know what we need to do but struggle with the how.

Foyle solves this problem by turning your intent into executable commands. You interact with Foyle by writing and running VS Code notebooks using the Runme extension. As you author markdown cells describing “the what,” Foyle suggests code cells containing commands, “the how.” You can then run the code cells to achieve your desired outcome.

[Figure: Foyle ghost cells]
As you use Runme, Foyle constantly learns from your interactions so it can better assist you with similar operations in the future.

Who is Foyle for?

How is Foyle different?

I believe Foyle will help create a collaborative, evolving environment where developers can manage infrastructure more efficiently and effectively.

  1. Integrated Execution Environment: Unlike AIs built around a chat interface, Foyle uses VS Code notebooks and Runme to let you seamlessly approve and execute the suggested operations.
  2. Ghost cells: Foyle uses ghost cells to continuously update suggestions as you type. This combines the best of chat interfaces and ghost text. Ghost cells create a tight feedback loop that lets you quickly arrive at the minimal prompt needed to guide the AI correctly.
  3. Continuous Learning: Foyle continually learns from user interactions. Every time you execute a cell, Foyle logs the execution and improves its ability to achieve similar tasks.
  4. Self-Documenting Operations: By capturing the intent (in markdown) and the actions (in code cells), Foyle creates comprehensive, executable documentation of your infrastructure operations.
  5. Local: Foyle can run locally on your machine or cluster. This means sensitive data about your infrastructure never leaves your control.

Foyle at Work

In this demo, I show how Foyle can help you understand the cost of serving a user request.


I begin my analysis by creating a new markdown document in Runme and writing down the questions I want to answer. In this case:

As I write these questions, Foyle suggests commands I can run to answer them. The suggested commands are rendered as ghost cells: code cells with lightly greyed-out text. In this case, Foyle suggests an SQL query I can use to count the number of input and output tokens and their associated costs.

Since Foyle generates the query, I don’t have to waste time trying to recall SQL syntax or the table's schema. If I’m new to the team, I might not know that this data is available in SQLite, but Foyle does.
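To give a flavor of the kind of token-accounting query Foyle suggests, here is a minimal, self-contained sketch. The table name, columns, and per-token prices below are illustrative assumptions, not Foyle's actual schema; the sample rows stand in for real logs.

```shell
# Illustrative only: llm_usage and its columns are assumed names,
# and the per-token prices are placeholder rates, not real pricing.
DB=$(mktemp)
sqlite3 "$DB" <<'EOF'
CREATE TABLE llm_usage (contextId TEXT, input_tokens INT, output_tokens INT);
-- Two LLM calls made while serving a single user request.
INSERT INTO llm_usage VALUES ('ctx1', 1200, 300), ('ctx1', 800, 150);

-- Total tokens and an estimated cost per user request.
SELECT contextId,
       SUM(input_tokens)  AS input_tokens,
       SUM(output_tokens) AS output_tokens,
       ROUND(SUM(input_tokens) * 3.0 / 1e6 + SUM(output_tokens) * 15.0 / 1e6, 6) AS est_cost_usd
FROM llm_usage
GROUP BY contextId;
EOF
rm -f "$DB"
```

The point is not the specific query but that Foyle fills in the schema and syntax, so I only have to state what I want to know.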

By solving those problems, Foyle lets me focus on analyzing the data. In this case, I can drill down and look at some actual requests and responses. To do this, I must perform several steps to map the user request ID (the “contextId”) to the actual LLM request/response. The sequence of steps and the corresponding commands are shown in the table below.

| Intent | Action |
| --- | --- |
| Fetch the most recent session | `curl -X POST http://localhost:8877/api/foyle.logs.SessionsService/GetSession -H "Content-Type: application/json" -d '{"contextId": "01J8TTD061VRWC41ZCXGXVHXKH"}' \| jq .` |
| Fetch the most recent trace | `TRACEID="b445939345b4ecdb04d6de0623a3592e"`<br>`curl -s -o /tmp/response.json -X POST http://localhost:8877/api/foyle.logs.LogsService/GetLLMLogs -H "Content-Type: application/json" -d "{\"traceId\": \"${TRACEID}\"}"`<br>`CODE="$?"`<br>`if [ $CODE -ne 0 ]; then`<br>`  echo "Error occurred while fetching LLM logs"`<br>`  exit $CODE`<br>`fi` |
| Render the request as HTML | `jq -r '.responseJson' /tmp/response.json > /tmp/oairesponse.json`<br>`foyle llms render response --input=/tmp/oairesponse.json` |

Remembering and authoring these commands would be a frustrating process. Fortunately, Foyle makes this easy. As the video below shows, all I have to do is express my intent, and Foyle generates the commands.


At this point, I better understand how much it costs to serve a request in my application and how I could reduce those costs. Using Foyle, I could focus on analysis rather than hacking on shell commands. Since I’m using Runme, my analysis is automatically saved as a markdown document, which I can easily share with my colleagues.

Foyle successfully assisted me because it has learned about my application. For example, it knows that

Foyle knows this because it has learned from my previous interactions. As a side-effect of using Runme to complete tasks, I train Foyle to be a DevOps expert for my application.

DevOps without Foyle

To see the value Foyle provides, we can compare it to how I would have analyzed the cost to serve without it.

To create the query, I would have to provide all the details, e.g. the schema, as part of the prompt to ChatGPT or Claude, because they have no knowledge of my application.

Since I don’t want to do this every time I want to check the cost of serving, I could create a script or dashboard to run this query. However, if I do that, I lose the flexibility that SQL provides. For example, what if I want to look at token usage as a function of session length (how long a user spends editing a cell)?
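A question like that can stay an ad-hoc query instead of becoming a dashboard change. As a minimal sketch, assuming hypothetical sessions and llm_usage tables (the names and columns are illustrative, not Foyle's actual schema):

```shell
# Illustrative only: these tables, columns, and sample rows are assumptions.
DB=$(mktemp)
sqlite3 "$DB" <<'EOF'
CREATE TABLE sessions  (contextId TEXT, started_at INT, ended_at INT);
CREATE TABLE llm_usage (contextId TEXT, input_tokens INT, output_tokens INT);
INSERT INTO sessions  VALUES ('ctx1', 100, 160), ('ctx2', 200, 500);
INSERT INTO llm_usage VALUES ('ctx1', 10, 5), ('ctx1', 20, 5), ('ctx2', 50, 25);

-- Token usage as a function of how long a user spent editing in the session.
SELECT s.contextId,
       s.ended_at - s.started_at             AS session_seconds,
       SUM(u.input_tokens + u.output_tokens) AS total_tokens
FROM sessions s
JOIN llm_usage u ON u.contextId = s.contextId
GROUP BY s.contextId
ORDER BY session_seconds;
EOF
rm -f "$DB"
```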

By using Foyle, I don’t need to choose between using an easy-to-use but fixed dashboard and writing flexible but hard-to-remember SQL queries. With Foyle, I can offload the difficulty of authoring commands to an AI, thereby gaining flexibility and ease of use. This creates an AI flywheel.

[Figure: the Foyle AI flywheel]

Try it out!

To get started using Foyle to simplify operations, visit https://foyle.io/docs/getting-started/ and follow the instructions to install Foyle and Runme on your machine. If you have questions, please reach out to @jeremylewi or open an issue on GitHub.