JVM Advent

Delegating Java tasks to Supervised AI Dev Pipelines

During the second half of this year, Anysphere, the company behind the Cursor IDE, released two new products that could help you increase the level of automation in your software operations in 2026: Cursor Agent CLI and Cursor Cloud Agents. This article explains the features that both products share and the unique capabilities of each, and finally shares some insights for creating great supervised AI dev pipelines.

What is Cursor Agent CLI in a Pipeline context?

In August 2025, Anysphere released Cursor Agent CLI, a new way to interact with frontier models that is not coupled to a particular IDE. With this local development approach, software engineers gained a new way to enrich the development experience. But what happens if we use this product in a pipeline? In that case, we gain new capabilities.

Let’s review the following pipeline to understand the concept:

name: Run Cursor Agent on Demand

on:
  workflow_dispatch:

jobs:
  agent-on-demand:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          fetch-depth: 0

      - name: Install Cursor CLI
        run: |
          curl https://cursor.com/install -fsS | bash
          echo "$HOME/.cursor/bin" >> $GITHUB_PATH

      - name: Run Cursor Agent
        env:
          CURSOR_API_KEY: ${{ secrets.CURSOR_API_KEY }}
        run: |
          echo "=== User Prompt:===";
          PROMPT="Develop a classic Java class HelloWorld.java program that prints Hello World in the console only"
          echo "$PROMPT";
          echo "=== Cursor Agent Execution:===";
          echo "";
          cursor-agent -p "$PROMPT" --model auto

      - name: Create PR with changes
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PAT_TOKEN: ${{ secrets.PAT_TOKEN }}
          GITHUB_REPOSITORY: ${{ github.repository }}
          GITHUB_ACTOR: ${{ github.actor }}
        run: |
          chmod +x .github/scripts/create-pr.sh
          .github/scripts/create-pr.sh

In a few lines of code, a pipeline is able to execute a task with the help of frontier models and, at the end of the process, submit a PR to be reviewed by the team.
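The last step delegates the PR creation to `.github/scripts/create-pr.sh`, which is not shown here. A minimal, hypothetical sketch of such a script could look like the following; the branch naming, commit message, and `gh pr create` usage are illustrative assumptions, and by default the script only prints the commands so you can inspect them before enabling real execution:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of .github/scripts/create-pr.sh (not shown in the article).
# By default it only prints the git/gh commands; set DRY_RUN=false to run them.
set -eu

BRANCH="cursor-agent/$(date +%Y%m%d-%H%M%S)"

run() {
  if [ "${DRY_RUN:-true}" = "true" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run git checkout -b "$BRANCH"
run git add -A
run git commit -m "feat: changes generated by Cursor Agent"
run git push origin "$BRANCH"
# gh picks up GITHUB_TOKEN / PAT_TOKEN from the environment
run gh pr create --title "Cursor Agent changes" \
  --body "Automated PR created by the on-demand pipeline."
```

Keeping the script in the repository, instead of inlining it in the workflow, makes it reusable across pipelines.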

Once you have a clear idea of how to start working with this product, let’s jump to the second product released: Cursor Cloud Agents.

What are Cursor Cloud Agents?

In October 2025, Cursor Cloud Agents was released. It provides a collection of REST endpoints to operate the service, with resources organized into three categories.

Using this service, you can delegate tasks to frontier models, but all operations run on Cursor cloud infrastructure, not in your pipelines like with Cursor Agent CLI.

As the service provides different REST endpoints, it is important to understand the minimum concepts to orchestrate tasks with them.

Understanding the lifecycle of a Cursor Cloud Agent request

Step 1: Launching a new AI Agent

When a user wants to use this service, they launch an HTTP POST request to provision a new cloud AI agent. The service requires the target Git repository and the user prompt.

Note: In this article we will focus on plain-text prompts, not images.

Once the user sends the request, the service returns an HTTP response with status code 201, indicating that the request was received and will be processed soon. The response also includes an Agent ID, which is quite useful with the other REST resources to track progress, and an agent state, in this case CREATING.

Note: You could track the whole process here in a visual way: https://cursor.com/agents 
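As a rough sketch, the launch request from a shell step could look like the following; the endpoint path and the JSON field names are assumptions for illustration and should be checked against Cursor’s API documentation before using them in a pipeline:

```shell
#!/usr/bin/env sh
# Sketch: launch a new Cloud Agent with an HTTP POST.
# The endpoint path and JSON field names below are illustrative assumptions.
API_URL="https://api.cursor.com/v0/agents"
PAYLOAD='{
  "prompt": { "text": "Develop a classic Java class HelloWorld.java program that prints Hello World in the console only" },
  "source": { "repository": "https://github.com/your-org/your-repo", "ref": "main" }
}'
echo "POST $API_URL"
echo "$PAYLOAD"
# Uncomment to send the request for real (requires CURSOR_API_KEY):
# curl -fsS -X POST "$API_URL" \
#   -H "Authorization: Bearer $CURSOR_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD"
```

A 201 response should carry the Agent ID and the initial CREATING state described above.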

What happens under the hood?

Once the service receives the request, it provisions a container on an EC2 instance running in AWS region us-east-1.

Inside this Linux container, the service performs a git checkout of the Git repository described in the request, and after that, it starts working on the details described in the user prompt.

As you can observe, the request receives a fast response, but the whole process is asynchronous. So how do you track the progress of your user prompt as it works on your repository?

Step 2: What is the status of my AI Agent?

An AI agent moves through a set of states, including CREATING, RUNNING, and FINISHED.

If you remember from the first step, the AI Agent returned the state CREATING, and if everything goes well, the current state should now be RUNNING. But how do you know what the real status is? For that purpose, there exists a GET endpoint to receive the status from an Agent ID.

By calling the status endpoint periodically, the user/process can know when the AI Agent has changed the state to FINISHED.
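Sketched in shell, that polling loop could look like this; `get_status` is stubbed so the example is self-contained, and the commented `curl` call (endpoint path and field name are assumptions) shows what a real pipeline would do instead:

```shell
#!/usr/bin/env sh
# Sketch: poll the status endpoint until the agent reaches FINISHED.
i=0
get_status() {
  # Stub that pretends the agent finishes on the third poll. A real pipeline
  # would instead call something like (path and field are assumptions):
  #   curl -fsS -H "Authorization: Bearer $CURSOR_API_KEY" \
  #     "https://api.cursor.com/v0/agents/$AGENT_ID" | jq -r '.status'
  if [ "$i" -ge 3 ]; then echo "FINISHED"; else echo "RUNNING"; fi
}

STATUS="CREATING"
while [ "$STATUS" != "FINISHED" ]; do
  i=$((i + 1))
  STATUS=$(get_status)
  echo "Poll $i: agent state is $STATUS"
  # In a real pipeline, wait between polls, e.g. sleep 30
done
```

Adding a maximum number of polls, analogous to the pipeline’s timeout-minutes, avoids waiting forever on an agent that never finishes.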

Once the AI agent is in the FINISHED state, if the process has changed anything in the Git repository, it internally executes git commit and git push to a feature branch and creates a PR to be reviewed.

Finally we have our lovely Hello World in the repository:

package info.jab.examples;

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}

Step 3: Review the pull request

Once Cursor Cloud Agent reaches the goal specified in the user prompt, the service creates a PR in the repository to be reviewed by your team, independent of your Git branching strategy, such as Trunk-Based Development, Gitflow, or similar.

When to choose Cursor Agent CLI and when to choose Cursor Cloud Agents?

Exploring new technologies always has a cost. Let’s list a few factors to help in your decision-making:

Until now, we have reviewed how to execute user prompts by comparing the two products; in both cases, you can decouple the location of your prompts from the location of the execution. In the next sections, we will explore aspects of user prompts that will help you be more efficient and maintain them with less effort.

Developing great user prompts

Until now, we have only explained the output of the service: in the previous case, the creation of a Java class that writes to the terminal’s standard output. But how do you increase efficiency in the process? It’s simple: send a request with a user prompt that minimizes ambiguity, so that the defined goals are reached.

An initial Hello World user prompt

You might think that a good user prompt could be:

Develop a classic Java class HelloWorld.java program
that prints "Hello World" in the console only.

And nothing more. But in practice, this seemingly simple idea could be interpreted by frontier models in several ways, independent of which model is used, because frontier models have non-deterministic behavior and may have doubts about details such as the target Maven module, the package name, whether to write tests, or whether to touch the build file.

If you understand the potential problems on the frontier model side, let’s iterate on this user prompt.

Moving away from plain text user prompts

When you use a modern IDE with AI features and the frontier model doesn’t return the expected result, you continue the conversation, and after a few iterations the result is as expected. But when this kind of service runs in your pipelines, where you expect accurate results, you need to define restrictions and other details clearly to achieve your goals. So, little by little, that user prompt will require some structure to operate accurately.

Encoding your User prompts in PML format

PML is the acronym for Prompt Markup Language, an XML Schema designed to help software engineers describe user prompts accurately.

Take a look at the evolution from plain text to PML with the new sections:

Text plain:

Develop a classic Java class HelloWorld.java program
that prints "Hello World" in the console only.

XML with PML Schema:

<?xml version="1.0" encoding="UTF-8"?>
<prompt xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://jabrena.github.io/pml/schemas/0.3.0/pml.xsd">

    <role>
        You are a Senior software engineer with extensive experience in Java software development
    </role>

    <goal>
        Develop a classic Java class HelloWorld.java program
        that prints "Hello World" in the console only
    </goal>

    <constraints>
        <constraint-list>
            <constraint>Develop the class in the Maven module `sandbox`</constraint>
            <constraint>Develop the class in the package info.jab.examples</constraint>
            <constraint>Do not invest time in planning</constraint>
            <constraint>Do not create any test class</constraint>
            <constraint>Do not touch the build file (pom.xml)</constraint>
        </constraint-list>
    </constraints>

    <output-format>
        <output-format-list>
            <output-format-item>Do not explain anything</output-format-item>
        </output-format-list>
    </output-format>

    <safeguards>
        <safeguards-list>
            <safeguards-item>Build the solution with Maven only</safeguards-item>
        </safeguards-list>
    </safeguards>

    <acceptance-criteria>
        <acceptance-criteria-list>
            <acceptance-criteria-item>The solution is compiled successfully with `./mvnw clean compile -pl sandbox`</acceptance-criteria-item>
            <acceptance-criteria-item>The solution only prints "Hello World" in the console</acceptance-criteria-item>
            <acceptance-criteria-item>Commit only the Java sources and push the changes to the branch to create the PR</acceptance-criteria-item>
        </acceptance-criteria-list>
    </acceptance-criteria>
</prompt>

Although we have increased the number of lines, the user prompt now looks robust: we have mitigated the ambiguity, the prompt has a better structure, and it will be easier to maintain in the future with new refinements.

Once you have created the document, it can be validated against the XML Schema and later transformed into another format, such as Markdown.

Here is the result converted into Markdown:

## Role

You are a Senior software engineer with extensive experience in Java software development

## Goal

Develop a classic Java class HelloWorld.java program
that prints "Hello World" in the console only

## Constraints

- Develop the class in the Maven module `sandbox`
- Develop the class in the package info.jab.examples
- Do not invest time in planning
- Do not create any test class
- Do not touch the build file (pom.xml)

## Output Format

- Do not explain anything

## Safeguards

- Build the solution with Maven only

## Acceptance Criteria

The goal will be achieved if the following criteria are met:

- The solution is compiled successfully with `./mvnw clean compile -pl sandbox`
- The solution only prints "Hello World" in the console
- Commit only the Java sources and push the changes to the branch to create the PR


By using XML as the source format for your user prompts, you can take advantage of the composability features that XML includes. Moreover, when creating or updating PML files, you always write files with the same syntax, so your prompts remain homogeneous at scale.
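As a hypothetical illustration of that composability (assuming your tooling resolves XInclude and the PML schema tolerates it), a shared role fragment could be reused across several prompts; the fragment path and element reuse shown here are illustrative assumptions:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<prompt xmlns:xi="http://www.w3.org/2001/XInclude">
    <!-- Reuse a role definition shared by every prompt in the repository -->
    <xi:include href="fragments/senior-java-engineer-role.xml"/>
    <goal>
        Develop a classic Java class HelloWorld.java program
        that prints "Hello World" in the console only
    </goal>
</prompt>
```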

What happens if something goes wrong?

Don’t be naive: even the most complex systems in the world, like nuclear plants, have incidents of different kinds, so why wouldn’t this kind of integration have them too? Let’s explore the different types of issues that your threat model should cover in projects using this kind of technology.

Scenario: Using Cursor Agent CLI from a Pipeline

Imagine the scenario where you delegate a task to Cursor Agent CLI in the execution of your pipeline. What issues could happen?


Issues at Pipeline Level

Issues at Cursor Agent CLI Level

Scenario: Orchestrating Cursor Cloud Agent from a Pipeline

Imagine the scenario where you try to orchestrate an integration with the service Cursor Cloud Agent from a popular Pipeline. What issues could happen?


Issues at Pipeline level

Issues at Cursor Cloud Agent level

In general, it is a good practice to log the Agent ID for potential Cursor support and log the internal frontier model conversation for further analysis in order to improve the user prompt. Do not miss creating a threat model in your projects.

Real world scenarios

If you have doubts about which scenarios this new cloud service could be used for, here are a few that you might find interesting.

https://adventofcode.com/2025

Creativity and your monthly budget mark the limit.

Limitations

This technology is awesome, but you should consider the following factors:

Examples in action

Cursor Agent CLI in action: running a pipeline with PML-based user prompts

Review the following step to understand how to run a pipeline with Cursor Agent CLI using user prompts based on PML.

      - name: Run Cursor Agent
        env:
          CURSOR_API_KEY: ${{ secrets.CURSOR_API_KEY }}
        run: |
          echo "=== User Prompt:===";
          jbang trust add https://github.com/jabrena/
          PROMPT=$(jbang pml-to-md.0.4.0-SNAPSHOT@jabrena convert pml-hello-world-java.xml)
          echo "$PROMPT";
          echo "=== Cursor Agent Execution:===";
          echo "";
          cursor-agent -p "$PROMPT" --model auto

In the previous example, the Cursor agent processes a user prompt in Markdown which was originally created in XML (using a PML schema).

Orchestrating Cursor Cloud Agent from a Pipeline

A picture is worth a thousand words. You can see a service that monitors Cursor Cloud Agent runtime at the following address: https://jabrena.github.io/cursor-cloud-agent-rest-api-status/ 

Cursor Cloud Agent REST API Status

Every hour, the service tests the execution to verify different aspects of the solution. After a month of running the service, I can assert that latencies are stable, and this fact is important when designing AI solutions that don’t require near real-time feedback. Further information about the pipeline here: https://github.com/jabrena/cursor-cloud-agent-rest-api-status/blob/main/.github/workflows/scheduled-ping-agent.yaml

Note: The service has been running for more than 1 month (30 × 24 × 4 samples stored). Under the hood, the pipeline uses Churrera CLI, an Open source Java CLI tool designed to orchestrate Cursor Cloud Agents and measure latencies.

Takeaways

References


Author: Juan Antonio Breña Moral

Software Engineering Manager – Platform Engineer – Java Specialist