JVM Advent

Getting Started with Spring AI

A Hands-On Guide to Text Summarization

Over the past year, the Java ecosystem has made significant strides in making Generative AI development enterprise-ready.

For Spring developers, Spring AI has emerged as the go-to toolkit for seamlessly integrating enterprise data and APIs with AI models.

Are you curious about developing enterprise-grade AI applications with Spring AI? Then read on.

Setting up Spring AI

Loading documents and evaluating them with Generative AI is a fundamental use case that you will encounter when working with Large Language Models (LLMs) in the industry.

To get started with Spring AI, we will therefore use the practical example of summarizing Wikipedia articles with LLMs. We are going to create a springai-wikipedia-demo project, which you can find on GitHub.

The project has been built with Spring Boot v3.5.8 running on Java 21. As key ingredients, we use the Anthropic API integration, which provides access to the Claude models, and the Apache Tika document reader to extract text from PDF documents.

The key dependencies are:

implementation("org.springframework.boot:spring-boot-starter-webflux")
implementation("org.springframework.ai:spring-ai-starter-model-anthropic")
implementation("org.springframework.ai:spring-ai-tika-document-reader")

Note that we have added a dependency on spring-webflux. For Spring AI to work, we need an HTTP client such as Reactor Netty on the classpath to carry out synchronous and reactive HTTP requests (through Spring's RestClient and WebClient). So even if you only run Spring AI from the CLI, you still need to include a dependency on spring-webflux.

As mentioned, we use Claude from Anthropic as the LLM for demo purposes, but you can of course use any model of your choice.

Note that you need to provide an Anthropic API key from the Claude Console as an environment variable to authenticate your requests to the model. See the application.properties file:

spring.ai.anthropic.api-key=${ANTHROPIC_API_KEY:missing}

The Document reader infrastructure

One of the key abstractions for processing media content in Spring AI is org.springframework.ai.document.Document. A Document carries the plain content of a document together with metadata about it. The content can be textual or, optionally, audio or video. Its most important methods are:

String getText();
Media getMedia();
Map<String,Object> getMetadata();
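To make the abstraction concrete, here is a minimal sketch (assuming Spring AI 1.x on the classpath; class and method names may differ in other versions) of constructing a Document by hand and reading its text and metadata:

```java
import java.util.Map;

import org.springframework.ai.document.Document;

// Minimal sketch (assumes Spring AI 1.x): build a Document by hand
// with textual content and arbitrary metadata, then read it back.
public class DocumentSketch {
  public static void main(String[] args) {
    Document doc = new Document(
        "Néant-sur-Yvel is a commune in Brittany, France.",
        Map.of("source", "Neant-sur-Yvel.pdf", "language", "en"));

    System.out.println(doc.getText());      // the plain textual content
    System.out.println(doc.getMetadata());  // the metadata map we passed in
  }
}
```

In our demo we never construct Documents ourselves; the Tika reader does it for us, filling the metadata from the parsed file.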

The Document abstraction is closely tied to Extract, Transform, Load (ETL) processes for Retrieval-Augmented Generation (RAG). ETL is a three-step data integration process that collects data from various sources and cleans and reshapes it into a usable format, in this case a Document.

In simple terms, RAG is needed to feed your own private data to the LLM so that it is taken into account when answering prompts. In the enterprise, this is of very high value: for privacy reasons, proprietary and open-source LLMs are not trained on your specific company data.

In our example, we are going to load five random Wikipedia articles that have been saved as PDFs. The articles have been placed in the resources folder:

├── application.properties
└── articles
    ├── 2023_Asia_Contents_Awards_&_Global_OTT_Awards.pdf
    ├── Anthony_Wonke.pdf
    ├── Chang_Tzi-chin.pdf
    ├── Indera_SC.pdf
    └── Neant-sur-Yvel.pdf

In the real world, you can imagine this content being Confluence docs, JIRA tickets, internal reports, literature and other kinds of publications that you want to provide to your LLM. You might want to let coworkers ask questions about your internal documentation. Or you might want to provide customers with answers about your products, taking your own knowledge base into account.

To use the Tika document reader, we introduce a simple DocumentReader component (I am going to skip the import declarations in my examples):

package com.slissner.springai.infrastructure.document;

@Component
public class DocumentReader {
  public List<Document> loadText(final Resource resource) {
    final TikaDocumentReader tikaDocumentReader = new TikaDocumentReader(resource);
    return tikaDocumentReader.read();
  }
}

As you can see, Spring AI has nice abstractions. You just throw in some Resource reference, be it TXT, HTML, PDF, XLSX and so on, and the Tika reader will answer with the appropriate Document.

Next, we define a repository that reads the articles:

@Repository
public class ArticleRepository {

  private final DocumentReader documentReader;

  public ArticleRepository(final DocumentReader documentReader) {
    this.documentReader = documentReader;
  }

  private static final List<String> DOCUMENT_PATHS =
      Stream.of(
              "2023_Asia_Contents_Awards_&_Global_OTT_Awards.pdf",
              "Anthony_Wonke.pdf",
              "Chang_Tzi-chin.pdf",
              "Indera_SC.pdf",
              "Neant-sur-Yvel.pdf")
          .map(path -> "/articles/" + path)
          .toList();

  public List<Document> getAll() {
    return DOCUMENT_PATHS.stream()
        .map(ClassPathResource::new)
        .map(documentReader::loadText)
        .flatMap(Collection::stream)
        .toList();
  }
}

We simply load all articles with the List<Document> getAll() method, by first loading each classpath Resource and then sending it to the Tika reader.

Processing Documents with the ChatClient

Now that we can load a List<Document> from the PDFs, we can carry out a first prompt to let the LLM summarize the content of the first article in the list, 2023_Asia_Contents_Awards_&_Global_OTT_Awards.pdf.

For this purpose, we have introduced the ArticleService application service. Note that we have not yet introduced a separate infrastructure class for the ChatClient; our use case is so simple that we did not want to introduce one.

package com.slissner.springai.application;

@Service
public class ArticleService {

  private final ChatClient chatClient;
  private final ArticleRepository articleRepository;

  public ArticleService(
      final ArticleRepository articleRepository, final ChatClient.Builder chatClientBuilder) {
    this.articleRepository = articleRepository;
    this.chatClient = chatClientBuilder.build();
  }

  public String summarizeArticles() {
    final List<Document> articles = articleRepository.getAll();

    // Use the first article as an example
    final Document articleContent = articles.getFirst();

    return chatClient
        .prompt()
        .user("Provide a summary of the following article:\n\n" + articleContent)
        .call()
        .content();
  }
}

We can now run the String summarizeArticles() application service method on the command line, with the following CommandLineRunner:

  @Bean
  public CommandLineRunner commandLineRunner(final ArticleService articleService) {
    return args -> {
      log.info("Okay, I am going to summarize articles...");

      final String summary = articleService.summarizeArticles();

      log.info("The summary is:");
      log.info(summary);
      log.info("Done!");
    };
  }

Great, that worked! Here is the answer from the LLM:

2025-11-28T16:26:02.457+01:00  INFO 4936 --- [springAI] [           main] c.slissner.springai.SpringAiApplication  : The summary is:
2025-11-28T16:26:02.457+01:00  INFO 4936 --- [springAI] [           main] c.slissner.springai.SpringAiApplication  : # 2023 Asia Contents Awards & Global OTT Awards Summary

The 2023 Asia Contents Awards & Global OTT Awards was held on October 8, 2023, at the BIFF Theater in Busan Cinema Center, South Korea. This event represents a rebranding and expansion of the previous Asia Contents Awards, now including global OTT (Over-The-Top) content and services.

[...]

The ChatClient offers a fluent API for communicating with the AI models.

chatClient
        .prompt()
        .user("Provide a summary of the following article:\n\n" + articleContent)
        .call()
        .content();

You declare that you want to send a .prompt() to the model, together with its .user() input.

Note that we are passing the prompt here as a String. You can also pass a Resource text handle. However, if you need full control over the user prompt, you can pass a well-defined Prompt to the ChatClientRequestSpec prompt(Prompt prompt) method. A Prompt gives you full control over the ChatOptions, such as the concrete model, the maximum number of tokens, or the temperature of the model.
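As a sketch of that last variant (builder method names as of Spring AI 1.x; articleText and chatClient stand in for the values from our example), a Prompt with explicit Anthropic chat options could look like this:

```java
import org.springframework.ai.anthropic.AnthropicChatOptions;
import org.springframework.ai.chat.prompt.Prompt;

// Sketch: pin down the model, max tokens and temperature explicitly
// instead of relying on the configured defaults.
AnthropicChatOptions options = AnthropicChatOptions.builder()
    .model("claude-3-5-sonnet-latest")  // example model id
    .maxTokens(1024)                    // cap the answer length
    .temperature(0.2)                   // low temperature for factual summaries
    .build();

Prompt prompt = new Prompt(
    "Provide a summary of the following article:\n\n" + articleText, options);

String answer = chatClient.prompt(prompt).call().content();
```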

Ultimately, the chat model can be called synchronously with the .call() method. There is also a .stream() method that offers a reactive Flux<String> once you call its .content() method.
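For illustration, a streaming variant of our summarization call might look like this (articleText stands in for the article content; the blocking call at the end is for demo purposes only):

```java
import reactor.core.publisher.Flux;

// Sketch: stream the answer chunk by chunk instead of waiting for the
// full response. .content() on the stream spec returns a Flux<String>.
Flux<String> chunks = chatClient
    .prompt()
    .user("Provide a summary of the following article:\n\n" + articleText)
    .stream()
    .content();

chunks
    .doOnNext(System.out::print)  // print each chunk as it arrives
    .blockLast();                 // demo only; don't block in production
```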

Memorizing Chat History with Advisors

So far, we have only sent a single prompt. What if we want to send a sequence of prompts, memorize the model answers and then run a final prompt on the memory?

For this purpose, Spring AI introduced the Advisors API. Think of the Advisors API as a plugin system for your AI calls.

Each advisor can intercept a request, add context, or modify the prompt before it reaches the model. Advisors are small middleware that automatically add missing context, such as previous messages or app-specific data, so the AI model always has what it needs.

The most common use cases are adding your own data to the conversation (see RAG), or adding conversational history to the otherwise stateless chat model API.

Note that the Spring AI team warns that the order in which advisors are added to the advisor chain is crucial, as with other middleware. An advisor that has been added to the chain before another advisor is executed first.

Interestingly, if you have a look at the DefaultChatClient implementation, you can see that calling or streaming the chat model is realized by just two advisors that have been added at the end of the chain:

private BaseAdvisorChain buildAdvisorChain() {
    // At the stack bottom add the model call advisors.
    // They play the role of the last advisors in the advisor chain.
    this.advisors.add(ChatModelCallAdvisor.builder().chatModel(this.chatModel).build());
    this.advisors.add(ChatModelStreamAdvisor.builder().chatModel(this.chatModel).build());

    return DefaultAroundAdvisorChain.builder(this.observationRegistry)
        .observationConvention(this.advisorObservationConvention)
        .pushAll(this.advisors)
        .build();
}

Now, let’s enhance our current ArticleService implementation with the standard MessageChatMemoryAdvisor. First we need to inject a ChatMemory into our ArticleService:

@Service
public class ArticleService {

  private static final Logger log = LoggerFactory.getLogger(ArticleService.class);

    
  private final ArticleRepository articleRepository;
  private final ChatClient chatClient;
  private final ChatMemory chatMemory;

  public ArticleService(
      final ArticleRepository articleRepository,
      final ChatClient.Builder chatClientBuilder,
      final ChatMemory chatMemory) {
    this.articleRepository = articleRepository;
    this.chatClient = chatClientBuilder.build();
    this.chatMemory = chatMemory;
  }

As we have not declared any other bean, Spring AI will bind it to the default InMemoryChatMemoryRepository. There exist other ChatMemoryRepository implementations. For example, you can store your ChatMemory in a relational database with the help of a JdbcChatMemoryRepository.
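For illustration, wiring a JDBC-backed chat memory could look roughly like this (class names as of Spring AI 1.x; verify the exact starter artifact and bean setup against the Spring AI reference documentation before copying):

```java
// Sketch: persist the chat memory in a relational database instead of
// in memory. The JdbcChatMemoryRepository bean is provided by the
// chat-memory JDBC starter (see the Spring AI docs for the artifact name).
@Configuration
public class ChatMemoryConfig {

  @Bean
  public ChatMemory chatMemory(final JdbcChatMemoryRepository repository) {
    // Keep only the last 20 messages per conversation in the window.
    return MessageWindowChatMemory.builder()
        .chatMemoryRepository(repository)
        .maxMessages(20)
        .build();
  }
}
```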

We pass this ChatMemory instance to our MessageChatMemoryAdvisor middleware:

  public String summarizeArticles() {
    final MessageChatMemoryAdvisor chatMemoryAdvisor =
        MessageChatMemoryAdvisor.builder(chatMemory).build();

    final List<Document> articles = articleRepository.getAll();

    articles.stream()
        .filter(article -> StringUtils.isNotBlank(article.getText()))
        // Max length of 8000 characters to avoid API limits
        .map(abbreviateArticleContent())
        .forEach(
            articleContent -> {
              try {
                log.info("Calling AI API with article. [id={}]", articleContent.id());

                chatClient
                    .prompt()
                    .advisors(chatMemoryAdvisor)
                    .user("Provide a summary of the following article:\n\n" + articleContent.text())
                    .call()
                    // We need to call .content() in order to actually retrieve
                    // the answer and store it in the chat memory.
                    .content();

                log.info(
                    "Successfully summarized article content with AI model. Sleeping now... [id={}]",
                    articleContent.id());

                // Sleep for 60 seconds after each API call to avoid rate limiting
                Thread.sleep(60000);

              } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("Thread was interrupted during sleep", e);
              }
            });

It is important that you call the .content() method and not only the .call() method.

Neither the .call() nor the .stream() method actually triggers the AI model execution. Instead, they only instruct Spring AI whether to use a synchronous or a streaming call. The actual terminal operations are .content(), .chatResponse(), and .responseEntity(). In our example, it is thus only by calling .content() that we actually store the answers via the MessageChatMemoryAdvisor.

Furthermore, note the ugly Thread.sleep() call. The Anthropic API may respond with an HTTP 429 error because of a rate limit of 20,000 input tokens per minute. We circumvent this limit here with the help of the sleep. However, this gives you a first glimpse of the difficulty of scaling AI model usage in a high-data-volume context, such as enterprise applications.
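A more robust alternative to a fixed sleep is capped exponential backoff on 429 responses. Here is a small, self-contained sketch of the delay calculation; the helper is hypothetical and not part of Spring AI:

```java
import java.time.Duration;

// Hypothetical helper (not part of Spring AI): compute capped exponential
// backoff delays instead of a fixed Thread.sleep(60000) after every call.
public final class RetryBackoff {

  // Delay before retry number `attempt` (0-based): base * 2^attempt,
  // capped at `cap`. The shift is bounded to avoid long overflow.
  public static Duration delayFor(int attempt, Duration base, Duration cap) {
    long millis = base.toMillis() << Math.min(attempt, 20);
    return millis >= cap.toMillis() ? cap : Duration.ofMillis(millis);
  }

  public static void main(String[] args) {
    // Print the delay schedule for five retries: 1s, 2s, 4s, 8s, 16s.
    for (int attempt = 0; attempt < 5; attempt++) {
      System.out.println("attempt " + attempt + " -> wait "
          + delayFor(attempt, Duration.ofSeconds(1), Duration.ofSeconds(60)));
    }
  }
}
```

In the service, you would wrap the chatClient call in a retry loop and sleep for delayFor(attempt, ...) whenever the API signals a rate limit, instead of sleeping a full minute after every successful call.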

With a final prompt, we will ask the AI to work with the previous answers. Let's assume we are building an editorial agent for a travel magazine. We want it to evaluate which of the articles it can recommend as a travel destination.

    return chatClient
        .prompt()
        .advisors(chatMemoryAdvisor)
        .user(
            "Imagine you are a journalist in a news outlet. You work for a travel magazine that is based in Europe, "
                + "and thus its readers are European travelers. A colleague of yours has summarized five articles for "
                + "you. Now it is your turn to pick a subject out of these five articles and write a short travel "
                + "recommendation. The idea is that you write a single paragraph that praises a destination or activity "
                + "that you want to recommend to your readers."
                + "\n\n"
                + "Given the summaries provided, what is the most interesting topic?")
        .call()
        .content();

The idea here is that the LLM correctly picks the article about Néant-sur-Yvel, a village of a thousand inhabitants in Brittany, France.

Let’s run it through the model…

  @Bean
  public CommandLineRunner commandLineRunner(final ArticleService articleService) {
    return args -> {
      log.info("Okay, I am going to summarize articles and provide travel recommendations...");

      final String recommendation = articleService.summarizeArticles();

      log.info("The travel recommendation is:");
      log.info(recommendation);
      log.info("Done!");
    };
  }

And here comes its answer:

Tucked away in the enchanting landscape of Brittany, the commune of Néant-sur-Yvel offers discerning travelers a perfect escape from the well-trodden tourist paths of France. This picturesque village, nestled along the banks of the Yvel river, embodies authentic rural French charm that has remained largely undiscovered by mass tourism. With its medieval architecture, verdant countryside perfect for cycling and hiking, and proximity to the legendary Brocéliande Forest—steeped in Arthurian legends—Néant-sur-Yvel provides a genuine glimpse into traditional Breton life. The village makes an ideal base for exploring the wider Morbihan department, with its megalithic monuments and stunning coastline just a scenic drive away. Visit in late spring when the countryside bursts with wildflowers, and don't miss sampling local Breton specialties like galettes and cider in the village's unassuming but delightful eateries. For travelers seeking to experience the France that exists beyond the postcard views of Paris, Néant-sur-Yvel delivers an authentic slice of Brittany that will leave you enchanted.

With such a beautiful answer, who wouldn’t love to travel to Néant-sur-Yvel now? 🙂

Final word

The example has shown that Spring AI is a strong choice for enterprise AI applications. It offers a modular, modern, and well-integrated feature set. However, scaling AI requires careful planning: token limits, API rate limits, and operational costs must all be considered. This is where the true challenge lies.

Author: Samuel Lissner

Samuel is a senior Java software engineer who lives with his family near Barcelona, Spain. As a freelance consultant, he mainly works with clients from Spain and Germany on Spring Boot, performance, and architecture.
