Scope
Integrating ElasticSearch into your Spring Boot application enhances its search capabilities, enabling efficient storage, retrieval, and analysis of large datasets.
ElasticSearch is a distributed, RESTful search and analytics engine designed for scalability and real-time performance. By incorporating ElasticSearch, you can implement advanced search functionalities, full-text search, and complex data analytics within your application.
Our approach is to write the data we want logged in ElasticSearch to a buffer table whenever required, and to have a scheduler run every few minutes (2 minutes in our example) that communicates with and writes to ElasticSearch. This way we keep our application unaffected by any ElasticSearch miscommunication.
Integration with ElasticSearch
Maven Dependencies
This scenario has been implemented and tested on SpringBoot 2.7.18 with Spring Data ElasticSearch 4.4.0, targeting ElasticSearch 7.17.3.
Check version compatibility for your own implementation.
To integrate Elasticsearch with your Spring Boot application, include the following dependencies in your pom.xml:
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-elasticsearch</artifactId>
    <version>4.4.0</version> <!-- Ensure compatibility with your Spring Boot version -->
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.17.3</version> <!-- Match this with your Elasticsearch server version -->
</dependency>
These dependencies provide the necessary tools to interact with Elasticsearch, including templates and repository support.
Also, since we are going to use a scheduler, we need the following dependencies in order to use ShedLock and ShedLock JDBC:
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-spring</artifactId>
    <version>4.44.0</version>
</dependency>
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-provider-jdbc-template</artifactId>
    <version>4.44.0</version>
</dependency>
Note that ShedLock works only in environments with a shared database, by declaring a proper LockProvider. It uses the shedlock table we create in the database to store information about the current locks. Currently, ShedLock supports Mongo, Redis, Hazelcast, ZooKeeper, and anything with a JDBC driver.
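A minimal sketch of that table, following the schema documented by ShedLock for JDBC (exact column types vary slightly between databases, so check the ShedLock documentation for your RDBMS):
CREATE TABLE shedlock (
    name       VARCHAR(64)  NOT NULL,
    lock_until TIMESTAMP(3) NOT NULL,
    locked_at  TIMESTAMP(3) NOT NULL,
    locked_by  VARCHAR(255) NOT NULL,
    PRIMARY KEY (name)
);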
Elasticsearch Configuration
Configure the RestHighLevelClient bean to establish communication between your Spring Boot application and the Elasticsearch server:
@Configuration
@Slf4j
@EnableElasticsearchRepositories(basePackages = "com.nodal.project.elasticsearch.repos")
@ComponentScan(basePackages = { "com.nodal.project.elasticsearch" })
public class ElasticsearchConfig {

    private static final String ES_INDEX_NAME = "elasticsearch.logindex";
    private static final String ES_SERVER_URL = "elasticsearch.url";
    private static final String ES_SERVER_PORT = "elasticsearch.port";
    private static final String ES_SERVER_HTTP = "elasticsearch.http";
    private static final String ES_SERVER_USERNAME = "elasticsearch.username";
    private static final String ES_SERVER_PASSWORD = "elasticsearch.password";

    @Autowired
    private Environment environment;

    @Bean
    public RestHighLevelClient client() {
        try {
            final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
            credentialsProvider.setCredentials(AuthScope.ANY,
                    new UsernamePasswordCredentials(environment.getProperty(ES_SERVER_USERNAME),
                            environment.getProperty(ES_SERVER_PASSWORD)));
            // Trust-all SSL context: convenient for internal environments, but it
            // disables certificate validation, so prefer a real trust store in production.
            SSLContextBuilder sslBuilder = SSLContexts.custom()
                    .loadTrustMaterial(null, (x509Certificates, s) -> true);
            final SSLContext sslContext = sslBuilder.build();
            RestHighLevelClient client = new RestHighLevelClient(RestClient
                    .builder(new HttpHost(environment.getProperty(ES_SERVER_URL),
                            Integer.parseInt(environment.getProperty(ES_SERVER_PORT)),
                            environment.getProperty(ES_SERVER_HTTP)))
                    .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                            .setSSLContext(sslContext)
                            .setSSLHostnameVerifier(NoopHostnameVerifier.INSTANCE)
                            .setDefaultCredentialsProvider(credentialsProvider))
                    .setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                            .setConnectTimeout(5000)
                            .setSocketTimeout(120000)));
            // Wrap the low-level client with API compatibility mode so the 7.17
            // client can also talk to newer Elasticsearch servers.
            return new RestHighLevelClientBuilder(client.getLowLevelClient())
                    .setApiCompatibilityMode(true)
                    .build();
        } catch (Exception e) {
            log.error(e.getMessage());
            throw new RuntimeException("Could not create an elasticsearch client");
        }
    }

    @Bean
    public ElasticsearchOperations elasticsearchTemplate() {
        return new ElasticsearchRestTemplate(client());
    }

    @Bean
    public String indexName() {
        return environment.getProperty(ES_INDEX_NAME);
    }
}
Ensure that your application.properties file includes the necessary configurations:
elasticsearch.logindex=your_index
elasticsearch.url=your_url
elasticsearch.port=your_port
elasticsearch.http=http_or_https
elasticsearch.username=your_username
elasticsearch.password=your_password
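For a local development setup, these could look as follows (all values are illustrative assumptions; 9200 is the default Elasticsearch port):
elasticsearch.logindex=project-log
elasticsearch.url=localhost
elasticsearch.port=9200
elasticsearch.http=https
elasticsearch.username=elastic
elasticsearch.password=changeme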
In the configuration class above we must explicitly state where in our project the Elasticsearch model we are going to use and its related repositories are defined (using @EnableElasticsearchRepositories and @ComponentScan).
Defining Elasticsearch Model and Repositories
As stated, our approach is to write the data we want logged in ElasticSearch to a buffer table whenever required, and to have a scheduler run every few minutes (2 minutes in our example) that communicates with and writes to ElasticSearch. Thus, we need one object for our projectLog, as we want to handle and store it, and one object for our buffer table.
Log Buffer Object
@Entity
@Table(name = "project_log_buffer")
@Setter
@Getter
@NoArgsConstructor
@EqualsAndHashCode(onlyExplicitlyIncluded = true)
public class ProjectLogBuffer implements Serializable {

    private static final long serialVersionUID = 7234095857294904908L;

    public static final String INDEX_LOGINDEX = "elasticsearch.logindex";

    @Id
    @SequenceGenerator(name = "sequence-generator", sequenceName = "id_seq", allocationSize = 1)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "sequence-generator")
    @Column(name = "id")
    @EqualsAndHashCode.Include
    private Long id;

    @Column(name = "project_log_body")
    private String projectLogBody;

    @Column(name = "exception")
    private String exceptionMessage;

    @Column(name = "exception_time")
    private Instant exceptionTime;

    @Column(name = "elastic_index")
    private String elasticIndex;

    public ProjectLogBuffer(String projectLogBody, String elasticIndex, String exceptionMessage) {
        this.projectLogBody = projectLogBody;
        this.exceptionMessage = exceptionMessage;
        this.elasticIndex = elasticIndex;
    }
}
Note that in our buffer table we only need to keep the whole JSON as a string. We don’t need to break it down into specific attributes, since ElasticSearch handles JSON and we will send our data in this format.
Log Object Mapping
In contrast to our buffer object, where we only need a string attribute for the whole JSON, in our actual log object we need to break down our attributes so that we can handle the data in our SpringBoot app.
@Document(indexName = "#{@indexName}", createIndex = false)
@Getter
@Setter
@NoArgsConstructor
@EqualsAndHashCode
public class ProjectLog implements Serializable {

    @Id
    private String id;
    private Long projectBufferId;
    private Long actorId;
    @Field(type = FieldType.Text)
    private String actorName;
    private String actorUsername;
    private String actionType;
    private String action;
    private String actionTimestamp;
    private Long taskId;
    private String taskType;
    private String taskStatus;
    private Long taskAssigneeId;
    private String taskAssigneeName;
    private String ecId;
}
We use ‘createIndex = false’ so that the elasticsearch index is not created on startup of our SpringBoot application, but only when elasticsearch is actually called. Also notice the @Field annotation on the actorName attribute: to run full-text searches on a string attribute, you must define it accordingly.
Repositories
Now that we have our objects, we also need to define our repositories:
public interface ProjectLogBufferRepository extends JpaRepository<ProjectLogBuffer, Long> {
}

public interface ProjectLogRepository extends ElasticsearchRepository<ProjectLog, String> {

    Page<ProjectLog> findAllByActorIdAndActionIn(Long actorId, List<String> actionCodes, Pageable pageable);

    Optional<ProjectLog> findByProjectBufferId(Long projectBufferId);
}
Note that we haven’t defined any custom methods for ProjectLogBuffer, since we will only save and delete.
For ProjectLog we created two custom methods. As you can see, the Spring Data ElasticSearch dependency allows us to manipulate application data coming from ElasticSearch in the same manner as if it were stored in our DB. ProjectLogRepository extends ElasticsearchRepository, which is responsible for the communication (through our configured client) with the ElasticSearch engine.
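For completeness, here is a minimal sketch of how a record could enter the buffer table in the first place. The bufferLog helper and its enclosing service are hypothetical; Gson is used for serialization, just as in the scheduler service shown later:
import com.google.gson.Gson;
import lombok.RequiredArgsConstructor;
import org.springframework.core.env.Environment;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class ProjectLogBufferingService {

    private final ProjectLogBufferRepository projectLogBufferRepository;
    private final Environment environment;

    // Serialize the log to JSON and park it in the buffer table; the scheduled
    // job shown later will pick it up and forward it to Elasticsearch.
    public void bufferLog(ProjectLog projectLog) {
        String body = new Gson().toJson(projectLog);
        String index = environment.getProperty(ProjectLogBuffer.INDEX_LOGINDEX);
        projectLogBufferRepository.save(new ProjectLogBuffer(body, index, null));
    }
}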
Scheduling Data Transfer to Elasticsearch
SpringBoot Configuration
We need to create the following configuration (or add it to an existing one) to enable scheduling in our application:
@Configuration
@EnableScheduling
@EnableAsync
public class SchedulerConfig implements SchedulingConfigurer {

    @Override
    public void configureTasks(ScheduledTaskRegistrar scheduledTaskRegistrar) {
    }
}
The ElasticSearch client we configured earlier already covers communication with the engine; what remains is the LockProvider that ShedLock requires, sketched below.
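A minimal sketch of that wiring, assuming the JDBC provider from the dependencies above; @EnableSchedulerLock and JdbcTemplateLockProvider come from the ShedLock library, and the DataSource is your application’s own:
import javax.sql.DataSource;
import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.jdbctemplate.JdbcTemplateLockProvider;
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
@EnableSchedulerLock(defaultLockAtMostFor = "PT2M")
public class ShedLockConfig {

    // Lock state lives in the shedlock table created earlier, so all application
    // instances sharing the database coordinate through it.
    @Bean
    public LockProvider lockProvider(DataSource dataSource) {
        return new JdbcTemplateLockProvider(new JdbcTemplate(dataSource));
    }
}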
Services
All that is left now is to use the above. We must of course create an autonomous service, triggered by the scheduler, that sends any logs we want to the ElasticSearch engine.
To do so, we must have a scheduled method:
@Scheduled(cron = "${scheduler.elasticsearchLog.cron}")
@SchedulerLock(name = "LogToElasticsearch", lockAtLeastFor = "PT30S", lockAtMostFor = "PT2M")
public void logToElasticsearch() {
    log.debug("Scheduler to send logs to elasticsearch.");
    this.elasticService.sendLogsToElastic();
}
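For the two-minute cadence mentioned earlier, the property referenced by @Scheduled could look like this in application.properties (the value is an assumption; Spring expects a six-field cron expression):
scheduler.elasticsearchLog.cron=0 */2 * * * *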
which calls a service that is responsible for invoking the autonomous transaction as required. By implementing it this way, we ensure that each record gets its own transaction, so if one fails the rest are still sent as expected.
@Transactional
public void sendLogsToElastic() {
    List<ProjectLogBuffer> logs = this.projectLogBufferRepository.findAll();
    for (ProjectLogBuffer log : logs) {
        this.elasticAutonomousService.sendProjectLogToElastic(log);
    }
}
and the autonomous service is implemented as follows (notice the REQUIRES_NEW propagation on our transaction; it takes effect because the method lives in a separate Spring bean, so the call goes through the Spring proxy):
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void sendProjectLogToElastic(ProjectLogBuffer log) {
    if (log.getElasticIndex() == null || !log.getElasticIndex().equals(environment.getProperty(ProjectLogBuffer.INDEX_LOGINDEX))) {
        return;
    }
    try {
        Gson g = new Gson();
        // Send the log only if it does not already exist (for some reason)
        if (!this.projectLogRepository.findByProjectBufferId(log.getId()).isPresent()) {
            ProjectLog projectLog = g.fromJson(log.getProjectLogBody(), ProjectLog.class);
            projectLog.setProjectBufferId(log.getId());
            this.projectLogRepository.save(projectLog);
        }
        this.projectLogBufferRepository.delete(log);
    } catch (Exception e) {
        // Keep the record in the buffer with the failure details; it will be
        // retried on the next scheduler run.
        log.setExceptionMessage(e.getMessage());
        log.setExceptionTime(Instant.now());
        this.projectLogBufferRepository.save(log);
    }
}
In the above we also delete from the buffer table any data that was successfully sent to ElasticSearch; this way the buffer table only holds data pending to be sent. Also, if a record fails to be sent, it will be picked up again in a future attempt.
Example
We can use the elasticsearch repository to search for anything we want in the ElasticSearch engine data by simply using the related method. For example:
@Transactional
public Page<ProjectLog> getLogs(List<String> availableActions, Person actor, Pageable pageable) {
    return this.projectLogRepository.findAllByActorIdAndActionIn(actor.getId(), availableActions, pageable);
}
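Beyond derived query methods, the ElasticsearchOperations bean from our configuration also supports criteria-style queries. A minimal sketch, assuming a hypothetical service method with elasticsearchOperations injected and a full-text match on the analyzed actorName field:
import java.util.List;
import java.util.stream.Collectors;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.Criteria;
import org.springframework.data.elasticsearch.core.query.CriteriaQuery;

// Hypothetical service method; elasticsearchOperations is the bean from our configuration.
public List<ProjectLog> searchByActorName(String term) {
    // Full-text match on the analyzed actorName field (@Field(type = FieldType.Text))
    CriteriaQuery query = new CriteriaQuery(new Criteria("actorName").matches(term));
    SearchHits<ProjectLog> hits = this.elasticsearchOperations.search(query, ProjectLog.class);
    return hits.stream().map(SearchHit::getContent).collect(Collectors.toList());
}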
Of course, we can also manipulate the data in the engine (update, delete, etc.), but up to this point we have not met such a requirement in our projects.
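Should such a requirement arise, the repository already inherits the standard CRUD operations from Spring Data; a hypothetical one-liner for deleting a document:
// Hypothetical example: removing a document from the index by its id,
// using the deleteById method inherited from Spring Data's CrudRepository.
public void deleteLog(String logId) {
    this.projectLogRepository.deleteById(logId);
}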
By following these steps, you can effectively integrate Elasticsearch into your Spring Boot application, enhancing its search and analytics capabilities while maintaining system resilience and performance.
Cheers!
Related Sources
- https://www.elastic.co/guide/en/integrations/current/spring_boot.html
- https://www.baeldung.com/spring-data-elasticsearch-tutorial
- https://medium.com/%40abhishekranjandev/step-by-step-guide-to-using-elasticsearch-in-a-spring-boot-application-477ba7773dea
- https://www.pixeltrice.com/spring-boot-elasticsearch-crud-example/
- https://www.youtube.com/watch?v=g6AmXGI0RpQ