Conf42 Cloud Native 2025 - Online

- premiere 5PM GMT

Optimizing Backend API Performance

Abstract

In today’s digital world, users expect fast and seamless interactions, making API performance a critical factor in the success of any application. Slow APIs can severely impact user experience, scalability, and business outcomes. In this talk, we will explore practical, real-world strategies for optimizing backend API performance, drawn from large-scale implementations. Key areas of focus will include handling large datasets through efficient pagination, improving throughput with asynchronous logging, reducing database load via caching, minimizing latency through payload compression, and optimizing database interactions with connection pooling. Attendees will leave with actionable insights and proven techniques to ensure their APIs are not only responsive and reliable but also prepared to scale under growing demands.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. My name is Garima Agarwal, and I have over a decade of experience in backend development and database management. Throughout my career I have worked with companies like Journal Electronics, TCS, Wipro, Nike, and NMI. Presently I am working with Bank of America as an application programmer, focusing on building scalable and high-performance backend systems.

Today I will be talking about optimizing backend API performance. Users demand fast responses from APIs; a slow API can lead to poor user experience, reduced retention, and a negative impact on your brand. We are going to explore best practices for improving API speed and efficiency. I will be talking about caching, connection pooling, pagination, payload compression, and asynchronous processing. By the end of this session you will have a clear understanding of how to optimize APIs for better scalability and responsiveness.

Our agenda: why API performance matters, the common API performance bottlenecks, then the five techniques I mentioned in the introduction, and lastly the key takeaways from this session.

API performance is crucial in any application for several reasons. User experience: fast APIs provide a smooth and responsive user experience, while slow APIs lead to frustration and a negative perception of the application. Scalability: performant APIs can handle more requests concurrently, allowing the application to scale efficiently under heavy load, whereas a poorly performing API can become a bottleneck, limiting the application's ability to handle increased traffic. Reliability: performance issues can sometimes indicate underlying problems in the application, such as inefficient code, database bottlenecks, or a resource leak, so addressing performance concerns can improve the overall stability and reliability of the application. Resource utilization: efficient APIs consume fewer resources, like CPU, memory, and network, per request, reducing infrastructure cost and improving overall system efficiency. Business impact: in many cases API performance directly affects business metrics; for example, faster APIs can lead to higher conversion rates, increased sales, and improved customer satisfaction.

What are the common API performance bottlenecks? Database queries are the operations that retrieve or modify data in your database, and they are often the most time-consuming part of your API logic. Poorly designed or executed database queries can result in slow or inaccurate data delivery, high resource consumption, and security vulnerabilities; to optimize your database queries you can use indexes, joins, filters, pagination, and caching. High latency is often caused by synchronous logging and processing, which means log events are recorded and processed in a coordinated manner, ensuring the logging operation completes before any further processing occurs. This effectively creates a wait-and-see approach where the system pauses until the log is fully written before moving on to the next step, preventing potential data inconsistencies but adding latency. Other common bottlenecks are large, uncompressed payloads leading to slow data transmission; excessive database connections causing resource contention, meaning a significant amount of time is spent creating and closing database connections; and redundant client requests increasing server load, a situation where a client sends the same API request multiple times, unnecessarily overloading the server with identical work, often caused by poor client-side logic, network glitches, or user interactions that trigger repeated requests for the same information. Finally, lack of caching results in repeated database hits: the API does not store previously retrieved data temporarily, so each request has to be processed from scratch by the server, leading to slower response times, increased server load, and a less efficient user experience, especially when dealing with frequently accessed data.

Now that we have identified the key performance issues, let's go over the solutions, starting with pagination. When dealing with large data sets, fetching all records at once can lead to slow API responses and high memory consumption. Pagination helps by retrieving data in smaller chunks, improving performance and user experience. I will be talking about three techniques here: offset-based pagination, cursor-based pagination, and page-based pagination. Offset and limit pagination involves two parameters, offset and limit: the offset parameter determines the starting position in the data set, while the limit parameter specifies the maximum number of records to include on each page. Cursor-based pagination, instead of relying on a numeric offset, uses a unique identifier or token to mark the position in the data set; the API consumer includes the cursor value in subsequent requests to fetch the next page of data. Page-based pagination involves a page parameter to specify the desired page number: the API consumer requests a specific page of data, and the API responds with the corresponding page along with metadata such as the total number of pages or the total record count. Using Spring Boot's built-in pagination support ensures that only a subset of the data is fetched per request, reducing server load. A small code snippet is shown below, where a @GetMapping handler receives a Pageable as part of the request and returns a Page as the response.
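The snippet itself is not reproduced in this transcript, so here is a minimal sketch of the pattern described above using Spring Data's pagination support; the Product entity, ProductRepository, and the /api/products path are hypothetical names chosen only for illustration.

```java
import jakarta.persistence.Entity; // javax.persistence on Spring Boot 2.x
import jakarta.persistence.Id;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical entity and repository, included only to make the sketch self-contained.
@Entity
class Product {
    @Id
    private Long id;
    private String name;
    // getters and setters omitted for brevity
}

interface ProductRepository extends JpaRepository<Product, Long> {
}

@RestController
@RequestMapping("/api/products")
class ProductController {

    private final ProductRepository repository;

    ProductController(ProductRepository repository) {
        this.repository = repository;
    }

    // Spring resolves query parameters such as ?page=0&size=20&sort=name,asc into
    // the Pageable argument, so each request fetches only one page of rows instead
    // of the whole table.
    @GetMapping
    Page<Product> getProducts(Pageable pageable) {
        return repository.findAll(pageable);
    }
}
```

A request such as GET /api/products?page=0&size=20 then returns the first twenty records, and the Page response carries the paging metadata (total pages, total elements) mentioned above for page-based pagination.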
Asynchronous logging. Synchronous logging can slow down APIs because each log entry requires an input/output operation on disk. By switching to asynchronous logging, we can reduce blocking and improve response times. Asynchronous logging in Java involves decoupling the logging operation from the main application thread, which prevents performance bottlenecks caused by input/output operations. This approach enhances application responsiveness, especially under heavy load. Libraries like Logback and Log4j2 offer built-in support for asynchronous logging. To configure asynchronous logging in Logback, use the AsyncAppender in your logback.xml configuration file; a sample configuration is sketched below, after the caching discussion. This allows logs to be temporarily buffered in memory and written to disk asynchronously, which improves API efficiency.

Caching in Java involves storing frequently accessed data in a temporary storage location, which you can call a cache, to enable faster retrieval in subsequent requests, thus improving application performance. The cache acts as an intermediate layer between the application and the original data source, for example a database or an external service. In the snippet shown below, the @Cacheable annotation is used. It is a Java annotation used in the Spring framework to enable caching of method results. When a method is annotated with @Cacheable, Spring intercepts the method call and checks whether the result for the given arguments is already cached. If it is, the cached result is returned and the method is not executed; otherwise, the method is executed, the result is cached, and then returned. Caching helps reduce the load on the database and improves API response times significantly.
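For the Logback setup described above, a minimal logback.xml sketch could look like the following; the appender names, log file path, and queue size are illustrative values rather than ones taken from the talk.

```xml
<configuration>
  <!-- Regular file appender that performs the actual disk I/O. -->
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <file>app.log</file>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- AsyncAppender buffers events in an in-memory queue and hands them to the
       FILE appender on a background thread, so request threads are not blocked
       waiting for the write to complete. -->
  <appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
    <queueSize>512</queueSize>
    <discardingThreshold>0</discardingThreshold>
    <appender-ref ref="FILE" />
  </appender>

  <root level="INFO">
    <appender-ref ref="ASYNC" />
  </root>
</configuration>
```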
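The @Cacheable snippet is likewise not included in the transcript; here is a minimal sketch of the behaviour described above, reusing the hypothetical ProductRepository from the pagination example and assuming caching has been enabled elsewhere with @EnableCaching.

```java
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
class ProductService {

    private final ProductRepository repository;

    ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    // The first call for a given id executes the method and stores the result in
    // the "products" cache; later calls with the same id are answered from the
    // cache without touching the database. Requires @EnableCaching on a
    // configuration class.
    @Cacheable("products")
    public Product findById(Long id) {
        return repository.findById(id).orElseThrow();
    }
}
```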
Payload compression. Large responses can slow down APIs. By compressing payloads using techniques like GZIP or Brotli, we can reduce the amount of data transmitted over the network. GZIP is a widely used compression algorithm that is efficient for text-based data. Brotli generally provides higher compression ratios than GZIP, but may have slower compression speeds. A few considerations: CPU usage, since compression and decompression can be CPU-intensive operations, so it is important to balance compression levels with performance requirements; network overhead, because while compression reduces payload size, the cost of compressing and decompressing should be considered, especially for smaller payloads; compatibility, which means ensuring that both the client and the server support the same compression algorithm; error handling, implementing proper handling for compression and decompression failures; and security, since if the payload contains sensitive information you should consider using encryption in addition to compression. A small configuration used in Spring Boot is shown below, after the connection pooling discussion. This configuration enables response compression, reducing bandwidth usage and improving API performance.

Connection pooling. Connection pooling in Java is a technique used to manage and reuse database connections, improving the performance and efficiency of applications that interact with a database. Instead of creating a new connection every time data access is needed, a pool of connections is established and maintained. When a connection is required, it is borrowed from the pool, used, and then returned to the pool for reuse, minimizing the overhead of repeatedly creating and closing connections. The benefits of connection pooling are improved performance, through the reduced overhead of creating and closing connections; resource management, meaning database connections are managed efficiently, preventing resource exhaustion; and scalability, enabling the application to handle a large number of concurrent requests. As best practices, we can use HikariCP for optimal performance and tune the pool size based on the traffic pattern. A sample HikariCP configuration is also shown below. Connection pooling minimizes overhead, ensuring that database queries execute faster and more efficiently.
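The Spring Boot compression configuration referred to above usually comes down to a few properties; here is a sketch in application.properties, with illustrative values.

```properties
# Enable HTTP response compression for clients that send Accept-Encoding.
server.compression.enabled=true
# Compress common text-based content types only.
server.compression.mime-types=application/json,application/xml,text/html,text/plain
# Skip very small responses, where compression overhead outweighs the benefit.
server.compression.min-response-size=1024
```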
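And for the HikariCP configuration mentioned above, a minimal application.properties sketch; the JDBC URL, credentials, and pool sizes are placeholders that would need to be tuned to your own database and traffic pattern.

```properties
# HikariCP is the default connection pool in Spring Boot.
spring.datasource.url=jdbc:postgresql://localhost:5432/appdb
spring.datasource.username=app_user
spring.datasource.password=change-me
# Cap the number of open connections and keep a small idle core ready for reuse.
spring.datasource.hikari.maximum-pool-size=20
spring.datasource.hikari.minimum-idle=5
# Fail fast if no connection becomes available within 30 seconds.
spring.datasource.hikari.connection-timeout=30000
```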
Lastly, the key takeaways from this session: monitor and profile APIs using tools like Prometheus and Grafana; implement caching and payload compression for faster responses; optimize database queries and use connection pooling to enhance performance; leverage asynchronous processing for logging; test and scale APIs proactively to handle peak loads; and adopt API gateways and rate limiting to manage high traffic. By following these practices, you can build high-performance APIs that scale efficiently in a cloud-native environment. Thank you so much, and thank you all for attending this session on optimizing backend API performance. I hope you found it insightful. You can also connect with me on LinkedIn. Thank you.

Garima Agarwal

Application Programmer @ Bank of America

Garima Agarwal's LinkedIn account


