What are API rate limiting/throttling, API quota, and API bursts?

As an API provider, it can be hard to know the exact usage of your services. Yet, part of managing these digital assets means limiting access to them. That way, providers can avoid API misuse and ensure all users have fair access to available APIs.

When talking about API access limitation, there are some key terms to define. Specifically, what are API rate limiting/throttling, API quota, and API bursts? Read on for definitions and examples, and learn how enterprises can better monitor, manage, and govern all APIs from a single pane of glass.

What is API rate limiting/throttling?

The number of API calls your backend can process per unit of time is typically measured in TPS, or transactions per second. Some systems also have a physical limit on the amount of data transferred, measured in bytes.

Let’s say your backend can process 2,000 TPS. Capping traffic to protect that capacity is known as backend rate limiting. With API rate limiting, also called API throttling, you cap the number of requests an API gateway will process in a given period. Doing so protects backend services from being flooded with excessive messages.
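To make the idea concrete, here is a minimal sketch of a gateway-side cap, assuming a simple fixed-window counter. The class name FixedWindowLimiter and its parameters are hypothetical, chosen for illustration; real gateways may use a different algorithm.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests per window (illustrative sketch only)."""

    def __init__(self, limit, window_seconds=1.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = None  # no window is open until the first request
        self.count = 0

    def allow(self, now):
        # Open a fresh window if none is open or the current one has lapsed.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # over the cap for this window


# 2,500 requests arrive within the same second against a 2,000 TPS cap.
limiter = FixedWindowLimiter(limit=2000, window_seconds=1.0)
accepted = sum(limiter.allow(now=0.5) for _ in range(2500))
print(accepted)  # 2000
```

The timestamp is passed in explicitly so the behavior is easy to follow; in a running service you would likely supply a monotonic clock reading instead.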

You can also configure a rate limit for specific clients that caps the number of messages each one can send. This configuration is referred to as application rate limiting.

If a client exceeds their allotted number of requests, their connection is throttled. Processing slows down, but the connection remains open to reduce errors.

Throttling does carry a risk of connections timing out. Holding connections open for longer can also open a vector for denial-of-service (DoS) attacks.
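As a rough illustration of per-client throttling, the sketch below delays a client's excess requests instead of rejecting them, and gives up once the wait would exceed a ceiling. The ceiling is an assumption added here to bound the timeout and DoS exposure described above; ClientThrottle, delay_for, and max_delay_seconds are hypothetical names.

```python
class ClientThrottle:
    """Space each client's requests evenly; reject only if the wait gets too long."""

    def __init__(self, rate_per_second, max_delay_seconds):
        self.interval = 1.0 / rate_per_second
        self.max_delay = max_delay_seconds  # assumed ceiling on how long a request may wait
        self.next_slot = {}  # client_id -> earliest time that client's next request may run

    def delay_for(self, client_id, now):
        """Return the seconds to wait before processing, or None to reject."""
        slot = max(now, self.next_slot.get(client_id, now))
        delay = slot - now
        if delay > self.max_delay:
            return None  # holding the connection any longer risks a timeout
        self.next_slot[client_id] = slot + self.interval
        return delay


# One client sends four requests at the same instant against a 4 TPS limit.
throttle = ClientThrottle(rate_per_second=4, max_delay_seconds=0.5)
delays = [throttle.delay_for("client-a", now=0.0) for _ in range(4)]
print(delays)  # [0.0, 0.25, 0.5, None]
```

Each client has its own schedule, so one client being slowed down does not delay requests from another.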

What is API burst?

When your system has spare capacity or is idle, you may want to let a single client send more requests than the defined limit. Clients cannot always control the number of API calls they emit during such a peak.

An API burst temporarily accommodates this higher volume of requests while avoiding the potential for overload. The burst size you define controls how many requests beyond the rate limit a client can make, measured at the millisecond level.

A configured rate limit of 500 TPS works out to one request per 2 milliseconds, and that 2-millisecond interval is the burst zone. If your burst size is 0 and 2 requests arrive in the same zone, one request is processed and the other is rejected.
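The 500 TPS example can be sketched in a few lines. This is a simplified model, assuming each zone admits one request plus the configured burst size; BurstLimiter is a hypothetical name, and actual gateway implementations may account for bursts differently.

```python
class BurstLimiter:
    """Admit 1 + burst_size requests per zone (simplified model of the example)."""

    def __init__(self, rate_per_second, burst_size=0):
        # Zone length in whole milliseconds: 500 TPS -> 2 ms.
        # Assumes the rate divides 1000 evenly and is at most 1000.
        self.zone_ms = 1000 // rate_per_second
        self.burst_size = burst_size
        self.current_zone = None
        self.count = 0

    def allow(self, now_ms):
        zone = now_ms // self.zone_ms
        if zone != self.current_zone:
            # A new zone has started, so the counter starts over.
            self.current_zone = zone
            self.count = 0
        if self.count < 1 + self.burst_size:
            self.count += 1
            return True
        return False


# Burst size 0: the second request in the same 2 ms zone is rejected.
strict = BurstLimiter(rate_per_second=500, burst_size=0)
print([strict.allow(t) for t in (0, 1, 2)])  # [True, False, True]

# Burst size 1: one excess request per zone is tolerated.
lenient = BurstLimiter(rate_per_second=500, burst_size=1)
print([lenient.allow(t) for t in (0, 1, 1)])  # [True, True, False]
```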

The key to API burst is balancing client demand with rate-limiting measures. That way, you can support surges in traffic without hindering API performance.

What is API quota?

API quotas are a useful tool when you are looking at the commercial side and the long-term consumption of API calls and data. They usually describe a number of allotted calls over a longer interval.

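For example, you might set your API quota at 5,000 calls per month. (Remember, you can combine this quota with a rate limit, such as 20 TPS. The two serve different purposes: a client sending a sustained 20 TPS would use up a 5,000-call quota in 250 seconds.)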

The quota time window starts when the first API call is made. Once the window lapses, the counter resets to zero and stays there until the next API call is made.
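The window behavior described here can be sketched as follows. This is an illustrative model only: QuotaWindow and record_call are hypothetical names, and it assumes that the first call after a lapsed window opens the next one.

```python
class QuotaWindow:
    """Quota whose time window opens on the first call (illustrative sketch)."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.window_start = None  # no window until a call is made
        self.used = 0

    def record_call(self, now):
        """Count one call; return False if the quota is already used up."""
        lapsed = (
            self.window_start is not None
            and now - self.window_start >= self.window_seconds
        )
        if lapsed:
            # The window has lapsed: the counter goes back to zero.
            self.window_start = None
            self.used = 0
        if self.window_start is None:
            # This call is the first of a new window, so it opens the window.
            self.window_start = now
        if self.used >= self.max_calls:
            return False
        self.used += 1
        return True


# A small quota of 3 calls per 30-day window, with time in seconds.
THIRTY_DAYS = 30 * 24 * 3600
quota = QuotaWindow(max_calls=3, window_seconds=THIRTY_DAYS)
print([quota.record_call(now=100) for _ in range(4)])  # [True, True, True, False]
print(quota.record_call(now=100 + THIRTY_DAYS))        # True
```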

To enforce an API quota, you need to identify the client or consumer. That’s why we use the term user quota. Through an API marketplace that supports full lifecycle API management, consumers can easily select the subscription plan that suits their quota needs.

They can also access documentation that helps them better understand the API’s value and how to test and use it. Service-level agreements (SLAs) are often attached as well to define service response times and availability.

Looking at API quota in more detail, you can set limits not only per client or consumer but also per consuming application. This is known as an application quota.

You can also set tighter limits on API calls that consume more backend computing power and so have a greater impact on service.
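One way to picture user quotas, application quotas, and heavier calls together is a small ledger that charges each call against both the consumer and the consuming application. Weighting expensive endpoints by a cost is an assumption made here for illustration, not something the platform is stated to do; QuotaBook, charge, and the endpoint paths are hypothetical.

```python
from collections import defaultdict


class QuotaBook:
    """Track a user quota and an application quota side by side (sketch)."""

    def __init__(self, user_limit, app_limit, costs=None):
        self.user_limit = user_limit
        self.app_limit = app_limit
        self.costs = costs or {}            # endpoint -> units charged per call
        self.user_used = defaultdict(int)   # consumer_id -> units used
        self.app_used = defaultdict(int)    # (consumer_id, app_id) -> units used

    def charge(self, consumer_id, app_id, endpoint):
        cost = self.costs.get(endpoint, 1)  # ordinary calls cost one unit
        app_key = (consumer_id, app_id)
        if self.user_used[consumer_id] + cost > self.user_limit:
            return False  # user quota would be exceeded
        if self.app_used[app_key] + cost > self.app_limit:
            return False  # application quota would be exceeded
        self.user_used[consumer_id] += cost
        self.app_used[app_key] += cost
        return True


book = QuotaBook(user_limit=10, app_limit=6, costs={"/report": 5})
print(book.charge("acme", "web", "/report"))     # True  (heavy call, 5 units)
print(book.charge("acme", "web", "/ping"))       # True  (web app now at 6 units)
print(book.charge("acme", "web", "/ping"))       # False (application quota reached)
print(book.charge("acme", "mobile", "/ping"))    # True  (a different application)
print(book.charge("acme", "mobile", "/report"))  # False (user quota would be exceeded)
```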

Gain visibility and control over APIs with Amplify Platform

This level of visibility and control over APIs is best achieved through an API platform that provides federated API management functionality.

Modern enterprises deal with significant complexity, as business units often develop APIs independently of each other. This can lead to silos that fragment the customer experience (CX), and it brings time-consuming management, automation and standardization headaches, and significant resource duplication.

A universal API management platform like Axway’s Amplify helps you securely manage the full API lifecycle and simplify API discovery and use.

  • Operational tooling lets you monitor and support higher levels of service.
  • Thanks to a policy-based security gateway, teams can define policy, accessibility, rate limits, and quotas with over 200 prebuilt security policies.

And when it’s time to bring your digital products to market, the API Marketplace component allows you to track adoption, usage, and performance metrics for all your API products, offering key insights so you can make better decisions about future investments.


Key Takeaways

  • Establishing access limitations is essential to an API strategy.
  • API rate limiting/throttling, API burst, and API quota are all measures that help support clients while protecting backend services.
  • An API marketplace backed by a universal API management platform supports these measures to protect API performance.