# Server-Side Caching
Server-side HTTP response caching is fundamentally different from client-side caching. While client-side middleware caches responses from external APIs, server-side middleware caches your own application's responses to reduce load and improve performance.
## What is Server-Side Caching?
Server-side caching stores the responses your application generates so that subsequent identical requests can be served from cache without re-executing expensive operations like database queries or complex computations.
## Example Flow

**Without Server-Side Caching:**

```text
Request → Routing → Handler → Database Query → Response (200ms)
Request → Routing → Handler → Database Query → Response (200ms)
Request → Routing → Handler → Database Query → Response (200ms)
```

**With Server-Side Caching:**

```text
Request → Routing → Cache MISS → Handler → Database Query → Response (200ms) → Cached
Request → Routing → Cache HIT → Response (2ms)
Request → Routing → Cache HIT → Response (2ms)
```
## Key Differences from Client-Side Caching
| Aspect | Client-Side | Server-Side |
|---|---|---|
| What it caches | External API responses | Your app's responses |
| Position | Before making outbound requests | After routing, before handlers |
| Use case | Reduce external API calls | Reduce internal computation |
| RFC 7234 behavior | Client cache rules | Shared cache rules |
| Request extensions | N/A | Must preserve (path params, state) |
## Available Implementations
Currently, server-side caching is available for:
- Tower-based servers (Axum, Hyper, Tonic) - see `tower-server`; a wiring sketch follows below
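A rough sketch of the wiring with Axum (the `ServerCacheLayer` name and constructor are assumptions for illustration, not the crate's confirmed API, and `expensive_report_handler` is a placeholder; see the tower-server documentation for the real types):

```rust
use axum::{routing::get, Router};

// Illustrative wiring; check the tower-server docs for the actual layer type
let cache_layer = ServerCacheLayer::new(ServerCacheOptions::default());

let app: Router = Router::new()
    .route("/reports", get(expensive_report_handler)) // expensive, read-heavy endpoint
    .layer(cache_layer); // runs after routing, before the handler
```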
## When to Use Server-Side Caching
### Good Use Cases ✅
- Public API endpoints with expensive database queries
- Read-heavy workloads where data doesn't change frequently
- Dashboard or analytics data that updates periodically
- Static-like content that requires dynamic generation
- Search results for common queries
- Rendered HTML for public pages
### Avoid Caching ❌
- User-specific data (unless using proper cache key differentiation)
- Authenticated endpoints (without user ID in cache key)
- Real-time data that must always be fresh
- Write operations (POST/PUT/DELETE requests)
- Sensitive information that shouldn't be shared
- Session-dependent responses (without session ID in cache key)
## Security Considerations

Server-side caches are shared caches: cached responses are served to ALL users. This is different from client-side caches, which are per-client.
### Critical Security Rule
Never cache user-specific data without including the user/session identifier in the cache key.
### Safe Patterns
#### Pattern 1: Mark user-specific responses as private
```rust
use axum::{
    http::header,
    response::{IntoResponse, Response},
};

async fn user_profile() -> Response {
    (
        [(header::CACHE_CONTROL, "private")], // private: won't be stored by a shared cache
        "User profile data",
    )
        .into_response()
}
```
#### Pattern 2: Include user ID in cache key
```rust
use http::Request;

// Including the user ID in the key gives each user a separate cache entry
let keyer = CustomKeyer::new(|req: &Request<()>| {
    let user_id = extract_user_id(req);
    format!("{} {} user:{}", req.method(), req.uri().path(), user_id)
});
```
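The keyer above assumes some way to derive a user identity from the request. A hypothetical `extract_user_id` might read an identity header set by upstream auth middleware (illustrative only; use whatever your auth layer actually provides):

```rust
use http::Request;

// Hypothetical helper: falls back to "anonymous" so unauthenticated
// requests still share a single cache entry.
fn extract_user_id<B>(req: &Request<B>) -> String {
    req.headers()
        .get("x-user-id")
        .and_then(|value| value.to_str().ok())
        .unwrap_or("anonymous")
        .to_string()
}
```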
#### Pattern 3: Don't cache at all
```rust
use axum::{
    http::header,
    response::{IntoResponse, Response},
};

async fn sensitive_data() -> Response {
    (
        [(header::CACHE_CONTROL, "no-store")], // no-store: must never be cached anywhere
        "Sensitive data",
    )
        .into_response()
}
```
## RFC 7234 Compliance
Server-side caches implement shared cache semantics as defined in RFC 7234:
### Must NOT Cache

- Responses with `Cache-Control: private` (user-specific)
- Responses with `Cache-Control: no-store` (sensitive)
- Responses with `Cache-Control: no-cache` (requires revalidation)
- Non-2xx status codes (errors)
- Responses to requests with an `Authorization` header (unless explicitly allowed)
### Must Cache Correctly

- Prefer `s-maxage` over `max-age` (shared-cache specific)
- Respect `Vary` headers (content negotiation)
- Handle the `Expires` header as a fallback
- Support the `max-age` and `public` directives
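For example, a handler can give the shared cache a longer lifetime than browsers by combining both directives; under the rules above, the shared cache prefers `s-maxage` while clients fall back to `max-age`:

```rust
use axum::{
    http::header,
    response::{IntoResponse, Response},
};

async fn dashboard() -> Response {
    (
        // Shared cache keeps this for 10 minutes; browsers for 1 minute
        [(header::CACHE_CONTROL, "public, s-maxage=600, max-age=60")],
        "Dashboard data",
    )
        .into_response()
}
```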
## Performance Characteristics

### Benefits
- Reduced database load: Cached responses don't hit the database
- Lower CPU usage: Expensive computations run once
- Faster response times: Cache hits are typically <5ms
- Better scalability: Handle more requests with same resources
### Considerations

- Memory usage: Cached responses are stored in memory or on disk
- Stale data: Cached data may become outdated
- Cache warming: Initial requests (cache misses) are slower
- Invalidation complexity: Updating cached data can be tricky
## Cache Invalidation Strategies

### Time-Based (TTL)
Set expiration times on cached responses:
```rust
use axum::{
    http::header,
    response::{IntoResponse, Response},
};

async fn handler() -> Response {
    (
        [(header::CACHE_CONTROL, "max-age=300")], // expires after 5 minutes
        "Response data",
    )
        .into_response()
}
```
### Event-Based
Manually invalidate cache entries when data changes:
```rust
// After updating user data, drop the stale cache entry
cache_manager.delete(&format!("GET /users/{}", user_id)).await?;
```
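In practice the invalidation usually sits directly after a successful write, so the next read repopulates the cache with fresh data. A sketch (`update_user`, `UserUpdate`, `AppError`, and the key format are illustrative, and `cache_manager` is assumed to be in scope as in the snippet above; the key must match whatever your keyer produces):

```rust
// Sketch: persist the change first, then drop the stale cache entry
async fn update_user_handler(user_id: u64, payload: UserUpdate) -> Result<(), AppError> {
    update_user(user_id, &payload).await?; // write to the database
    // The key format must match the keyer used by the caching layer
    cache_manager.delete(&format!("GET /users/{}", user_id)).await?;
    Ok(())
}
```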
### Hybrid Approach
Combine TTL with manual invalidation:
- Use TTL for automatic expiration
- Invalidate early when you know data changed
## Best Practices

1. Start conservative: Use shorter TTLs initially and increase them as you gain confidence
2. Monitor cache hit rates: Track `X-Cache` headers to measure effectiveness
3. Set size limits: Prevent the cache from consuming too much memory
4. Use appropriate keyers: Match the cache key strategy to your needs
5. Document caching behavior: Make it clear which endpoints are cached
6. Test cache invalidation: Ensure updates propagate correctly
7. Consider cache warming: Pre-populate the cache for common requests
8. Handle cache failures gracefully: The application should keep working even if the cache fails (see the sketch after this list)
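On the last point, one useful shape is to treat any cache backend error as a miss. A minimal sketch, assuming a hypothetical `Cache` handle with a fallible `get` (the corresponding store step should swallow errors the same way):

```rust
use std::future::Future;

// Hypothetical cache handle with a fallible lookup; adapt to your backend.
struct Cache;

impl Cache {
    async fn get(&self, _key: &str) -> Result<Option<String>, std::io::Error> {
        Ok(None) // stub for illustration
    }
}

/// Treat any cache error as a miss so the application keeps working
/// even when the cache backend is unavailable.
async fn get_or_compute<F, Fut>(cache: &Cache, key: &str, compute: F) -> String
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = String>,
{
    match cache.get(key).await {
        Ok(Some(cached)) => cached,           // hit: serve the cached response
        Ok(None) | Err(_) => compute().await, // miss or cache failure: run the handler
    }
}
```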
## Monitoring and Debugging

### Enable Cache Status Headers
```rust
let options = ServerCacheOptions {
    cache_status_headers: true, // adds an X-Cache: HIT/MISS header to every response
    ..Default::default()
};
```
This adds `X-Cache` headers to responses:

- `X-Cache: HIT` - Served from cache
- `X-Cache: MISS` - Generated by the handler
### Track Metrics
Monitor these key metrics:
- Cache hit rate (hits / total requests)
- Average response time (hits vs misses)
- Cache size and memory usage
- Cache eviction rate
- Stale response rate
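A minimal sketch of the first metric, assuming you increment the counters wherever you can observe the `X-Cache` header (for example, in a thin middleware of your own):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hit/miss counters; increment wherever you can observe the X-Cache header.
#[derive(Default)]
struct CacheMetrics {
    hits: AtomicU64,
    misses: AtomicU64,
}

impl CacheMetrics {
    fn record(&self, hit: bool) {
        let counter = if hit { &self.hits } else { &self.misses };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Cache hit rate = hits / total requests.
    fn hit_rate(&self) -> f64 {
        let hits = self.hits.load(Ordering::Relaxed) as f64;
        let total = hits + self.misses.load(Ordering::Relaxed) as f64;
        if total == 0.0 { 0.0 } else { hits / total }
    }
}
```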
## Getting Started

See the `tower-server` documentation for a detailed implementation guide.