Improve Latency - duohub Documentation

Minimize Client-Server Communication

Avoid making multiple calls between the client (user’s machine) and server (cloud processes). Instead, aim to:

Make a single call to the cloud
Keep cross-service communication within the cloud infrastructure
Batch requests where possible
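
As a rough illustration of why batching helps, the sketch below counts round trips against a stand-in server. `FakeServer`, `get`, and `get_batch` are hypothetical names invented for this example and are not part of the duohub API; a real client would expose its own batch endpoint, if it has one.

```python
from typing import Dict, List


class FakeServer:
    """Hypothetical stand-in for a cloud endpoint; it only counts round trips."""

    def __init__(self, records: Dict[str, str]) -> None:
        self.records = records
        self.round_trips = 0

    def get(self, key: str) -> str:
        # One simulated network round trip per key.
        self.round_trips += 1
        return self.records[key]

    def get_batch(self, keys: List[str]) -> Dict[str, str]:
        # One simulated network round trip for the whole batch.
        self.round_trips += 1
        return {key: self.records[key] for key in keys}


def fetch_individually(server: FakeServer, keys: List[str]) -> Dict[str, str]:
    """Fetch each key with its own call: len(keys) round trips."""
    return {key: server.get(key) for key in keys}


def fetch_batched(server: FakeServer, keys: List[str]) -> Dict[str, str]:
    """Fetch all keys in a single call: one round trip."""
    return server.get_batch(keys)
```

Both functions return the same data, but the batched version pays the network cost once, which matters most when the client is far from the server.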

Consider Geographic Proximity

Latency grows with the physical distance between users and servers:

duohub automatically routes queries to the nearest server
Ensure your servers are strategically located close to your user base
Use geo-distributed infrastructure when possible

Implement Data Streaming

Leverage streaming capabilities:

Stream data whenever possible
Begin processing data before the entire stream completes
Overlap data transfer with processing so the first results arrive sooner
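
The idea of processing data before the stream completes can be sketched with Python generators. This is a minimal illustration under stated assumptions: `stream_chunks` is a hypothetical source standing in for a network stream, and uppercasing stands in for real per-chunk work.

```python
from typing import Iterable, Iterator


def stream_chunks(text: str, chunk_size: int) -> Iterator[str]:
    """Hypothetical source that yields a response in fixed-size chunks."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def process_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each processed chunk as soon as it arrives.

    Uppercasing is a placeholder for real work such as parsing or rendering.
    """
    for chunk in chunks:
        yield chunk.upper()


if __name__ == "__main__":
    # The first processed chunk is available before the source is exhausted.
    for piece in process_stream(stream_chunks("hello world", 4)):
        print(piece)
```

Because both stages are lazy, the consumer receives its first result after a single chunk has been read instead of waiting for the full payload.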

Utilize Caching Effectively

Implement caching strategies:

Use caching for frequently accessed data
Consider Redis for high-performance caching needs
Implement appropriate cache invalidation policies
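
One common invalidation policy is time-to-live (TTL) expiry. The sketch below is an in-memory illustration only, not a Redis client; `TTLCache`, `get_or_load`, and `invalidate` are hypothetical names chosen for this example. A shared cache such as Redis would play the same role across processes.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Return a fresh cached value, or call loader and cache its result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: entry has not expired yet
        value = loader(key)  # cache miss or expired: do the slow fetch
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key explicitly, for example after the underlying data changes."""
        self._store.pop(key, None)
```

A short TTL keeps data fresher at the cost of more loads; explicit invalidation is useful when the application knows exactly when a value has changed.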

Choose Optimized Services

Select services designed for low latency:

Use platforms such as duohub, Vercel, or AWS Lambda@Edge, which are optimized for performance
Look for services that offer:
- Geo-distribution
- Built-in caching
- Optimized network routing
