Perplexity splits AI inference between PCs and cloud to cut costs

By TNW June 2, 2026

Perplexity AI announced a platform at Computex that dynamically routes AI inference between PCs and cloud servers in real time, acting as an “air-traffic controller” for AI tasks. The chip-agnostic system targets the cost crisis of centralised inference as Perplexity’s revenue hits $500 million.

Perplexity AI has launched a new platform that distributes AI workloads between personal computers and cloud servers in real time to reduce inference costs. This “air-traffic controller” system routes simpler tasks to local PCs and complex ones to the cloud, aiming to maximise the value delivered per watt per user. The approach addresses the significant financial burden of centralised AI inference and aims to lower Perplexity’s own server expenses while potentially speeding up responses.
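The announcement does not say how the routing decision is made. As a rough illustration only, the idea of sending simple tasks to the PC and complex ones to the cloud could be sketched as below. Every name here (Task, LOCAL_WORD_BUDGET, estimate_complexity, route) and the word-count heuristic are assumptions made for this sketch, not Perplexity's actual design or API.

```python
from dataclasses import dataclass

# Hypothetical threshold: prompts longer than this are treated as "complex".
# The real system's criteria are not described in the article.
LOCAL_WORD_BUDGET = 512


@dataclass
class Task:
    """A single inference request (illustrative structure only)."""
    prompt: str
    needs_web_search: bool = False  # assumed signal that a task needs cloud resources


def estimate_complexity(task: Task) -> int:
    """Crude complexity proxy: the number of whitespace-separated words."""
    return len(task.prompt.split())


def route(task: Task, local_available: bool = True) -> str:
    """Return "local" or "cloud" for a task, using the assumed rules above."""
    if not local_available:
        return "cloud"  # no usable local hardware, so fall back to the cloud
    if task.needs_web_search:
        return "cloud"  # assumed: tasks needing live data go to the cloud
    if estimate_complexity(task) > LOCAL_WORD_BUDGET:
        return "cloud"  # too large for the assumed local budget
    return "local"


if __name__ == "__main__":
    print(route(Task("Summarise this short note")))
    print(route(Task("Latest chip news", needs_web_search=True)))
```

A production router would presumably weigh many more signals, such as the local hardware available, power draw, and current latency, but the article gives no detail on these.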

Perplexity AI developed a platform that dynamically splits AI workloads between PCs and cloud servers.
The system acts as an “air-traffic controller for AI tasks” to reduce inference costs.
Simple tasks run locally on PCs, while complex tasks are routed to cloud servers.
This hybrid approach offloads inference work to existing PCs, reducing strain on data centers.
The platform is “chip agnostic” and works with processors from Intel and Nvidia.
Perplexity’s revenue grew fivefold to $500 million, with significant growth per employee added.
By using user hardware, Perplexity can reduce marginal cost per query and improve response latency.

Continue reading: https://thenextweb.com/news/perplexity-ai-split-compute-pc-cloud-inference-cost

Reference: https://foxvector.com/articles/27505bb3-788e-423f-8a0e-fc272885fe7a
