Why would you want to do this? In many production workflows you must answer several independent questions about the same piece of content. Running those analyses one by one increases latency, and can increase total cost if any step fails and forces a retry. By "fanning out" multiple specialized agents at the same time and then "fanning in" their outputs to a final "meta" agent, you can cut that latency to roughly the duration of the slowest single analysis.
This notebook presents a toy example that you likely wouldn't parallelize in the real world, but it demonstrates:
- How to define several focused agents with the OpenAI Agents SDK.
- How to execute them concurrently, either with Python's asyncio for lightweight, low-latency parallelization, or directly through the Agents SDK for easier management and dynamic tool-call planning.
- How to gather their individual outputs and feed them into a downstream meta-agent that produces the final, user-ready answer (a minimal sketch of this flow follows the list).
- A simple timeline visualization so you can see the latency benefit of parallelization.
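Here is a minimal sketch of the fan-out/fan-in flow, assuming the `openai-agents` package is installed and an `OPENAI_API_KEY` is set in the environment. The agent names and instructions are illustrative placeholders, not the ones used later in this notebook:

```python
import asyncio

from agents import Agent, Runner

# Fan-out: several focused agents, each answering one independent question.
# These names and instructions are placeholders for illustration.
sentiment_agent = Agent(
    name="sentiment",
    instructions="Classify the sentiment of the text as positive, negative, or neutral.",
)
summary_agent = Agent(
    name="summary",
    instructions="Summarize the text in one sentence.",
)

# Fan-in: a meta-agent that merges the specialists' outputs into one answer.
meta_agent = Agent(
    name="meta",
    instructions="Combine the analyses you are given into a single, user-ready report.",
)

async def analyze(text: str) -> str:
    # Run the specialist agents concurrently: total latency is roughly the
    # slowest single call rather than the sum of all calls.
    sentiment, summary = await asyncio.gather(
        Runner.run(sentiment_agent, text),
        Runner.run(summary_agent, text),
    )
    # Feed the gathered outputs to the downstream meta-agent.
    merged = (
        f"Sentiment analysis: {sentiment.final_output}\n"
        f"Summary: {summary.final_output}"
    )
    result = await Runner.run(meta_agent, merged)
    return result.final_output

if __name__ == "__main__":
    print(asyncio.run(analyze("The new release is fast, but the docs are thin.")))
```

Because the specialist calls run under `asyncio.gather`, the fan-out stage's wall-clock time is bounded by the slowest agent rather than the sum of all of them; the timeline visualization later in the notebook makes this difference visible.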
The same pattern adapts to real-world scenarios such as customer-support triage, content moderation, or any case where you want to run multiple independent analyses on an input and merge them into a single outcome.