docs/FAQS.md (new file, 54 lines)
# Frequently Asked Questions

### How do I get access to nightly builds?

Nightly builds of the Agent Framework are available [here](https://github.com/orgs/microsoft/packages?repo_name=agent-framework).

To download nightly builds, follow these steps:

1. You will need a GitHub account to complete these steps.
1. Create a GitHub Personal Access Token with the `read:packages` scope using these [instructions](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-personal-access-token-classic).
1. If your account is part of the Microsoft organization, you must authorize the `Microsoft` organization as a single sign-on organization.
   1. Click "Configure SSO" next to the Personal Access Token you just created and then authorize `Microsoft`.
1. Use the following command to add the Microsoft GitHub Packages source to your NuGet configuration:

   ```powershell
   dotnet nuget add source --username GITHUBUSERNAME --password GITHUBPERSONALACCESSTOKEN --store-password-in-clear-text --name GitHubMicrosoft "https://nuget.pkg.github.com/microsoft/index.json"
   ```

1. Alternatively, you can manually create a `NuGet.Config` file:

   ```xml
   <?xml version="1.0" encoding="utf-8"?>
   <configuration>
     <packageSources>
       <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
       <add key="GitHubMicrosoft" value="https://nuget.pkg.github.com/microsoft/index.json" />
     </packageSources>

     <packageSourceMapping>
       <packageSource key="nuget.org">
         <package pattern="*" />
       </packageSource>
       <packageSource key="GitHubMicrosoft">
         <package pattern="*nightly"/>
       </packageSource>
     </packageSourceMapping>

     <packageSourceCredentials>
       <GitHubMicrosoft>
         <add key="Username" value="<Your GitHub Id>" />
         <add key="ClearTextPassword" value="<Your Personal Access Token>" />
       </GitHubMicrosoft>
     </packageSourceCredentials>
   </configuration>
   ```

   * If you place this file in your project folder, make sure Git (or whatever source control you use) ignores it.
   * For more information on where to store this file, see [the NuGet.Config reference](https://learn.microsoft.com/en-us/nuget/reference/nuget-config-file).
1. You can now add packages from the nightly build to your project.
   * E.g. use this command: `dotnet add package Microsoft.Agents.AI --version 0.0.1-nightly-250731.6-alpha`
1. The latest package release can also be referenced in the project like this:
   * `<PackageReference Include="Microsoft.Agents.AI" Version="*-*" />`

For more information see: <https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-nuget-registry>
docs/assets/Agentic-framework_high-res.png (new binary file, 590 KiB; not shown)
docs/assets/readme-banner.png (new binary file, 136 KiB; not shown)
docs/decisions/0001-agent-run-response.md (new file, 515 lines)
---
# These are optional elements. Feel free to remove any of them.
status: accepted
contact: westey-m
date: 2025-07-10
deciders: sergeymenshykh, markwallace, rbarreto, dmytrostruk, westey-m, eavanvalkenburg, stephentoub
consulted:
informed:
---

# Agent Run Responses Design

## Context and Problem Statement

Agents may produce lots of output during a run, including

1. **[Primary]** General response messages to the caller (this may be in the form of text, including structured output, images, sound, etc.)
2. **[Primary]** Structured confirmation requests to the caller
3. **[Secondary]** Tool invocation activities executed (both local and remote). For information only.
4. Reasoning/Thinking output.
   1. **[Primary]** In some cases an LLM may return reasoning output intermixed with the answer to the caller, because the caller's prompt asked for this detail in some way. This should be considered a specialization of 1.
   1. **[Secondary]** Reasoning models optionally produce reasoning output separate from the answer to the caller's question, and this should be considered secondary content.
5. **[Secondary]** Handoffs / transitions from agent to agent where an agent contains sub-agents.
6. **[Secondary]** An indication that the agent is responding (i.e. typing) as if it's a real human.
7. Complete messages in addition to updates, when streaming
8. Id for a long running process that is launched
9. and more

We need to ensure that with this diverse list of output, we are able to

- Support all with abstractions where needed
- Provide a simple getting started experience that doesn't overwhelm developers

### Agent response data types

When comparing various agent SDKs and protocols, agent output is often divided into two categories:

1. **Result**: A response from the agent that communicates the result of the agent's work to the caller in natural language (or images/sound/etc.). Let's call this **Primary** output.
   1. Includes cases where the agent finished because it requires more input from the user.
2. **Progress**: Updates while the agent is running, which are informational only, typically showing what the agent is doing, and do not allow any actions to be taken by the caller that modify the behavior of the agent before completing the run. Let's call this **Secondary** output.

A potential third category is:

3. **Long Running**: A response that does not contain a Primary response or Secondary updates, but rather a reference to a long running task.

### Different use cases for Primary and Secondary output

To solve complex problems, many agents must be used together. These agents typically have their own capabilities and responsibilities and communicate via input messages and final responses/handoff calls, while the internal workings of each agent are not of interest to the other agents participating in solving the problem.

When an agent is in conversation with one or more humans, the information that may be displayed to the user(s) can vary. E.g. when an agent is part of a conversation with multiple humans, it may be asked to perform tasks by the humans, and they may not want a stream of distracting updates posted to the conversation, but rather just a final response. On the other hand, if an agent is being used by a single human to perform a task, the human may be waiting for the agent to complete the task. Therefore, they may be interested in getting updates on what the agent is doing.

Where agents are nested, consumers would also likely want to constrain the amount of data from an agent that bubbles up into higher level conversations to avoid exceeding the context window, therefore limiting it to the Primary response only.

### Comparison with other SDKs / Protocols

Approaches observed from the compared SDKs:

1. Response object with separate properties for Primary and Secondary
2. Response stream that contains Primary and Secondary entries, and callers need to filter.
3. Response containing just Primary.

| SDK | Non-Streaming | Streaming |
|-|-|-|
| AutoGen | **Approach 1** Separates messages into Agent-Agent (maps to Primary) and Internal (maps to Secondary) and these are returned as separate properties on the agent response object. See [types of messages](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/messages.html#types-of-messages) and [Response](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.base.html#autogen_agentchat.base.Response) | **Approach 2** Returns a stream of internal events and the last item is a Response object. See [ChatAgent.on_messages_stream](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.base.html#autogen_agentchat.base.ChatAgent.on_messages_stream) |
| OpenAI Agent SDK | **Approach 1** Separates new_items (Primary+Secondary) from final output (Primary) as separate properties on the [RunResult](https://github.com/openai/openai-agents-python/blob/main/src/agents/result.py#L39) | **Approach 1** Similar to non-streaming, has a way of streaming updates via a method on the response object which includes all data, and then a separate final output property on the response object which is populated only when the run is complete. See [RunResultStreaming](https://github.com/openai/openai-agents-python/blob/main/src/agents/result.py#L136) |
| Google ADK | **Approach 2** [Emits events](https://google.github.io/adk-docs/runtime/#step-by-step-breakdown) with [FinalResponse](https://github.com/google/adk-java/blob/main/core/src/main/java/com/google/adk/events/Event.java#L232) true (Primary) / false (Secondary) and callers have to filter out those with false to get just the final response message | **Approach 2** Similar to non-streaming except [events](https://google.github.io/adk-docs/runtime/#streaming-vs-non-streaming-output-partialtrue) are emitted with [Partial](https://github.com/google/adk-java/blob/main/core/src/main/java/com/google/adk/events/Event.java#L133) true to indicate that they are streaming messages. A final non partial event is also emitted. |
| AWS (Strands) | **Approach 3** Returns an [AgentResult](https://strandsagents.com/latest/documentation/docs/api-reference/python/agent/agent_result/) (Primary) with messages and a reason for the run's completion. | **Approach 2** [Streams events](https://strandsagents.com/latest/documentation/docs/api-reference/python/agent/agent/#strands.agent.agent.Agent.stream_async) (Primary+Secondary) including response text, current_tool_use, even data from "callbacks" (strands plugins) |
| LangGraph | **Approach 2** A mixed list of all [messages](https://langchain-ai.github.io/langgraph/agents/run_agents/#output-format) | **Approach 2** A mixed list of all [messages](https://langchain-ai.github.io/langgraph/agents/run_agents/#output-format) |
| Agno | **Combination of various approaches** Returns a [RunResponse](https://docs.agno.com/reference/agents/run-response) object with text content, messages (essentially chat history including inputs and instructions), reasoning and thinking text properties. Secondary events could potentially be extracted from messages. | **Approach 2** Returns [RunResponseEvent](https://docs.agno.com/reference/agents/run-response#runresponseevent-types-and-attributes) objects including tool call, memory update, etc. information, where the [RunResponseCompletedEvent](https://docs.agno.com/reference/agents/run-response#runresponsecompletedevent) has similar properties to RunResponse |
| A2A | **Approach 3** Returns a [Task or Message](https://a2aproject.github.io/A2A/latest/specification/#71-messagesend) where the message is the final result (Primary) and task is a reference to a long running process. | **Approach 2** Returns a [stream](https://a2aproject.github.io/A2A/latest/specification/#72-messagestream) that contains task updates (Secondary) and a final message (Primary) |
| Protocol Activity | **Approach 2** Single stream of responses including secondary events and final response messages (Primary). | No separate behavior for streaming. |

## Decision Drivers

- Solution provides an easy-to-use experience for users who are getting started and just want the answer to a question.
- Solution must be extensible to future requirements, e.g. long running agent processes.
- Experience is in line with, or better than, the best in class experience from other SDKs

## Response Type Options

- **Option 1** Run: Messages list contains a mix of Primary and Secondary content, RunStreaming: Stream of Primary + Secondary
  - **Option 1.1** Secondary content does not use `TextContent`
  - **Option 1.2** Presence of Secondary content is determined by a runtime parameter
  - **Option 1.3** Use ChatClient response types
  - **Option 1.4** Return derived ChatClient response types
- **Option 2** Run: Container with Primary and Secondary properties, RunStreaming: Stream of Primary + Secondary
  - **Option 2.1** Response types extend MEAI types
  - **Option 2.2** New response types
- **Option 3** Run: Primary-only, RunStreaming: Stream of Primary + Secondary
- **Option 4** Remove the Run API and retain the RunStreaming API only, which returns a stream of Primary + Secondary.

Since the suggested options vary only for the non-streaming case, the following detailed explanation for each
focuses on the non-streaming case.

### Option 1 Run: Messages List contains mix of Primary and Secondary content, RunStreaming: Stream of Primary + Secondary

Run returns a `Task<ChatResponse>` and RunStreaming returns an `IAsyncEnumerable<ChatResponseUpdate>`.
For Run, the returned `ChatResponse.Messages` contains an ordered list of messages that contain both the Primary and Secondary content.

`ChatResponse.Text` automatically aggregates all text from any `TextContent` items in all `ChatMessage` items in the response.
If we can ensure that no updates ever contain `TextContent`, this will mean that `ChatResponse.Text` will always contain
the Primary response text. See option 1.1.
If we cannot ensure this, either the solution or usage becomes more complex, see 1.3 and 1.4.

#### Option 1.1 `TextContent`, `DataContent` and `UriContent` means Primary content

`ChatResponse.Text` aggregates all `TextContent` values, and no secondary updates use `TextContent`,
so `ChatResponse.Text` will always contain the Primary content.

```csharp
// Since the Text property contains the primary content, it's a simple getting started experience.
var response = await agent.RunAsync("Do Something");
Console.WriteLine(response.Text);

// Callers can still get access to all updates too.
foreach (var update in response.Messages)
{
    Console.WriteLine(update.Contents.FirstOrDefault()?.GetType().Name);
}

// For streaming, it's possible to output the primary content by also using the Text property on each update.
await foreach (var update in agent.RunStreamingAsync("Do Something"))
{
    Console.WriteLine(update.Text);
}
```

- **PROS**: Easy and familiar user experience, reuse of response types from IChatClient. Similar experience for both streaming and non-streaming.
- **CONS**: The agent response types cannot evolve separately from MEAI if needed.

#### Option 1.1a `TextContent`, `DataContent` and `UriContent` means Primary content, with custom Agent response types

Same as 1.1 but with custom Agent Framework response types.
The response types should preferably resemble the ChatResponse types closely, to ensure users have a familiar experience when moving between the two.
Therefore something like `AgentResponse.Text`, which also aggregates all `TextContent` values similar to 1.1, makes sense.
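As an illustration only, a minimal sketch of what such a custom response type could look like; the `AgentResponse` name and the aggregation logic are assumptions based on the description above, not a final design:

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.AI;

// Hypothetical custom response type that mirrors ChatResponse for familiarity.
public class AgentResponse
{
    // The ordered list of messages produced during the run (Primary + Secondary).
    public IList<ChatMessage> Messages { get; set; } = new List<ChatMessage>();

    // Aggregates the text of all TextContent items, mirroring ChatResponse.Text,
    // so callers get the Primary answer with a single property access.
    public string Text =>
        string.Concat(
            this.Messages
                .SelectMany(m => m.Contents)
                .OfType<TextContent>()
                .Select(t => t.Text));
}
```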
- **PROS**: Easy getting started experience, and response types can be customized for the Agent Framework where needed.
- **CONS**: More work to define custom response types.

#### Option 1.2 Presence of Secondary Content is determined by a runtime parameter

We can allow callers to choose whether to include secondary content in the list of response messages.
Open Question: Do we allow secondary content to use `TextContent` types?

```csharp
// By default the response only has the primary content, so Text
// contains the primary content, and it's a good starting experience.
var response = await agent.RunAsync("Do Something");
Console.WriteLine(response.Text);

// We can also optionally include updates via an option.
var response = await agent.RunAsync("Do Something", options: new() { IncludeUpdates = true });
// Callers can now access all updates.
foreach (var update in response.Messages)
{
    Console.WriteLine(update.Contents.FirstOrDefault()?.GetType().Name);
}
```

- **PROS**: Easy getting started experience, reuse of response types from IChatClient.
- **CONS**: Since the basic experience is the same as 1.1, and when you look at individual messages you most likely want all of them anyway, it seems arbitrarily limiting compared to 1.1.

### Option 2 Run: Container with Primary and Secondary Properties, RunStreaming: Stream of Primary + Secondary

Run returns a new response type that has separate properties for the Primary content and the Secondary updates leading up to it.
The Primary content is available in the `AgentResponse.Messages` property while Secondary updates are in a new `AgentResponse.Updates` property.
`AgentResponse.Text` returns the Primary content text.

Since streaming would still need to return an `IAsyncEnumerable` of updates, the design would differ from non-streaming.
With non-streaming, Primary and Secondary content is split into separate lists, while with streaming it's combined in one stream.

```csharp
// Since Text contains the primary content, it's a good getting started experience.
var response = await agent.RunAsync("Do Something");
Console.WriteLine(response.Text);

// Callers can still get access to all updates too.
foreach (var update in response.Updates)
{
    Console.WriteLine(update.Contents.FirstOrDefault()?.GetType().Name);
}
```

- **PROS**: Primary content and Secondary updates are categorised for non-streaming and therefore easy to distinguish, and this design matches popular SDKs like AutoGen and the OpenAI SDK.
- **CONS**: Requires custom response types, and the design would differ between streaming and non-streaming.

### Option 3 Run: Primary-only, RunStreaming: Stream of Primary + Secondary

Run returns a `Task<ChatResponse>` and RunStreaming returns an `IAsyncEnumerable<ChatResponseUpdate>`.
For Run, the returned `ChatResponse.Messages` contains only the Primary content messages.
`ChatResponse.Text` will contain the aggregate text of `ChatResponse.Messages` and therefore the Primary content messages' text.

```csharp
// Since Text contains the primary content response, it's a good getting started experience.
var response = await agent.RunAsync("Do Something");
Console.WriteLine(response.Text);

// Callers cannot get access to all updates, since only the primary content is in messages.
var primaryContentOnly = response.Messages.FirstOrDefault();
```

- **PROS**: Simple getting started experience, reusing IChatClient response types.
- **CONS**: Intermediate updates are only available in streaming mode.

### Option 4: Remove Run API and retain RunStreaming API only, which returns a Stream of Primary + Secondary

With this option, we remove the `RunAsync` method and only retain the `RunStreamingAsync` method, but
we add helpers to process the streaming responses and extract information from them.

```csharp
// Users can get the primary content through an extension method on the async enumerable stream.
var responses = agent.RunStreamingAsync("Do Something");
// E.g. an extension method that builds the primary content text.
Console.WriteLine(await responses.AggregateFinalResult());
// Or an extension method that builds complete messages from the updates.
Console.WriteLine((await responses.BuildMessage()).Text);

// Callers can also iterate through all updates if needed.
await foreach (var update in responses)
{
    Console.WriteLine(update.Contents.FirstOrDefault()?.GetType().Name);
}
```

- **PROS**: Single API for streaming/non-streaming
- **CONS**: More complex for inexperienced users.

## Custom Response Type Design Options

### Option 1 Response types extend MEAI types

```csharp
class Agent
{
    public abstract Task<AgentResponse> RunAsync(
        IReadOnlyCollection<ChatMessage> messages,
        AgentThread? thread = null,
        AgentRunOptions? options = null,
        CancellationToken cancellationToken = default);

    public abstract IAsyncEnumerable<AgentResponseUpdate> RunStreamingAsync(
        IReadOnlyCollection<ChatMessage> messages,
        AgentThread? thread = null,
        AgentRunOptions? options = null,
        CancellationToken cancellationToken = default);
}

class AgentResponse : ChatResponse
{
}

public class AgentResponseUpdate : ChatResponseUpdate
{
}
```

- **PROS**: Familiar response types for anyone already using MEAI.
- **CONS**: Agent response types cannot evolve separately.

### Option 2 New Response types

We could create new response types for Agents.
The new types could also exclude properties that make less sense for agents, like ConversationId, which is abstracted away by AgentThread, or ModelId, where an agent might use multiple models.

```csharp
class Agent
{
    public abstract Task<AgentResponse> RunAsync(
        IReadOnlyCollection<ChatMessage> messages,
        AgentThread? thread = null,
        AgentRunOptions? options = null,
        CancellationToken cancellationToken = default);

    public abstract IAsyncEnumerable<AgentResponseUpdate> RunStreamingAsync(
        IReadOnlyCollection<ChatMessage> messages,
        AgentThread? thread = null,
        AgentRunOptions? options = null,
        CancellationToken cancellationToken = default);
}

class AgentResponse // Compare with ChatResponse
{
    public string Text { get; } // Aggregation of TextContent from messages.

    public IList<ChatMessage> Messages { get; set; }

    public string? ResponseId { get; set; }

    // Metadata
    public string? AuthorName { get; set; }
    public DateTimeOffset? CreatedAt { get; set; }
    public object? RawRepresentation { get; set; }
    public UsageDetails? Usage { get; set; }
    public AdditionalPropertiesDictionary? AdditionalProperties { get; set; }
}

// Not included in AgentResponse compared to ChatResponse
public ChatFinishReason? FinishReason { get; set; }
public string? ConversationId { get; set; }
public string? ModelId { get; set; }

public class AgentResponseUpdate // Compare with ChatResponseUpdate
{
    public string Text { get; } // Aggregation of TextContent from Contents.

    public IList<AIContent> Contents { get; set; }

    public string? ResponseId { get; set; }
    public string? MessageId { get; set; }

    // Metadata
    public ChatRole? Role { get; set; }
    public string? AuthorName { get; set; }
    public DateTimeOffset? CreatedAt { get; set; }
    public UsageDetails? Usage { get; set; }
    public object? RawRepresentation { get; set; }
    public AdditionalPropertiesDictionary? AdditionalProperties { get; set; }
}

// Not included in AgentResponseUpdate compared to ChatResponseUpdate
public ChatFinishReason? FinishReason { get; set; }
public string? ConversationId { get; set; }
public string? ModelId { get; set; }
```

- **PROS**: Agent response types can evolve separately. Types can still resemble MEAI response types to ensure a familiar experience for developers.
- **CONS**: No automatic inheritance of new properties from MEAI. (This might also be a pro.)

## Long Running Processes Options

Some agent protocols, like A2A, support long running agentic processes. When invoking the agent
in the non-streaming case, the agent may respond with an id of a process that was launched.

The caller is then expected to poll the service to get status updates using the id.
The caller may also subscribe to updates from the process using the id.

We therefore need to be able to support providing this type of response to agent callers.

- **Option 1** Add a new `AIContent` type and `ChatFinishReason` for long running processes.
- **Option 2** Add another property on a custom response type.

### Option 1: Add another AIContent type and ChatFinishReason for long running processes

```csharp
public class AgentRunContent : AIContent
{
    public string AgentRunId { get; set; }
}

// Add a new long running chat finish reason.
public class ChatFinishReason
{
    public static ChatFinishReason LongRunning { get; } = new ChatFinishReason("long_running");
}
```

- **PROS**: Fits well into the existing `ChatResponse` design.
- **CONS**: More complex for users to extract the required long running result (can be mitigated with extension methods, as sketched below).
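For illustration, a minimal sketch of the kind of extension method that could mitigate this; the `AgentRunContent` type comes from the snippet above, while the helper name is an assumption:

```csharp
using System.Linq;
using Microsoft.Extensions.AI;

public static class AgentRunContentExtensions
{
    // Hypothetical helper that digs the long running process id out of a response,
    // returning null when the response completed normally.
    public static string? GetAgentRunId(this ChatResponse response) =>
        response.Messages
            .SelectMany(m => m.Contents)
            .OfType<AgentRunContent>()
            .FirstOrDefault()?
            .AgentRunId;
}

// Usage:
// var runId = response.GetAgentRunId();
// if (runId is not null) { /* poll or subscribe for updates using runId */ }
```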
### Option 2: Add another property on responses for AgentRun

```csharp
class AgentResponse
{
    ...
    public AgentRun RunReference { get; set; } // Reference to long running process
    ...
}

public class AgentResponseUpdate
{
    ...
    public AgentRun RunReference { get; set; } // Reference to long running process
    ...
}

// Add a new long running chat finish reason.
public class ChatFinishReason
{
    ...
    public static ChatFinishReason LongRunning { get; } = new ChatFinishReason("long_running");
    ...
}

// Can be added in future: class representing long running processing by the agent
// that can be used to check for updates and status of the processing.
public class AgentRun
{
    public string AgentRunId { get; set; }
}
```

- **PROS**: Easy access to long running result values
- **CONS**: Requires custom response types.

## Structured user input options (Work in progress)

Some agent services may ask end users a question while also providing a list of options that the user can pick from, or a template for the input required.
We need to decide whether to maintain an abstraction for these, so that similar types of structured input from different agents can be used by callers without
needing to break out of the abstraction.

## Tool result options (Work in progress)

We need to consider abstractions for `AIContent` derived types for tool call results for common tool types beyond Function calls, e.g. CodeInterpreter, WebSearch, etc.

## StructuredOutputs

Structured output is a valuable aspect of any Agent system, since it forces an Agent to produce output in a required format, and may include required fields. This allows turning unstructured data into structured data easily using a general purpose language model.

Not all agent types necessarily support this, or necessarily support this in the same way.
Requesting a specific output schema at invocation time is widely supported by inference services though, and therefore inference based agents would support this well.
Custom agents on the other hand may not necessarily want to support this, and forcing all custom Agent implementations to have a final structured output step to produce this complicates implementations.
Custom agents may also have a built-in output schema that they always produce.

Options:

1. Support configuring the preferred structured output schema at agent construction time for those agents that support structured outputs.
2. Support configuring the preferred structured output schema at invocation time, and ignore/throw if not supported (similar to IChatClient).
3. Support both options, with the invocation time schema overriding the construction time (or built-in) schema if both are supported.

Note that where an agent doesn't support structured output, it may also be possible to use a decorator to produce structured output from the agent's unstructured response, thereby turning an agent that doesn't support this into one that does.

See [Structured Outputs Support](#structured-outputs-support) for a comparison of what other agent frameworks and protocols support.

To support a good user experience for structured outputs, I'm proposing that we follow the pattern used by MEAI.
We would add a generic version, `AgentResponse<T>`, that allows us to get the agent result already deserialized into our preferred type.
This would be coupled with generic overload extension methods for Run that automatically build a schema from the supplied type and update
the run options.

If we support requesting a schema at invocation time, the following would be the preferred approach:

```csharp
class Movie
{
    public string Title { get; set; }
    public string DirectorFullName { get; set; }
    public int ReleaseYear { get; set; }
}

AgentResponse<Movie[]> response = await agent.RunAsync<Movie[]>("What are the top 3 children's movies of the 80s.");
Movie[] movies = response.Result;
```

If we only support requesting a schema at agent creation time, or where an agent has a built-in schema, the following would be the preferred approach:

```csharp
AgentResponse response = await agent.RunAsync("What are the top 3 children's movies of the 80s.");
Movie[] movies = response.TryParseStructuredOutput<Movie[]>();
```

## Decision Outcome

### Response Type Options Decision

Option 1.1, with the caveat that we cannot control the output of all agents. However, as far as possible we should have appropriate AIContent derived types for
progress updates so that TextContent is not used for these.

### Custom Response Type Design Options Decision

Option 2 chosen so that we can vary Agent responses independently of Chat Client.

### StructuredOutputs Decision

We will not support structured output per run request, but individual agents are free to allow this on the concrete implementation or at construction time.
We will however add support for easily extracting a structured output type from the `AgentResponse`, for example via an extension method along the lines of the sketch below.
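For illustration only, a minimal sketch of what such an extraction helper could look like, assuming the response text contains JSON and using System.Text.Json; the method name `TryParseStructuredOutput` follows the earlier example, but the exact shape is not decided here:

```csharp
using System.Text.Json;

public static class AgentResponseStructuredOutputExtensions
{
    private static readonly JsonSerializerOptions s_options = new(JsonSerializerDefaults.Web);

    // Hypothetical helper that attempts to deserialize the aggregated response text into T.
    // Returns default(T) when the text is not valid JSON for the requested type.
    public static T? TryParseStructuredOutput<T>(this AgentResponse response)
    {
        try
        {
            return JsonSerializer.Deserialize<T>(response.Text, s_options);
        }
        catch (JsonException)
        {
            return default;
        }
    }
}
```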
## Addendum 1: AIContent Derived Types for different response types / Gap Analysis (Work in progress)

We need to decide which AIContent types each agent response type will be mapped to.

| Number | DataType | AIContent Type |
|-|-|-|
| 1. | General response messages to the user | TextContent + DataContent + UriContent |
| 2. | Structured confirmation requests to the user | ? |
| 3. | Function invocation activities executed (both local and remote). For information only. | FunctionCallContent + FunctionResultContent |
| 4. | Tool invocation activities executed (both local and remote). For information only. | FunctionCallContent/FunctionResultContent/Custom ? |
| 5. | Reasoning/Thinking output. For information only. | TextReasoningContent |
| 6. | Handoffs / transitions from agent to agent. | ? |
| 7. | An indication that the agent is responding (i.e. typing) as if it's a real human. | ? |
| 8. | Complete messages in addition to updates, when streaming | TextContent |
| 9. | Id for long running process that is launched | ? |
| 10. | Memory storage / lookups (are these just traces?) | ? |
| 11. | RAG indexing / lookups (are these just traces?) | ? |
| 12. | General status updates for human consumption / Tracing | ? |
| 13. | Unknown Type | AIContent |

## Addendum 2: Other SDK feature comparison

### Structured Outputs Support

1. Configure Schema on Agent at Agent construction
2. Pass schema at Agent invocation

| SDK | Structured Outputs support |
|-|-|
| AutoGen | **Approach 1** Supports [configuring an agent](https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/agents.html#structured-output) at agent creation. |
| Google ADK | **Approach 1** Both [input and output schemas can be specified for LLM Agents](https://google.github.io/adk-docs/agents/llm-agents/#structuring-data-input_schema-output_schema-output_key) at construction time. This option is specific to this agent type and other agent types do not necessarily support it. |
| AWS (Strands) | **Approach 2** Supports a special invocation method called [structured_output](https://strandsagents.com/latest/documentation/docs/api-reference/python/agent/agent/#strands.agent.agent.Agent.structured_output) |
| LangGraph | **Approach 1** Supports [configuring an agent](https://langchain-ai.github.io/langgraph/agents/agents/?h=structured#6-configure-structured-output) at agent construction time, and a [structured response](https://langchain-ai.github.io/langgraph/agents/run_agents/#output-format) can be retrieved as a special property on the agent response |
| Agno | **Approach 1** Supports [configuring an agent](https://docs.agno.com/examples/getting-started/structured-output) at agent construction time |
| A2A | **Informal Approach 2** Doesn't formally support schema negotiation, but [hints can be provided via metadata](https://a2a-protocol.org/latest/specification/#97-structured-data-exchange-requesting-and-providing-json) at invocation time |
| Protocol Activity | Supports returning [Complex types](https://github.com/microsoft/Agents/blob/main/specs/activity/protocol-activity.md#complex-types) but no support for requesting a type |

### Response Reason Support

| SDK | Response Reason support |
|-|-|
| AutoGen | Supports a [stop reason](https://microsoft.github.io/autogen/stable/reference/python/autogen_agentchat.base.html#autogen_agentchat.base.TaskResult.stop_reason) which is a freeform text string |
| Google ADK | [No equivalent present](https://github.com/google/adk-python/blob/main/src/google/adk/events/event.py) |
| AWS (Strands) | Exposes a [stop_reason](https://strandsagents.com/latest/documentation/docs/api-reference/python/types/event_loop/#strands.types.event_loop.StopReason) property on the [AgentResult](https://strandsagents.com/latest/documentation/docs/api-reference/python/agent/agent_result/) class with options that are tied closely to LLM operations. |
| LangGraph | No equivalent present, output contains only [messages](https://langchain-ai.github.io/langgraph/agents/run_agents/#output-format) |
| Agno | [No equivalent present](https://docs.agno.com/reference/agents/run-response) |
| A2A | No equivalent present, response only contains a [message](https://a2a-protocol.org/latest/specification/#64-message-object) or [task](https://a2a-protocol.org/latest/specification/#61-task-object). |
| Protocol Activity | [No equivalent present.](https://github.com/microsoft/Agents/blob/main/specs/activity/protocol-activity.md) |
docs/decisions/0002-agent-tools.md (new file, 1896 lines; diff too large to show)
docs/decisions/0003-agent-opentelemetry-instrumentation.md (new file, 144 lines)
---
status: proposed
contact: rogerbarreto
date: 2025-07-14
deciders: stephentoub, markwallace-microsoft, rogerbarreto, westey-m
informed:
---

# Agent OpenTelemetry Instrumentation

## Context and Problem Statement

Currently, the Agent Framework lacks comprehensive observability and telemetry capabilities, making it difficult for developers to monitor agent performance, track usage patterns, debug issues, and gain insights into agent behavior in production environments. While the underlying ChatClient implementations may have their own telemetry, there is no standardized way to capture agent-specific metrics and traces that provide visibility into agent operations, token usage, response times, and error patterns at the agent abstraction level.

## Decision Drivers

- **Compliance**: The implementation should adhere to established OpenTelemetry semantic conventions for agents, ensuring consistency and interoperability with existing telemetry systems.
- **Observability Requirements**: Developers need comprehensive telemetry to monitor agent performance, track usage patterns, and debug issues in production environments.
- **Standardization**: The solution must follow established OpenTelemetry semantic conventions and integrate seamlessly with existing .NET telemetry infrastructure.
- **Microsoft.Extensions.AI Alignment**: The implementation should follow the exact patterns and conventions established by Microsoft.Extensions.AI's OpenTelemetry instrumentation.
- **Non-Intrusive Design**: Telemetry should be optional and not impact the core agent functionality or performance when disabled.
- **Agent-Level Insights**: The telemetry should capture agent-specific operations without duplicating underlying ChatClient telemetry.
- **Extensibility**: The solution should support future enhancements and additional telemetry scenarios.

## Considered Options

### Option 1: Direct Integration into Core Agent Classes

Embed OpenTelemetry instrumentation directly into the base `Agent` class and `ChatClientAgent` implementations.

#### Pros

- Automatic telemetry for all agent implementations
- No additional wrapper classes needed
- Consistent telemetry across all agents

#### Cons

- Violates the single responsibility principle
- Increases complexity of core agent classes
- Makes telemetry mandatory rather than optional
- Harder to test and maintain
- Couples telemetry concerns with business logic

### Option 2: Aspect-Oriented Programming (AOP) Approach

Use interceptors or AOP frameworks to inject telemetry behavior into agent methods.

#### Pros

- Clean separation of concerns
- Non-intrusive to existing code
- Can be applied selectively

#### Cons

- Adds complexity with AOP framework dependencies
- Runtime overhead for interception
- Harder to debug and understand
- Not consistent with Microsoft.Extensions.AI patterns

### Option 3: OpenTelemetryAgent Wrapper Pattern

Create a delegating `OpenTelemetryAgent` wrapper class that implements the `Agent` interface and wraps any existing agent with telemetry instrumentation, following the exact pattern of Microsoft.Extensions.AI's `OpenTelemetryChatClient`.

#### Pros

- Follows established Microsoft.Extensions.AI patterns exactly
- Clean separation of concerns
- Optional and non-intrusive
- Easy to test and maintain
- Consistent with .NET telemetry conventions
- Supports any agent implementation
- Provides agent-level telemetry without duplicating ChatClient telemetry

#### Cons

- Requires explicit wrapping of agents
- Additional object allocation for the wrapper

## Decision Outcome

Chosen option: "OpenTelemetryAgent Wrapper Pattern", because it follows the established Microsoft.Extensions.AI patterns exactly, provides clean separation of concerns, keeps telemetry optional, and offers the best balance of functionality, maintainability, and consistency with existing .NET telemetry infrastructure.

### Implementation Details

The implementation includes:

1. **OpenTelemetryAgent Wrapper Class**: A delegating agent that wraps any `Agent` implementation with telemetry instrumentation (see the sketch after this list)
2. **AgentOpenTelemetryConsts**: Comprehensive constants for telemetry attribute names and metric definitions
3. **Extension Methods**: `.WithOpenTelemetry()` extension method for easy agent wrapping
4. **Comprehensive Test Suite**: Full test coverage following Microsoft.Extensions.AI testing patterns
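For illustration, a minimal sketch of the delegating wrapper shape described in item 1; the activity source name, tag names and the exact `Agent` signatures are assumptions based on this document and ADR 0001, not the shipped API:

```csharp
using System.Diagnostics;
using Microsoft.Extensions.AI;

// Illustrative delegating wrapper: starts an activity around the inner agent's run
// and records a few of the attributes listed under "Telemetry Data Captured".
public sealed class OpenTelemetryAgent : Agent
{
    private readonly Agent _innerAgent;
    private readonly ActivitySource _activitySource;

    public OpenTelemetryAgent(Agent innerAgent, string sourceName = "Microsoft.Agents.AI") // source name is an assumption
    {
        _innerAgent = innerAgent;
        _activitySource = new ActivitySource(sourceName);
    }

    public override async Task<AgentResponse> RunAsync(
        IReadOnlyCollection<ChatMessage> messages,
        AgentThread? thread = null,
        AgentRunOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        using Activity? activity = _activitySource.StartActivity("agent.run");
        activity?.SetTag("agent.request.message_count", messages.Count);
        try
        {
            AgentResponse response = await _innerAgent.RunAsync(messages, thread, options, cancellationToken);
            activity?.SetTag("agent.response.message_count", response.Messages.Count);
            return response;
        }
        catch (Exception ex)
        {
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            throw;
        }
    }

    // Streaming run delegated without instrumentation here, for brevity.
    public override IAsyncEnumerable<AgentResponseUpdate> RunStreamingAsync(
        IReadOnlyCollection<ChatMessage> messages,
        AgentThread? thread = null,
        AgentRunOptions? options = null,
        CancellationToken cancellationToken = default) =>
        _innerAgent.RunStreamingAsync(messages, thread, options, cancellationToken);
}
```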
### Telemetry Data Captured

**Activities/Spans:**

- `agent.operation.name` (agent.run, agent.run_streaming)
- `agent.request.id`, `agent.request.name`, `agent.request.instructions`
- `agent.request.message_count`, `agent.request.thread_id`
- `agent.response.id`, `agent.response.message_count`, `agent.response.finish_reason`
- `agent.usage.input_tokens`, `agent.usage.output_tokens`
- Error information and activity status codes

**Metrics** (see the sketch after this list):

- Operation duration histogram with proper buckets
- Token usage histogram (input/output tokens)
- Request count counter
- All metrics tagged with operation type and agent name
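As a rough illustration of how those metrics could be defined with System.Diagnostics.Metrics; the meter and instrument names here are assumptions, the real values would come from `AgentOpenTelemetryConsts`:

```csharp
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical metric definitions mirroring the list above.
var meter = new Meter("Microsoft.Agents.AI");

Histogram<double> operationDuration =
    meter.CreateHistogram<double>("agent.operation.duration", unit: "s");
Histogram<long> tokenUsage =
    meter.CreateHistogram<long>("agent.usage.tokens", unit: "{token}");
Counter<long> requestCount =
    meter.CreateCounter<long>("agent.requests");

// Recording a completed run, tagged with operation type and agent name.
var tags = new TagList { { "agent.operation.name", "agent.run" }, { "agent.name", "MyAgent" } };
operationDuration.Record(1.23, tags);
tokenUsage.Record(42, tags);
requestCount.Add(1, tags);
```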
### Consequences

- **Good**: Provides comprehensive agent-level observability following established patterns
- **Good**: Non-intrusive and optional implementation that doesn't affect core functionality
- **Good**: Consistent with Microsoft.Extensions.AI telemetry conventions
- **Good**: Easy to integrate with existing OpenTelemetry infrastructure
- **Good**: Supports debugging, monitoring, and performance analysis
- **Neutral**: Requires explicit wrapping of agents with `.WithOpenTelemetry()`
- **Neutral**: Additional object allocation for the telemetry wrapper

## Validation

The implementation is validated through:

1. **Comprehensive Unit Tests**: 16 test methods covering all scenarios including success, error, streaming, and edge cases
2. **Integration Testing**: Step05 telemetry sample demonstrating real-world usage
3. **Pattern Compliance**: Exact adherence to Microsoft.Extensions.AI OpenTelemetry patterns
4. **Semantic Convention Compliance**: Follows OpenTelemetry semantic conventions for telemetry data

## More Information

### Usage Example

```csharp
// Create TracerProvider
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddSource(AgentOpenTelemetryConsts.DefaultSourceName)
    .AddConsoleExporter()
    .Build();

// Create and wrap agent with telemetry
var baseAgent = new ChatClientAgent(chatClient, options);
using var telemetryAgent = baseAgent.WithOpenTelemetry();

// Use agent normally - telemetry is captured automatically
var response = await telemetryAgent.RunAsync(messages);
```

### Relationship to Microsoft.Extensions.AI

This implementation follows the exact patterns established by Microsoft.Extensions.AI's OpenTelemetry instrumentation, ensuring consistency across the AI ecosystem and leveraging proven patterns for telemetry integration.
docs/decisions/0004-foundry-sdk-extensions.md (new file, 62 lines)
---
# These are optional elements. Feel free to remove any of them.
status: proposed
contact: markwallace-microsoft
date: 2025-08-06
deciders: markwallace-microsoft, westey-m, quibitron, trrwilson
consulted:
informed:
---

# `Azure.AI.Agents.Persistent` package Extension Methods for Agent Framework

## Context and Problem Statement

To align the `Azure.AI.Agents.Persistent` package and the Agent Framework, a set of extension methods has been created which allows a developer to create or retrieve an `AIAgent` using the `PersistentAgentsClient`.
The purpose of this ADR is to decide where these extension methods should live. A sketch of the kind of extension method in question is shown below.
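For context only, a rough sketch of the general shape such an extension method could take; the method name, parameters and the adapter type are illustrative assumptions rather than the actual API surface:

```csharp
using Azure.AI.Agents.Persistent;

public static class PersistentAgentsClientExtensions
{
    // Hypothetical shape only: adapts a service-side persistent agent, identified by id,
    // to the Agent Framework's AIAgent abstraction.
    public static AIAgent GetAIAgent(this PersistentAgentsClient client, string agentId) =>
        new PersistentAgentsClientAIAgent(client, agentId); // hypothetical adapter type
}
```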
## Decision Drivers

- Provide the optimum experience for developers.
- Avoid adding additional dependencies to the `Azure.AI.Agents.Persistent` package (now and in the future).

## Considered Options

- Add the extension methods to the `Azure.AI.Agents.Persistent` package and change its dependencies
- Add the extension methods to the `Azure.AI.Agents.Persistent` package without changing its dependencies
- Add the extension methods to a `Microsoft.Extensions.AI.Azure` package

### Add the extension methods to the `Azure.AI.Agents.Persistent` package and change its dependencies

- `Azure.AI.Agents.Persistent` would depend on `Microsoft.Extensions.AI` instead of `Microsoft.Extensions.AI.Abstractions`

- Good, because extension methods are in the `Azure.AI.Agents.Persistent` package and can be easily kept up-to-date
- Good, because developers don't need to explicitly depend on a new package to get Agent Framework functionality
- Bad, because it introduces additional dependencies which would possibly grow over time

### Add the extension methods to the `Azure.AI.Agents.Persistent` package without changing its dependencies

- `Azure.AI.Agents.Persistent` would depend on `Microsoft.Extensions.AI.Abstractions` (as it currently does)
- `ChatClientAgent` and `FunctionInvokingChatClient` would move to `Microsoft.Extensions.AI.Abstractions`

- Good, because extension methods are in the `Azure.AI.Agents.Persistent` package and can be easily kept up-to-date
- Good, because developers don't need to explicitly depend on a new package to get Agent Framework functionality
- Good, because it introduces minimal additional dependencies
- Bad, because it adds additional dependencies to `Microsoft.Extensions.AI.Abstractions`, and these additional dependencies add up as transitive dependencies of `Azure.AI.Agents.Persistent`

### Add the extension methods to a `Microsoft.Extensions.AI.Azure` package

- Introduce a new package called `Microsoft.Extensions.AI.Azure` where the extension methods would live
- `Azure.AI.Agents.Persistent` does not change

- Good, because it introduces no additional dependencies to the `Azure.AI.Agents.Persistent` package
- Bad, because extension methods are not in the `Azure.AI.Agents.Persistent` package and cannot be easily kept up-to-date
- Bad, because developers need to explicitly depend on a new package to get Agent Framework functionality

## Decision Outcome

Chosen option: "Add the extension methods to a `Microsoft.Extensions.AI.Azure` package", because
it introduces no additional dependencies to the `Azure.AI.Agents.Persistent` package.
docs/decisions/0005-python-naming-conventions.md (new file, 70 lines)
---
status: accepted
contact: eavanvalkenburg
date: 2025-09-04
deciders: markwallace-microsoft, dmytrostruk, peterychang, ekzhu, sphenry
consulted: taochenosu, alliscode, moonbox3, johanste
---

# Python naming conventions and renames (ADR)

## Context and Problem Statement

The project has a public .NET surface and a Python surface. During a cross-language alignment effort the community proposed renames to make the Python surface more idiomatic while preserving discoverability and mapping to the .NET names. This ADR captures the final naming decisions (or the proposed ones), the rationale, and the alternatives considered and rejected.

## Decision drivers

- Follow Python naming conventions (PEP 8) where appropriate (snake_case for functions and module-level variables, PascalCase for classes).
- Preserve conceptual parity with .NET names to make it easy for developers reading both surfaces to correlate types and behaviors.
- Avoid ambiguous or overloaded names in Python that could conflict with the stdlib, common third-party packages, or existing package/module names.
- Prefer clarity and discoverability in the public API surface over strict symmetry with .NET when Python conventions conflict.
- Minimize churn and migration burden for existing Python users where backwards compatibility is feasible.

## Principles applied

- Map .NET PascalCase class names to PascalCase Python classes when they represent types.
- Map .NET method/field names that are camelCase to snake_case in Python where they will be used as functions or module-level attributes.
- When a .NET name is an acronym or initialism, use Python-friendly casing (e.g., `Http` -> `HTTP` in classes, but acronyms in function names should be lowercased per PEP 8 where sensible).
- Avoid names that shadow common stdlib modules (e.g., `logging`, `asyncio`) or widely used third-party modules.
- When multiple reasonable Python names exist, prefer the one that communicates intent most clearly to Python users, and record rejected alternatives in the table with justification.

## Renaming table

The table below represents the majority of the naming changes discussed in issue #506. Each row has:

- Original and/or .NET name — the canonical name used in dotnet or earlier Python variants.
- New name — the chosen Python name.
- Status — accepted if the new name differs from the original, rejected if unchanged.
- Reasoning — short rationale why the new name was chosen.
- Rejected alternatives — other candidate new names that were considered and rejected; include the rejected 'new name' values and the reason each was rejected.

| Original and/or .NET name | New name (Python) | Status | Reasoning | Rejected alternatives (as "new name" + reason rejected) |
|---|---|---|---|---|
| AIAgent | AgentProtocol | accepted | The AI prefix is meaningless in the context of the Agent Framework, and the `protocol` suffix makes it very clear that this is a protocol, and not a concrete agent implementation. | <ul><li>AgentLike, not seen in many other places, but was a frontrunner.</li><li>Agent, as too generic.</li><li>BaseAgent/AbstractAgent, it is not a base/ABC class and should not be treated as such.</li></ul> |
| ChatClientAgent | ChatAgent | accepted | Type name is shorter, while it is still clear that a ChatClient is used, also by virtue of the first parameter for initialization. | Agent, as too generic. |
| ChatClient/IChatClient (in dotnet) | ChatClientProtocol | accepted | Keeping this protocol in sync with the AgentProtocol naming. | Similar as AgentProtocol. |
| ChatClientBase | BaseChatClient | accepted | Following convention, serves as base class so, should be named accordingly. | None |
| AITool | ToolProtocol | accepted | In line with other protocols. | Tool, too generic. |
| AIToolBase | BaseTool | accepted | More descriptive than just Tool, while still concise. | AbstractTool/BaseTool, it is not an abstract/base class and should not be treated as such. |
| ChatRole | Role | accepted | More concise while still clear in context. | None |
| ChatFinishReason | FinishReason | accepted | More concise while still clear in context. | None |
| AIContent | BaseContent | accepted | More accurate as it serves as the base class for all content types. | Content, too generic. |
| AIContents | Contents | accepted | This is the annotated typing object that is the union of all concrete content types, so plural makes sense and since this is used as a type hint, the generic nature of the name is acceptable. | None |
| AIAnnotations | Annotations | accepted | In sync with contents | None |
| AIAnnotation | BaseAnnotation | accepted | In sync with contents | None |
| *Mcp* & *Http* | *MCP* & *HTTP* | accepted | Acronyms should be uppercased in class names, according to PEP 8. | None |
| `agent.run_streaming` | `agent.run_stream` | accepted | Shorter and more closely aligns with AutoGen and Semantic Kernel names for the same methods. | None |
| `workflow.run_streaming` | `workflow.run_stream` | accepted | In sync with `agent.run_stream` and shorter and more closely aligns with AutoGen and Semantic Kernel names for the same methods. | None |
| AgentResponse & AgentResponseUpdate | AgentResponse & AgentResponseUpdate | rejected | Rejected, because it is the response to a run invocation and AgentResponse is too generic. | None |
| *Content | * | rejected | Rejected other content type renames (removing `Content` suffix) because it would reduce clarity and discoverability. | Item was also considered, but rejected as it is very similar to Content, but would be inconsistent with dotnet. |
| ChatResponse & ChatResponseUpdate | Response & ResponseUpdate | rejected | Rejected, because Response is too generic. | None |

## Naming guidance

In general Python tends to prefer shorter names, while .NET tends to prefer more descriptive names. The table above captures the specific renames agreed upon, but in general the following guidelines were applied:

- Use [PEP 8](https://peps.python.org/pep-0008/) for generic naming conventions (snake_case for functions and module-level variables, PascalCase for classes).

When mapping .NET names to Python:

- Remove the `AI` prefix when appropriate, as it is often redundant in the context of an AI SDK.
- Remove the `Chat` prefix when the context is clear (e.g., Role and FinishReason).
- Use the `Protocol` suffix for interfaces/protocols to clarify their purpose.
- Use the `Base` prefix for base classes that are not abstract but serve as a common ancestor for internal implementations.
- When readability improves while it is still easy to understand what it does and how it maps to the .NET name, prefer the shorter name.
521
docs/decisions/0006-userapproval.md
Normal file
@@ -0,0 +1,521 @@
|
||||
---
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: accepted
|
||||
contact: westey-m
|
||||
date: 2025-09-12
|
||||
deciders: sergeymenshykh, markwallace-microsoft, rogerbarreto, dmytrostruk, westey-m, eavanvalkenburg, stephentoub, peterychang
|
||||
consulted:
|
||||
informed:
|
||||
---
|
||||
|
||||
# Agent User Approvals Content Types and FunctionCall approvals Design
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
When agents are operating on behalf of a user, there may be cases where the agent requires user approval to continue an operation.
|
||||
This is complicated by the fact that an agent may be remote and the user may not immediately be available to provide the approval.
|
||||
|
||||
Inference services are also increasingly supporting built-in tools or service side MCP invocation, which may require user approval before the tool can be invoked.
|
||||
|
||||
This document aims to provide options and capture the decision on how to model this user approval interaction with the agent caller.
|
||||
|
||||
See various features that would need to be supported via this type of mechanism, plus how various other frameworks support this:
|
||||
|
||||
- Also see [dotnet issue 6492](https://github.com/dotnet/extensions/issues/6492), which discusses the need for a similar pattern in the context of MCP approvals.
|
||||
- Also see [the openai human-in-the-loop guide](https://openai.github.io/openai-agents-js/guides/human-in-the-loop/#approval-requests).
|
||||
- Also see [the openai MCP guide](https://openai.github.io/openai-agents-js/guides/mcp/#optional-approval-flow).
|
||||
- Also see [MCP Approval Requests from OpenAI](https://platform.openai.com/docs/guides/tools-remote-mcp#approvals).
|
||||
- Also see [Azure AI Foundry MCP Approvals](https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/tools/model-context-protocol-samples?pivots=rest#submit-your-approval).
|
||||
- Also see [MCP Elicitation requests](https://modelcontextprotocol.io/specification/draft/client/elicitation)
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- Agents should encapsulate their internal logic and not leak it to the caller.
|
||||
- We need to support approvals for local actions as well as remote actions.
|
||||
- We need to support approvals for service-side tool use, such as remote MCP tool invocations
|
||||
- We should consider how other user input requests will be modeled, so that we can have a consistent approach for user input requests and approvals.
|
||||
|
||||
## Considered Options
|
||||
|
||||
### 1. Return a FunctionCallContent to the agent caller, that it executes
|
||||
|
||||
This introduces a manual function calling element to agents, where the caller of the agent is expected to invoke the function if the user approves it.
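For reference, a minimal sketch of what this would look like for the caller. The `myFunctions` collection, the message handling, and the result plumbing are assumptions for the example, not an API proposal:

```csharp
// Option 1 sketch: the caller is responsible for executing the agent's function calls.
var response = await agent.RunAsync("Please book me a flight for Friday to Paris.", thread);

List<ChatMessage> messages = new();
foreach (var call in response.Messages.SelectMany(m => m.Contents).OfType<FunctionCallContent>())
{
    // The caller must know which AIFunction the call maps to and how to invoke it.
    AIFunction function = myFunctions.First(f => f.Name == call.Name);
    object? result = await function.InvokeAsync(new AIFunctionArguments(call.Arguments));

    messages.Add(new ChatMessage(ChatRole.Tool, [new FunctionResultContent(call.CallId, result)]));
}

// The caller then has to hand the results back to the agent to continue.
response = await agent.RunAsync(messages, thread);
```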
|
||||
|
||||
This approach is problematic for a number of reasons:
|
||||
|
||||
- This may not work for remote agents (e.g. via A2A), where the function that the agent wants to call does not reside on the caller's machine.
|
||||
- The main value prop of an agent is to encapsulate the internal logic of the agent, but this leaks that logic to the caller, requiring the caller to know how to invoke the agent's function calls.
|
||||
- Inference services are introducing their own approval content types for server-side tool or function invocation, and these will not be addressed by this approach.
|
||||
|
||||
### 2. Introduce an ApprovalCallback in AgentRunOptions and ChatOptions
|
||||
|
||||
This approach allows a caller to provide a callback that the agent can invoke when it requires user approval.
|
||||
|
||||
This approach is easy to use when the user and agent are in the same application context, such as a desktop application, where the application can show the approval request to the user and get their response from the callback before continuing the agent run.
|
||||
|
||||
This approach does not work well for cases where the agent is hosted in a remote service, and where there is no user available to provide the approval in the same application context.
|
||||
For cases like this, the agent needs to be suspended, and a network response must be sent to the client app. After the user provides their approval, the client app must call the service that hosts the agent again, with the user's decision, and the agent needs to be resumed. However, with a callback, the agent is deep in the call stack and cannot be suspended or resumed like this.
|
||||
|
||||
```csharp
class AgentRunOptions
{
    public Func<ApprovalRequestContent, Task<ApprovalResponseContent>>? ApprovalCallback { get; set; }
}

await agent.RunAsync("Please book me a flight for Friday to Paris.", thread, new AgentRunOptions
{
    ApprovalCallback = async (approvalRequest) =>
    {
        // Show the approval request to the user in the appropriate format.
        // The user can then approve or reject the request.
        // The optional FunctionCallContent can be used to show the user what function the agent wants to call with the parameter set:
        // approvalRequest.FunctionCall?.Arguments.

        // If the user approves:
        return new ApprovalResponseContent { Id = approvalRequest.Id, Approved = true };
    }
});
```
|
||||
|
||||
### 3. Introduce new ApprovalRequestContent and ApprovalResponseContent types
|
||||
|
||||
The agent would return an `ApprovalRequestContent` to the caller, which would then be responsible for getting approval from the user in whatever way is appropriate for the application.
|
||||
The caller would then invoke the agent again with an `ApprovalResponseContent` to the agent containing the user decision.
|
||||
|
||||
When an agent returns an `ApprovalRequestContent`, the run is finished for the time being, and to continue, the agent must be invoked again with an `ApprovalResponseContent` on the same thread as the original request. This doesn't, of course, have to be the exact same thread object, but it should have contents equivalent to the original thread, since the agent would have stored the `ApprovalRequestContent` in its thread state.
|
||||
|
||||
The `ApprovalRequestContent` could contain an optional `FunctionCallContent` if the approval is for a function call, along with any additional information that the agent wants to provide to the user to help them make a decision.
|
||||
|
||||
It is up to the agent to decide when and if a user approval is required, and therefore when to return an `ApprovalRequestContent`.
|
||||
|
||||
`ApprovalRequestContent` and `ApprovalResponseContent` will not necessarily always map to a supported content type for the underlying service or agent thread storage.
Specifically, when we decide in the IChatClient stack to ask the user for approval of a function call, this does not mean that the underlying AI service or
service-side thread type (where applicable) supports the concept of a function call approval request. While we can store the approval requests and responses in local
threads, service-managed threads won't necessarily support this. For service-managed threads, there will therefore be no long-term record of the approval request in the chat history.
We should, however, log approvals so that there is a trace of them for debugging and auditing purposes.
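As a minimal sketch of what such logging could look like (the helper name and the `ILogger` plumbing are assumptions, not part of the proposal):

```csharp
using Microsoft.Extensions.Logging;

// Sketch: leave an audit trail whenever an approval decision is received.
static void LogApprovalDecision(ILogger logger, ApprovalResponseContent approvalResponse) =>
    logger.LogInformation(
        "Function call {CallId} ({FunctionName}) was {Decision} by the user.",
        approvalResponse.FunctionCall?.CallId,
        approvalResponse.FunctionCall?.Name,
        approvalResponse.Approved ? "approved" : "rejected");
```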
|
||||
|
||||
Suggested Types:
|
||||
|
||||
```csharp
|
||||
class ApprovalRequestContent : AIContent
|
||||
{
|
||||
// An ID to uniquely identify the approval request/response pair.
|
||||
public string Id { get; set; }
|
||||
|
||||
// An optional user targeted message to explain what needs to be approved.
|
||||
public string? Text { get; set; }
|
||||
|
||||
// Optional: If the approval is for a function call, this will contain the function call content.
|
||||
public FunctionCallContent? FunctionCall { get; set; }
|
||||
|
||||
public ApprovalResponseContent CreateApproval()
|
||||
{
|
||||
return new ApprovalResponseContent
|
||||
{
|
||||
Id = this.Id,
|
||||
Approved = true,
|
||||
FunctionCall = this.FunctionCall
|
||||
};
|
||||
}
|
||||
|
||||
public ApprovalResponseContent CreateRejection()
|
||||
{
|
||||
return new ApprovalResponseContent
|
||||
{
|
||||
Id = this.Id,
|
||||
Approved = false,
|
||||
FunctionCall = this.FunctionCall
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
class ApprovalResponseContent : AIContent
|
||||
{
|
||||
// An ID to uniquely identify the approval request/response pair.
|
||||
public string Id { get; set; }
|
||||
|
||||
// Indicates whether the user approved the request.
|
||||
public bool Approved { get; set; }
|
||||
|
||||
// Optional: If the approval is for a function call, this will contain the function call content.
|
||||
public FunctionCallContent? FunctionCall { get; set; }
|
||||
}
|
||||
|
||||
var response = await agent.RunAsync("Please book me a flight for Friday to Paris.", thread);
|
||||
while (response.ApprovalRequests.Count > 0)
|
||||
{
|
||||
List<ChatMessage> messages = new List<ChatMessage>();
|
||||
foreach (var approvalRequest in response.ApprovalRequests)
|
||||
{
|
||||
// Show the approval request to the user in the appropriate format.
|
||||
// The user can then approve or reject the request.
|
||||
// The optional FunctionCallContent can be used to show the user what function the agent wants to call with the parameter set:
|
||||
// approvalRequest.FunctionCall?.Arguments.
|
||||
// The Text property of the ApprovalRequestContent can also be used to show the user any additional textual context about the request.
|
||||
|
||||
// If the user approves:
|
||||
messages.Add(new ChatMessage(ChatRole.User, [approvalRequest.CreateApproval()]));
|
||||
}
|
||||
|
||||
// Get the next response from the agent.
|
||||
response = await agent.RunAsync(messages, thread);
|
||||
}
|
||||
|
||||
class AgentResponse
|
||||
{
|
||||
...
|
||||
|
||||
// A new property on AgentResponse to aggregate the ApprovalRequestContent items from
|
||||
// the response messages (Similar to the Text property).
|
||||
public IReadOnlyList<ApprovalRequestContent> ApprovalRequests { get; set; }
|
||||
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
### 4. Introduce new Container UserInputRequestContent and UserInputResponseContent types
|
||||
|
||||
This approach is similar to the `ApprovalRequestContent` and `ApprovalResponseContent` types, but is more generic and can be used for any type of user input request, not just approvals.
|
||||
|
||||
There is some ambiguity with this approach. When using an LLM based agent the LLM may return a text response about missing user input.
|
||||
E.g the LLM may need to invoke a function but the user did not supply all necessary information to fill out all arguments.
|
||||
Typically an LLM would just respond with a text message asking the user for the missing information.
|
||||
In this case, the message is not distinguishable from any other result message, and therefore cannot be returned to the caller as a `UserInputRequestContent`, even though it is conceptually a type of unstructured user input request. Ultimately our types are modeled to make it easy for callers to decide on the right way to represent this to users. E.g. is it just a regular message to show to users, or do we need a special UX for it.
|
||||
|
||||
Suggested Types:
|
||||
|
||||
```csharp
|
||||
class UserInputRequestContent : AIContent
|
||||
{
|
||||
// An ID to uniquely identify the approval request/response pair.
|
||||
public string ApprovalId { get; set; }
|
||||
|
||||
// DecisionTarget could contain:
|
||||
// FunctionCallContent: The function call that the agent wants to invoke.
|
||||
// TextContent: Text that describes the question for that the user should answer.
|
||||
object? DecisionTarget { get; set; } // Anything else the user may need to make a decision about.
|
||||
|
||||
// Possible InputFormat subclasses:
|
||||
// SchemaInputFormat: Contains a schema for the user input.
|
||||
// ApprovalInputFormat: Indicates that the user needs to approve something.
|
||||
// FreeformTextInputFormat: Indicates that the user can provide freeform text input.
|
||||
// Other formats can be added as needed, e.g. cards when using activity protocol.
|
||||
public InputFormat InputFormat { get; set; } // How the user should provide input (e.g., form, options, etc.).
|
||||
}
|
||||
|
||||
class UserInputResponseContent : AIContent
|
||||
{
|
||||
// An ID to uniquely identify the approval request/response pair.
|
||||
public string ApprovalId { get; set; }
|
||||
|
||||
// Possible UserInputResult subclasses:
|
||||
// SchemaInputResult: Contains the structured data provided by the user.
|
||||
// ApprovalResult: Contains a bool with approved / rejected.
|
||||
// FreeformTextResult: Contains the freeform text input provided by the user.
|
||||
public UserInputResult Result { get; set; } // The user input.
|
||||
|
||||
public object? DecisionTarget { get; set; } // A copy of the DecisionTarget from the UserInputRequestContent, if applicable.
|
||||
}
|
||||
|
||||
var response = await agent.RunAsync("Please book me a flight for Friday to Paris.", thread);
|
||||
while (response.UserInputRequests.Any())
|
||||
{
|
||||
List<ChatMessage> messages = new List<ChatMessage>();
|
||||
foreach (var userInputRequest in response.UserInputRequests)
|
||||
{
|
||||
// Show the user input request to the user in the appropriate format.
|
||||
// The DecisionTarget can be used to show the user what function the agent wants to call with the parameter set.
|
||||
// The InputFormat property can be used to determine the type of UX when allowing users to provide input.
|
||||
|
||||
if (userInputRequest.InputFormat is ApprovalInputFormat approvalInputFormat)
|
||||
{
|
||||
// Here we need to show the user an approval request.
|
||||
// We can use the DecisionTarget to show e.g. the function call that the agent wants to invoke.
|
||||
// The user can then approve or reject the request.
|
||||
|
||||
// If the user approves:
|
||||
var approvalMessage = new ChatMessage(ChatRole.User, new UserInputResponseContent {
|
||||
ApprovalId = userInputRequest.ApprovalId,
|
||||
Result = new ApprovalResult { Approved = true },
|
||||
DecisionTarget = userInputRequest.DecisionTarget
|
||||
});
|
||||
messages.Add(approvalMessage);
|
||||
}
|
||||
else
|
||||
{
|
||||
throw new NotSupportedException("Unsupported InputFormat type.");
|
||||
}
|
||||
}
|
||||
|
||||
// Get the next response from the agent.
|
||||
response = await agent.RunAsync(messages, thread);
|
||||
}
|
||||
|
||||
class AgentResponse
|
||||
{
|
||||
...
|
||||
|
||||
// A new property on AgentResponse to aggregate the UserInputRequestContent items from
|
||||
// the response messages (Similar to the Text property).
|
||||
public IReadOnlyList<UserInputRequestContent> UserInputRequests { get; set; }
|
||||
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
### 5. Introduce new Base UserInputRequestContent and UserInputResponseContent types
|
||||
|
||||
This approach is similar to option 4, but the `UserInputRequestContent` and `UserInputResponseContent` types are base classes rather than generic container types.
|
||||
|
||||
Suggested Types:
|
||||
|
||||
```csharp
|
||||
class UserInputRequestContent : AIContent
|
||||
{
|
||||
// An ID to uniquely identify the approval request/response pair.
|
||||
public string Id { get; set; }
|
||||
}
|
||||
|
||||
class UserInputResponseContent : AIContent
|
||||
{
|
||||
// An ID to uniquely identify the approval request/response pair.
|
||||
public string Id { get; set; }
|
||||
}
|
||||
|
||||
// -----------------------------------
|
||||
// Used for approving a function call.
|
||||
class FunctionApprovalRequestContent : UserInputRequestContent
|
||||
{
|
||||
// Contains the function call that the agent wants to invoke.
|
||||
public FunctionCallContent FunctionCall { get; set; }
|
||||
|
||||
public FunctionApprovalResponseContent CreateApproval()
|
||||
{
|
||||
return new FunctionApprovalResponseContent
|
||||
{
|
||||
Id = this.Id,
|
||||
Approved = true,
|
||||
FunctionCall = this.FunctionCall
|
||||
};
|
||||
}
|
||||
|
||||
public FunctionApprovalResponseContent CreateRejection()
|
||||
{
|
||||
return new FunctionApprovalResponseContent
|
||||
{
|
||||
Id = this.Id,
|
||||
Approved = false,
|
||||
FunctionCall = this.FunctionCall
|
||||
};
|
||||
}
|
||||
}
|
||||
class FunctionApprovalResponseContent : UserInputResponseContent
|
||||
{
|
||||
// Indicates whether the user approved the request.
|
||||
public bool Approved { get; set; }
|
||||
|
||||
// Contains the function call that the agent wants to invoke.
|
||||
public FunctionCallContent FunctionCall { get; set; }
|
||||
}
|
||||
|
||||
// --------------------------------------------------
|
||||
// Used for approving a request described using text.
|
||||
class TextApprovalRequestContent : UserInputRequestContent
|
||||
{
|
||||
// A user targeted message to explain what needs to be approved.
|
||||
public string Text { get; set; }
|
||||
}
|
||||
class TextApprovalResponseContent : UserInputResponseContent
|
||||
{
|
||||
// Indicates whether the user approved the request.
|
||||
public bool Approved { get; set; }
|
||||
}
|
||||
|
||||
// ------------------------------------------------
|
||||
// Used for providing input in a structured format.
|
||||
class StructuredDataInputRequestContent : UserInputRequestContent
|
||||
{
|
||||
// A user targeted message to explain what is being requested.
|
||||
public string? Text { get; set; }
|
||||
|
||||
// Contains the schema for the user input.
|
||||
public JsonElement Schema { get; set; }
|
||||
}
|
||||
class StructuredDataInputResponseContent : UserInputResponseContent
|
||||
{
|
||||
// Contains the structured data provided by the user.
|
||||
public JsonElement StructuredData { get; set; }
|
||||
}
|
||||
|
||||
var response = await agent.RunAsync("Please book me a flight for Friday to Paris.", thread);
|
||||
while (response.UserInputRequests.Any())
|
||||
{
|
||||
List<ChatMessage> messages = new List<ChatMessage>();
|
||||
foreach (var userInputRequest in response.UserInputRequests)
|
||||
{
|
||||
if (userInputRequest is FunctionApprovalRequestContent approvalRequest)
|
||||
{
|
||||
// Here we need to show the user an approval request.
|
||||
// We can use the FunctionCall property to show e.g. the function call that the agent wants to invoke.
|
||||
// If the user approves:
|
||||
messages.Add(new ChatMessage(ChatRole.User, [approvalRequest.CreateApproval()]));
|
||||
}
|
||||
}
|
||||
|
||||
// Get the next response from the agent.
|
||||
response = await agent.RunAsync(messages, thread);
|
||||
}
|
||||
|
||||
class AgentResponse
|
||||
{
|
||||
...
|
||||
|
||||
// A new property on AgentResponse to aggregate the UserInputRequestContent items from
|
||||
// the response messages (Similar to the Text property).
|
||||
public IEnumerable<UserInputRequestContent> UserInputRequests { get; set; }
|
||||
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option 5.
|
||||
|
||||
## Appendices
|
||||
|
||||
### ChatClientAgent Approval Process Flow
|
||||
|
||||
1. User passes a User message to the agent with a request.
|
||||
1. Agent calls IChatClient with any functions registered on the agent.
|
||||
(IChatClient has FunctionInvokingChatClient)
|
||||
1. Model responds with FunctionCallContent indicating function calls required.
|
||||
1. FunctionInvokingChatClient decorator identifies any function calls that require user approval and returns a FunctionApprovalRequestContent.
|
||||
(If there are multiple parallel function calls, all function calls will be returned as FunctionApprovalRequestContent even if only some require approval.)
|
||||
1. Agent updates the thread with the FunctionApprovalRequestContent (or this may have already been done by a service threaded agent).
|
||||
1. Agent returns the FunctionApprovalRequestContent to the caller which shows it to the user in the appropriate format.
|
||||
1. User (via caller) invokes the agent again with FunctionApprovalResponseContent.
|
||||
1. Agent adds the FunctionApprovalResponseContent to the thread.
|
||||
1. Agent calls IChatClient with the provided FunctionApprovalResponseContent.
|
||||
1. Agent invokes IChatClient with FunctionApprovalResponseContent and the FunctionInvokingChatClient decorator identifies the response as an approval for the function call.
|
||||
Any rejected approvals are converted to FunctionResultContent with a message indicating that the function invocation was denied.
|
||||
Any approved approvals are executed by the FunctionInvokingChatClient decorator.
|
||||
1. FunctionInvokingChatClient decorator passes the FunctionCallContent and FunctionResultContent for the approved and rejected function calls to the model.
|
||||
1. Model responds with the result.
|
||||
1. FunctionInvokingChatClient returns the FunctionCallContent, FunctionResultContent, and the result message to the agent.
|
||||
1. Agent responds to caller with the same messages and updates the thread with these as well.
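For completeness, a small sketch of the caller side of steps 6–7 above, extending the option 5 example to cover rejection as well as approval. `PromptUserAsync` is a hypothetical UI helper; `response`, `messages`, `agent`, and `thread` are the variables from that example:

```csharp
foreach (var request in response.UserInputRequests.OfType<FunctionApprovalRequestContent>())
{
    // Ask the user to approve or reject the function call, e.g. via a dialog.
    bool approved = await PromptUserAsync(request.FunctionCall); // hypothetical UI helper

    messages.Add(new ChatMessage(
        ChatRole.User,
        [approved ? request.CreateApproval() : request.CreateRejection()]));
}

// Resume the run with the user's decisions on the same thread.
response = await agent.RunAsync(messages, thread);
```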
|
||||
|
||||
### CustomAgent Approval Process Flow
|
||||
|
||||
1. User passes a User message to the agent with a request.
|
||||
1. Agent adds this message to the thread.
|
||||
1. Agent executes various steps.
|
||||
1. Agent encounters a step for which it requires user input to continue.
|
||||
1. Agent responds with a UserInputRequestContent and also adds it to its thread.
|
||||
1. User (via caller) invokes the agent again with UserInputResponseContent.
|
||||
1. Agent adds the UserInputResponseContent to the thread.
|
||||
1. Agent responds to caller with result message and thread is updated with the result message.
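As an illustration of steps 5–7 for a non-approval input request, a sketch of a caller handling a `StructuredDataInputRequestContent` from the option 5 types. `ShowFormAsync` is a hypothetical UI helper; the surrounding variables are the same as in the earlier examples:

```csharp
foreach (var request in response.UserInputRequests.OfType<StructuredDataInputRequestContent>())
{
    // Render a form based on the JSON schema the agent supplied and collect the user's answers.
    JsonElement data = await ShowFormAsync(request.Text, request.Schema); // hypothetical UI helper

    messages.Add(new ChatMessage(ChatRole.User,
    [
        new StructuredDataInputResponseContent { Id = request.Id, StructuredData = data }
    ]));
}

response = await agent.RunAsync(messages, thread);
```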
|
||||
|
||||
### Sequence Diagram: FunctionInvokingChatClient with built in Approval Generation
|
||||
|
||||
This ChatClient Approval Stack option has been proven to work via a proof-of-concept implementation.
|
||||
|
||||
```mermaid
|
||||
---
|
||||
title: Multiple Functions with partial approval
|
||||
---
|
||||
|
||||
sequenceDiagram
|
||||
note right of Developer: Developer asks question with two functions.
|
||||
Developer->>+FunctionInvokingChatClient: What is the special soup today?<br/>[GetMenu, GetSpecials]
|
||||
FunctionInvokingChatClient->>+ResponseChatClient: What is the special soup today?<br/>[GetMenu, GetSpecials]
|
||||
|
||||
ResponseChatClient-->>-FunctionInvokingChatClient: [FunctionCallContent(GetMenu)],<br/>[FunctionCallContent(GetSpecials)]
|
||||
note right of FunctionInvokingChatClient: FICC turns FunctionCallContent<br/>into FunctionApprovalRequestContent
|
||||
FunctionInvokingChatClient->>+Developer: [FunctionApprovalRequestContent(GetMenu)]<br/>[FunctionApprovalRequestContent(GetSpecials)]
|
||||
|
||||
note right of Developer:Developer asks user for approval
|
||||
Developer->>+FunctionInvokingChatClient: [FunctionApprovalRequestContent(GetMenu, approved=false)]<br/>[FunctionApprovalRequestContent(GetSpecials, approved=true)]
|
||||
note right of FunctionInvokingChatClient:FunctionInvokingChatClient executes the approved<br/>function and generates a failed FunctionResultContent<br/>for the rejected one, before invoking the model again.
|
||||
FunctionInvokingChatClient->>+ResponseChatClient: What is the special soup today?<br/>[FunctionCallContent(GetMenu)],<br/>[FunctionCallContent(GetSpecials)],<br/>[FunctionResultContent(GetMenu, "Function invocation denied")]<br/>[FunctionResultContent(GetSpecials, "Special Soup: Clam Chowder...")]
|
||||
|
||||
ResponseChatClient-->>-FunctionInvokingChatClient: [TextContent("The specials soup is...")]
|
||||
FunctionInvokingChatClient->>+Developer: [FunctionCallContent(GetMenu)],<br/>[FunctionCallContent(GetSpecials)],<br/>[FunctionResultContent(GetMenu, "Function invocation denied")]<br/>[FunctionResultContent(GetSpecials, "Special Soup: Clam Chowder...")]<br/>[TextContent("The specials soup is...")]
|
||||
```
|
||||
|
||||
### Sequence Diagram: Post FunctionInvokingChatClient ApprovalGeneratingChatClient - Multiple function calls with partial approval
|
||||
|
||||
This is a discarded ChatClient Approval Stack option, but is included here for reference.
|
||||
|
||||
```mermaid
|
||||
---
|
||||
title: Multiple Functions with partial approval
|
||||
---
|
||||
|
||||
sequenceDiagram
|
||||
note right of Developer: Developer asks question with two functions.
|
||||
Developer->>+FunctionInvokingChatClient: What is the special soup today? [GetMenu, GetSpecials]
|
||||
FunctionInvokingChatClient->>+ApprovalGeneratingChatClient: What is the special soup today? [GetMenu, GetSpecials]
|
||||
ApprovalGeneratingChatClient->>+ResponseChatClient: What is the special soup today? [GetMenu, GetSpecials]
|
||||
|
||||
ResponseChatClient-->>-ApprovalGeneratingChatClient: [FunctionCallContent(GetMenu)],<br/>[FunctionCallContent(GetSpecials)]
|
||||
ApprovalGeneratingChatClient-->>-FunctionInvokingChatClient: [FunctionApprovalRequestContent(GetMenu)],<br/>[FunctionApprovalRequestContent(GetSpecials)]
|
||||
FunctionInvokingChatClient-->>-Developer: [FunctionApprovalRequestContent(GetMenu)]<br/>[FunctionApprovalRequestContent(GetSpecials)]
|
||||
|
||||
note right of Developer: Developer approves one function call and rejects the other.
|
||||
Developer->>+FunctionInvokingChatClient: [FunctionApprovalResponseContent(GetMenu, approved=true)]<br/>[FunctionApprovalResponseContent(GetSpecials, approved=false)]
|
||||
FunctionInvokingChatClient->>+ApprovalGeneratingChatClient: [FunctionApprovalResponseContent(GetMenu, approved=true)]<br/>[FunctionApprovalResponseContent(GetSpecials, approved=false)]
|
||||
|
||||
note right of FunctionInvokingChatClient: ApprovalGeneratingChatClient only returns FunctionCallContent<br/>for approved FunctionApprovalResponseContent.
|
||||
ApprovalGeneratingChatClient-->>-FunctionInvokingChatClient: [FunctionCallContent(GetMenu)]
|
||||
note right of FunctionInvokingChatClient: FunctionInvokingChatClient has to also include all<br/>FunctionApprovalResponseContent in the new downstream request.
|
||||
FunctionInvokingChatClient->>+ApprovalGeneratingChatClient: [FunctionResultContent(GetMenu, "mains.... deserts...")]<br/>[FunctionApprovalResponseContent(GetMenu, approved=true)]<br/>[FunctionApprovalResponseContent(GetSpecials, approved=false)]
|
||||
|
||||
note right of ApprovalGeneratingChatClient: ApprovalGeneratingChatClient now throws away<br/>approvals for executed functions, and creates<br/>failed FunctionResultContent for denied function calls.
|
||||
ApprovalGeneratingChatClient->>+ResponseChatClient: [FunctionResultContent(GetMenu, "mains.... deserts...")]<br/>[FunctionResultContent(GetSpecials, "Function invocation denied")]
|
||||
```
|
||||
|
||||
### Sequence Diagram: Pre FunctionInvokingChatClient ApprovalGeneratingChatClient - Multiple function calls with partial approval
|
||||
|
||||
This is a discarded ChatClient Approval Stack option, but is included here for reference.
|
||||
|
||||
It doesn't work for the scenario where we have multiple function calls for the same function in serial with different arguments.
|
||||
|
||||
Flow:
|
||||
|
||||
- AGCC turns AIFunctions into AIFunctionDefinitions (not invocable) and FICC ignores these.
|
||||
- We get back a FunctionCall for one of these and it gets approved.
|
||||
- We invoke the FICC again, this time with an AIFunction.
|
||||
- We call the service with the FCC and FRC.
|
||||
- We get back a new Function call for the same function again with different arguments.
|
||||
- Since we were passed an AIFunction instead of an AIFunctionDefinition, we now incorrectly execute this FC without approval.
|
||||
|
||||
```mermaid
|
||||
---
|
||||
title: Multiple Functions with partial approval
|
||||
---
|
||||
|
||||
sequenceDiagram
|
||||
note right of Developer: Developer asks question with two functions.
|
||||
Developer->>+ApprovalGeneratingChatClient: What is the special soup today? [GetMenu, GetSpecials]
|
||||
note right of ApprovalGeneratingChatClient: AGCC marks functions as not-invocable
|
||||
ApprovalGeneratingChatClient->>+FunctionInvokingChatClient: What is the special soup today?<br/>[GetMenu(invocable=false)]<br/>[GetSpecials(invocable=false)]
|
||||
FunctionInvokingChatClient->>+ResponseChatClient: What is the special soup today?<br/>[GetMenu(invocable=false)]<br/>[GetSpecials(invocable=false)]
|
||||
|
||||
ResponseChatClient-->>-FunctionInvokingChatClient: [FunctionCallContent(GetMenu)],<br/>[FunctionCallContent(GetSpecials)]
|
||||
note right of FunctionInvokingChatClient: FICC doesn't invoke functions since they are not invocable.
|
||||
FunctionInvokingChatClient-->>-ApprovalGeneratingChatClient: [FunctionCallContent(GetMenu)],<br/>[FunctionCallContent(GetSpecials)]
|
||||
note right of ApprovalGeneratingChatClient: AGCC turns functions into approval requests
|
||||
ApprovalGeneratingChatClient-->>-Developer: [FunctionApprovalRequestContent(GetMenu)]<br/>[FunctionApprovalRequestContent(GetSpecials)]
|
||||
|
||||
note right of Developer: Developer approves one function call and rejects the other.
|
||||
Developer->>+ApprovalGeneratingChatClient: [FunctionApprovalResponseContent(GetMenu, approved=true)]<br/>[FunctionApprovalResponseContent(GetSpecials, approved=false)]
|
||||
note right of ApprovalGeneratingChatClient: AGCC turns approval requests<br/>into FCC or failed function calls
|
||||
ApprovalGeneratingChatClient->>+FunctionInvokingChatClient: [FunctionCallContent(GetMenu)]<br/>[FunctionCallContent(GetSpecials)<br/>[FunctionResultContent(GetSpecials, "Function invocation denied"))]
|
||||
note right of FunctionInvokingChatClient: FICC invokes GetMenu since it's the only remaining one.
|
||||
FunctionInvokingChatClient->>+ResponseChatClient: [FunctionCallContent(GetMenu)]<br/>[FunctionResultContent(GetMenu, "mains.... deserts...")]<br/>[FunctionCallContent(GetSpecials)<br/>[FunctionResultContent(GetSpecials, "Function invocation denied"))]
|
||||
|
||||
ResponseChatClient-->>-FunctionInvokingChatClient: [FunctionCallContent(GetMenu)]<br/>[FunctionResultContent(GetMenu, "mains.... deserts...")]<br/>[FunctionCallContent(GetSpecials)<br/>[FunctionResultContent(GetSpecials, "Function invocation denied"))]<br/>[TextContent("The specials soup is...")]
|
||||
FunctionInvokingChatClient-->>-ApprovalGeneratingChatClient: [FunctionCallContent(GetMenu)]<br/>[FunctionResultContent(GetMenu, "mains.... deserts...")]<br/>[FunctionCallContent(GetSpecials)<br/>[FunctionResultContent(GetSpecials, "Function invocation denied"))]<br/>[TextContent("The specials soup is...")]
|
||||
ApprovalGeneratingChatClient-->>-Developer: [FunctionCallContent(GetMenu)]<br/>[FunctionResultContent(GetMenu, "mains.... deserts...")]<br/>[FunctionCallContent(GetSpecials)<br/>[FunctionResultContent(GetSpecials, "Function invocation denied"))]<br/>[TextContent("The specials soup is...")]
|
||||
```
|
||||
1190
docs/decisions/0007-agent-filtering-middleware.md
Normal file
File diff suppressed because it is too large
92
docs/decisions/0008-python-subpackages.md
Normal file
@@ -0,0 +1,92 @@
|
||||
---
|
||||
status: accepted
|
||||
contact: eavanvalkenburg
|
||||
date: 2025-09-19
|
||||
deciders: eavanvalkenburg, markwallace-microsoft, ekzhu, sphenry, alliscode
|
||||
consulted: taochenosu, moonbox3, dmytrostruk, giles17
|
||||
---
|
||||
|
||||
# Python Subpackages Design
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
The goal is to design a subpackage structure for the Python agent framework that balances ease of use, maintainability, and scalability. How can we organize the codebase to facilitate the development and integration of connectors while minimizing complexity for users?
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- Ease of use for developers
|
||||
- Maintainability of the codebase
|
||||
- User experience for installing and using the integrations
|
||||
- Clear lifecycle management for integrations
|
||||
- Minimize non-GA dependencies in the main package
|
||||
|
||||
## Considered Options
|
||||
|
||||
1. One subpackage per vendor, so a `google` package that contains all Google related connectors, such as `GoogleChatClient`, `BigQueryCollection`, etc.
|
||||
* Pros:
|
||||
- fewer packages to manage, publish and maintain
|
||||
- easier for users to find and install the right package.
|
||||
- users that work primarily with one platform have a single package to install.
|
||||
* Cons:
|
||||
- larger packages with more dependencies
|
||||
- larger installation sizes
|
||||
- more difficult to version, since some parts may be GA, while others are in preview.
|
||||
2. One subpackage per connector, so e.g. a `google_chat` package, a `google_bigquery` package, etc.
|
||||
* Pros:
|
||||
- smaller packages with fewer dependencies
|
||||
- smaller installation sizes
|
||||
- easy to version and do lifecycle management on
|
||||
* Cons:
|
||||
- more packages to manage, register, publish and maintain
|
||||
- more extras, means more difficult for users to find and install the right package.
|
||||
3. Group connectors by vendor and maturity, so that you can graduate something from e.g. the `google-preview` package to the `google` package when it becomes GA.
|
||||
* Pros:
|
||||
- fewer packages to manage, publish and maintain
|
||||
- easier for users to find and install the right package.
|
||||
- users that work primarily with one platform have a single package to install.
|
||||
- clear what the status is based on extra name
|
||||
* Cons:
|
||||
- moving something from one to the other might be a breaking change
|
||||
- still larger packages with more dependencies
|
||||
This could be mitigated by keeping imports for the `google-preview` package under `agent_framework.google`, so that the import path does not change when something graduates, while installing the preview package remains a clear choice for users. We could then have three extras on that package, `google`, `google-preview` and `google-all`, to make it easy to install the right package or just everything.
|
||||
4. Group connectors by vendor and type, so that you have a `google-chat` package, a `google-data` package, etc.
|
||||
* Pros:
|
||||
- smaller packages with fewer dependencies
|
||||
- smaller installation sizes
|
||||
* Cons:
|
||||
- more packages to manage, register, publish and maintain
|
||||
- more extras, means more difficult for users to find and install the right package.
|
||||
- still makes lifecycle management more difficult, since some parts may be GA, while others are in preview.
|
||||
5. Add `meta`-extras, that combine different subpackages as one extra, so we could have a `google` extra that includes `google-chat`, `google-bigquery`, etc.
|
||||
* Pros:
|
||||
- easier for users on a single platform
|
||||
* Cons:
|
||||
- more packages to manage, register, publish and maintain
|
||||
- more extras, means more difficult for users to find and install the right package.
|
||||
- makes developer package management more complex, because that meta-extra will include both GA and non-GA packages, so during dev they could use that, but then during prod they have to figure out which one they actually need and make a change in their dependencies, leading to mismatches between dev and prod.
|
||||
6. Make all imports happen from `agent_framework.connectors` (or from two or three groups `agent_framework.chat_clients`, `agent_framework.context_providers`, or something similar) while the underlying code comes from different packages.
|
||||
* Pros:
|
||||
- best developer experience, since all imports are from the same place, it is easy to find what you need, and we can raise a meaningful error indicating which extra to install.
|
||||
- easier for users to find and install the right package.
|
||||
* Cons:
|
||||
- larger overhead in maintaining the `__init__.py` files that do the lazy loading and error handling.
|
||||
- larger overhead in package management, since we have to ensure that the main package stays in sync with all the underlying packages.
|
||||
7. Subpackage existence will be based off status of dependencies and/or possibilities of a external support mechanism. What this means is that:
|
||||
- Integrations that need non-GA dependencies will be subpackages, so that we can avoid having non-GA dependencies in the main package.
|
||||
- Integrations where the AF-code is still experimental, preview or release candidate will be subpackages, so that we can avoid having non-GA code in the main package and we can version those packages properly.
|
||||
- Integrations that are outside Microsoft and where we might not always be able to fast-follow breaking changes, will stay as subpackages, to provide some isolation and to be able to version them properly.
|
||||
- Integrations that are mature and that have released (GA) dependencies and or features on the service side will be moved into the main package, the dependencies of those packages will stay installable under the same `extra` name, so that users do not have to change anything, and we then remove the subpackage itself.
|
||||
- All subpackage imports in the code should be from a stable place, mostly vendor-based, so that when something moves from a subpackage to the main package, the import path does not change, so `from agent_framework.google import GoogleChatClient` will always work, even if it moves from the `agent-framework-google` package to the main `agent-framework` package.
|
||||
- The imports in those vendor namespaces (these won't be actual Python namespaces, just folders with an `__init__.py` file and any code) will do lazy loading and raise a meaningful error if the subpackage or dependencies are not installed, so that users know which extra to install with ease.
|
||||
- On a case-by-case basis we can decide to create additional `extras` that combine multiple subpackages into one extra, so that users who work primarily with one platform can install everything they need with a single extra. For instance, you could install the `agent-framework[azure-purview]` extra that only implements an Azure Purview middleware, or the `agent-framework[azure]` extra that includes all Azure related connectors, like `purview`, `content safety` and others (all examples, not actual packages (yet)); regardless of where the code sits, these should always be importable from `agent_framework.azure`.
|
||||
- Subpackage naming should also follow this, so in principle a package name is `<vendor/folder>-<feature/brand>`, so `google-gemini`, `azure-purview`, `microsoft-copilotstudio`, etc. For smaller vendors, with less likely to have a multitude of connectors, we can skip the feature/brand part, so `mem0`, `redis`, etc.
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Option 7: This provides us a good balance between developer experience, user experience, package management and maintenance, while also allowing us to evolve the package structure over time as dependencies and features mature. It also ensures that the main package, installed without extras, does not include non-GA dependencies or code; extras do not carry that guarantee, for either the code or the dependencies.
|
||||
|
||||
## Microsoft vs Azure packages
|
||||
Another consideration is for Microsoft, since we have a lot of Azure services, but also other Microsoft services, such as Microsoft Copilot Studio, potentially other services in the future, and maybe Foundry will also be marketed separately from Azure at some point. We could have both a `microsoft` and an `azure` package, where the `microsoft` package contains all Microsoft services excluding Azure, while the `azure` package only contains Azure services. This is only applicable to the variants where we group by vendor, including with meta packages.
|
||||
|
||||
## Decision Outcome
|
||||
Azure and Microsoft will be the two vendor folders for Microsoft services, so Copilot Studio will be imported from `agent_framework.microsoft`, while Foundry, Azure OpenAI and other Azure services will be imported from `agent_framework.azure`.
|
||||
1689
docs/decisions/0009-support-long-running-operations.md
Normal file
File diff suppressed because it is too large
95
docs/decisions/0010-ag-ui-support.md
Normal file
@@ -0,0 +1,95 @@
|
||||
---
|
||||
status: accepted
|
||||
contact: javiercn
|
||||
date: 2025-10-29
|
||||
deciders: javiercn, DeagleGross, moonbox3, markwallace-microsoft
|
||||
consulted: Agent Framework team
|
||||
informed: .NET community
|
||||
---
|
||||
|
||||
# AG-UI Protocol Support for .NET Agent Framework
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
The .NET Agent Framework needed a standardized way to enable communication between AI agents and user-facing applications with support for streaming, real-time updates, and bidirectional communication. Without AG-UI protocol support, .NET agents could not interoperate with the growing ecosystem of AG-UI-compatible frontends and agent frameworks (LangGraph, CrewAI, Pydantic AI, etc.), limiting the framework's adoption and utility.
|
||||
|
||||
The AG-UI (Agent-User Interaction) protocol is an open, lightweight, event-based protocol that addresses key challenges in agentic applications including streaming support for long-running agents, event-driven architecture for nondeterministic behavior, and protocol interoperability that complements MCP (tool/context) and A2A (agent-to-agent) protocols.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- Need for streaming communication between agents and client applications
|
||||
- Requirement for protocol interoperability with other AI frameworks
|
||||
- Support for long-running, multi-turn conversation sessions
|
||||
- Real-time UI updates for nondeterministic agent behavior
|
||||
- Standardized approach to agent-to-UI communication
|
||||
- Framework abstraction to protect consumers from protocol changes
|
||||
|
||||
## Considered Options
|
||||
|
||||
1. **Implement AG-UI event types as public API surface** - Expose AG-UI event models directly to consumers
|
||||
2. **Use custom AIContent types for lifecycle events** - Create new content types (RunStartedContent, RunFinishedContent, RunErrorContent)
|
||||
3. **Current approach** - Internal event types with framework-native abstractions
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "Current approach with internal event types and framework-native abstractions", because it:
|
||||
|
||||
- Protects consumers from protocol changes by keeping AG-UI events internal
|
||||
- Maintains framework abstractions through conversion at boundaries
|
||||
- Uses existing framework types (AgentResponseUpdate, ChatMessage) for public API
|
||||
- Focuses on core text streaming functionality
|
||||
- Leverages existing properties (ConversationId, ResponseId, ErrorContent) instead of custom types
|
||||
- Provides bidirectional client and server support
|
||||
|
||||
### Implementation Details
|
||||
|
||||
**In Scope:**
|
||||
1. **Client-side AG-UI consumption** (`Microsoft.Agents.AI.AGUI` package)
|
||||
- `AGUIAgent` class for connecting to remote AG-UI servers
|
||||
- `AGUIAgentThread` for managing conversation threads
|
||||
- HTTP/SSE streaming support
|
||||
- Event-to-framework type conversion
|
||||
|
||||
2. **Server-side AG-UI hosting** (`Microsoft.Agents.AI.Hosting.AGUI.AspNetCore` package)
|
||||
- `MapAGUIAgent` extension method for ASP.NET Core
|
||||
- Server-Sent Events (SSE) response formatting
|
||||
- Framework-to-event type conversion
|
||||
- Agent factory pattern for per-request instantiation
|
||||
|
||||
3. **Text streaming events**
|
||||
- Lifecycle events: `RunStarted`, `RunFinished`, `RunError`
|
||||
- Text message events: `TextMessageStart`, `TextMessageContent`, `TextMessageEnd`
|
||||
- Thread and run ID management via `ConversationId` and `ResponseId`
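To make the intended shape of these two packages concrete, a minimal sketch of server-side hosting and client-side consumption follows. The route argument, the `AGUIAgent` constructor, and the `CreateMyAgent` helper are illustrative assumptions; only the `MapAGUIAgent` factory shape `(messages) => AIAgent` is taken from the design above.

```csharp
// Illustrative sketch only; exact signatures in the AG-UI packages may differ.

// Server side (Microsoft.Agents.AI.Hosting.AGUI.AspNetCore):
// expose an agent over AG-UI, creating it per request via the factory.
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapAGUIAgent("/agent", messages => CreateMyAgent()); // assumed route argument and factory helper
app.Run();

// Client side (Microsoft.Agents.AI.AGUI):
// consume a remote AG-UI server as a regular AIAgent.
AIAgent remoteAgent = new AGUIAgent(new Uri("https://localhost:5001/agent")); // assumed constructor
AgentThread thread = remoteAgent.GetNewThread();
await foreach (var update in remoteAgent.RunStreamingAsync("Hello", thread))
{
    Console.Write(update);
}
```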
|
||||
|
||||
### Key Design Decisions
|
||||
|
||||
1. **Event Models as Internal Types** - AG-UI event types are internal with conversion via extension methods; public API uses the existing types in Microsoft.Extensions.AI as those are the abstractions people are familiar with
|
||||
|
||||
2. **No Custom Content Types** - Run lifecycle communicated through existing `ChatResponseUpdate` properties (`ConversationId`, `ResponseId`) and standard `ErrorContent` type
|
||||
|
||||
3. **Agent Factory Pattern** - `MapAGUIAgent` uses factory function `(messages) => AIAgent` to allow request-specific agent configuration supporting multi-tenancy
|
||||
|
||||
4. **Bidirectional Conversion Architecture** - Symmetric conversion logic in shared namespace compiled into both packages for server (`AgentResponseUpdate` → AG-UI events) and client (AG-UI events → `AgentResponseUpdate`)
|
||||
|
||||
5. **Thread Management** - `AGUIAgentThread` stores only `ThreadId` with thread ID communicated via `ConversationId`; applications manage persistence for parity with other implementations and to be compliant with the protocol. Future extensions will support having the server manage the conversation.
|
||||
|
||||
6. **Custom JSON Converter** - Uses custom polymorphic deserialization via `BaseEventJsonConverter` instead of built-in System.Text.Json support to handle AG-UI protocol's flexible discriminator positioning
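As an illustration of why a hand-rolled converter is needed, below is a minimal sketch of polymorphic deserialization with a discriminator that may appear anywhere in the payload. The event classes and the discriminator values (`RUN_STARTED`, `RUN_FINISHED`) are assumptions for the example; the real `BaseEventJsonConverter` is internal and handles the full AG-UI event set.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

internal abstract class BaseEvent { }
internal sealed class RunStartedEvent : BaseEvent { }
internal sealed class RunFinishedEvent : BaseEvent { }

internal sealed class BaseEventJsonConverterSketch : JsonConverter<BaseEvent>
{
    public override BaseEvent? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Buffer the whole object so the discriminator can appear at any position,
        // which built-in System.Text.Json polymorphism does not allow.
        using var document = JsonDocument.ParseValue(ref reader);
        string? type = document.RootElement.TryGetProperty("type", out var discriminator)
            ? discriminator.GetString()
            : null;

        return type switch
        {
            "RUN_STARTED" => document.RootElement.Deserialize<RunStartedEvent>(options),
            "RUN_FINISHED" => document.RootElement.Deserialize<RunFinishedEvent>(options),
            _ => throw new JsonException($"Unknown AG-UI event type '{type}'."),
        };
    }

    public override void Write(Utf8JsonWriter writer, BaseEvent value, JsonSerializerOptions options)
        => JsonSerializer.Serialize(writer, value, value.GetType(), options);
}
```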
|
||||
|
||||
### Consequences
|
||||
|
||||
**Positive:**
|
||||
- .NET developers can consume AG-UI servers from any framework
|
||||
- .NET agents accessible from any AG-UI-compatible client
|
||||
- Standardized streaming communication patterns
|
||||
- Protected from protocol changes through internal implementation
|
||||
- Symmetric conversion logic between client and server
|
||||
- Framework-native public API surface
|
||||
|
||||
**Negative:**
|
||||
- Custom JSON converter required (internal implementation detail)
|
||||
- Shared code uses preprocessor directives (`#if ASPNETCORE`)
|
||||
- Additional abstraction layer between protocol and public API
|
||||
|
||||
**Neutral:**
|
||||
- Initial implementation focused on text streaming
|
||||
- Applications responsible for thread persistence
|
||||
368
docs/decisions/0011-create-get-agent-api.md
Normal file
@@ -0,0 +1,368 @@
|
||||
---
|
||||
status: proposed
|
||||
contact: dmytrostruk
|
||||
date: 2025-12-12
|
||||
deciders: dmytrostruk, markwallace-microsoft, eavanvalkenburg, giles17
|
||||
---
|
||||
|
||||
# Create/Get Agent API
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
There is a misalignment between the create/get agent API in the .NET and Python implementations.
|
||||
|
||||
In .NET, the `CreateAIAgent` method can create either a local instance of an agent or a remote instance if the backend provider supports it. For remote agents, once the agent is created, you can retrieve an existing remote agent by using the `GetAIAgent` method. If a backend provider doesn't support remote agents, `CreateAIAgent` just initializes a new local agent instance and `GetAIAgent` is not available. There is also a `BuildAIAgent` method, which is an extension for the `ChatClientBuilder` class from `Microsoft.Extensions.AI`. It builds pipelines of `IChatClient` instances with an `IServiceProvider`. This functionality does not exist in Python, so `BuildAIAgent` is out of scope.
|
||||
|
||||
In Python, there is only one `create_agent` method, which always creates a local instance of the agent. If the backend provider supports remote agents, the remote agent is created only on the first `agent.run()` invocation.
|
||||
|
||||
Below is a short summary of different providers and their APIs in .NET:
|
||||
|
||||
| Package | Method | Behavior | Python support |
|
||||
|---|---|---|---|
|
||||
| Microsoft.Agents.AI | `CreateAIAgent` (based on `IChatClient`) | Creates a local instance of `ChatClientAgent`. | Yes (`create_agent` in `BaseChatClient`). |
|
||||
| Microsoft.Agents.AI.Anthropic | `CreateAIAgent` (based on `IBetaService` and `IAnthropicClient`) | Creates a local instance of `ChatClientAgent`. | Yes (`AnthropicClient` inherits `BaseChatClient`, which exposes `create_agent`). |
|
||||
| Microsoft.Agents.AI.AzureAI (V2) | `GetAIAgent` (based on `AIProjectClient` with `AgentReference`) | Creates a local instance of `ChatClientAgent`. | Partial (Python uses `create_agent` from `BaseChatClient`). |
|
||||
| Microsoft.Agents.AI.AzureAI (V2) | `GetAIAgent`/`GetAIAgentAsync` (with `Name`/`ChatClientAgentOptions`) | Fetches `AgentRecord` via HTTP, then creates a local `ChatClientAgent` instance. | No |
|
||||
| Microsoft.Agents.AI.AzureAI (V2) | `CreateAIAgent`/`CreateAIAgentAsync` (based on `AIProjectClient`) | Creates a remote agent first, then wraps it into a local `ChatClientAgent` instance. | No |
|
||||
| Microsoft.Agents.AI.AzureAI.Persistent (V1) | `GetAIAgent` (based on `PersistentAgentsClient` with `PersistentAgent`) | Creates a local instance of `ChatClientAgent`. | Partial (Python uses `create_agent` from `BaseChatClient`). |
|
||||
| Microsoft.Agents.AI.AzureAI.Persistent (V1) | `GetAIAgent`/`GetAIAgentAsync` (with `AgentId`) | Fetches `PersistentAgent` via HTTP, then creates a local `ChatClientAgent` instance. | No |
|
||||
| Microsoft.Agents.AI.AzureAI.Persistent (V1) | `CreateAIAgent`/`CreateAIAgentAsync` | Creates a remote agent first, then wraps it into a local `ChatClientAgent` instance. | No |
|
||||
| Microsoft.Agents.AI.OpenAI | `GetAIAgent` (based on `AssistantClient` with `Assistant`) | Creates a local instance of `ChatClientAgent`. | Partial (Python uses `create_agent` from `BaseChatClient`). |
|
||||
| Microsoft.Agents.AI.OpenAI | `GetAIAgent`/`GetAIAgentAsync` (with `AgentId`) | Fetches `Assistant` via HTTP, then creates a local `ChatClientAgent` instance. | No |
|
||||
| Microsoft.Agents.AI.OpenAI | `CreateAIAgent`/`CreateAIAgentAsync` (based on `AssistantClient`) | Creates a remote agent first, then wraps it into a local `ChatClientAgent` instance. | No |
|
||||
| Microsoft.Agents.AI.OpenAI | `CreateAIAgent` (based on `ChatClient`) | Creates a local instance of `ChatClientAgent`. | Yes (`create_agent` in `BaseChatClient`). |
|
||||
| Microsoft.Agents.AI.OpenAI | `CreateAIAgent` (based on `OpenAIResponseClient`) | Creates a local instance of `ChatClientAgent`. | Yes (`create_agent` in `BaseChatClient`). |
|
||||
|
||||
Another difference between Python and .NET implementation is that in .NET `CreateAIAgent`/`GetAIAgent` methods are implemented as extension methods based on underlying SDK client, like `AIProjectClient` from Azure AI or `AssistantClient` from OpenAI:
|
||||
|
||||
```csharp
|
||||
// Definition
|
||||
public static ChatClientAgent CreateAIAgent(
|
||||
this AIProjectClient aiProjectClient,
|
||||
string name,
|
||||
string model,
|
||||
string instructions,
|
||||
string? description = null,
|
||||
IList<AITool>? tools = null,
|
||||
Func<IChatClient, IChatClient>? clientFactory = null,
|
||||
IServiceProvider? services = null,
|
||||
CancellationToken cancellationToken = default)
|
||||
{ }
|
||||
|
||||
// Usage
|
||||
AIProjectClient aiProjectClient = new(new Uri(endpoint), new AzureCliCredential()); // Initialization of underlying SDK client
|
||||
|
||||
var newAgent = await aiProjectClient.CreateAIAgentAsync(name: AgentName, model: deploymentName, instructions: AgentInstructions, tools: [tool]); // ChatClientAgent creation from underlying SDK client
|
||||
|
||||
// Alternative usage (same as extension method, just explicit syntax)
|
||||
var newAgent = await AzureAIProjectChatClientExtensions.CreateAIAgentAsync(
|
||||
aiProjectClient,
|
||||
name: AgentName,
|
||||
model: deploymentName,
|
||||
instructions: AgentInstructions,
|
||||
tools: [tool]);
|
||||
```
|
||||
|
||||
Python doesn't support extension methods. Currently the `create_agent` method is defined on `BaseChatClient`, but this method only creates a local instance of `ChatAgent` and can't create remote agents for providers that support them, for a couple of reasons:
|
||||
|
||||
- It's defined as non-async.
|
||||
- The `BaseChatClient` implementation is stateful for providers like Azure AI or OpenAI Assistants. The implementation stores agent/assistant metadata like `AgentId` and `AgentName`, so currently it's not possible to create different instances of `ChatAgent` from a single `BaseChatClient` when the implementation is stateful.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- API should be aligned between .NET and Python.
|
||||
- API should be intuitive and consistent between backend providers in .NET and Python.
|
||||
|
||||
## Considered Options
|
||||
|
||||
Add missing implementations on the Python side. This should include the following:
|
||||
|
||||
### agent-framework-azure-ai (both V1 and V2)
|
||||
|
||||
- Add a `get_agent` method that accepts an underlying SDK agent instance and creates a local instance of `ChatAgent`.
|
||||
- Add a `get_agent` method that accepts an agent identifier, performs an additional HTTP request to fetch agent data, and then creates a local instance of `ChatAgent`.
|
||||
- Override the `create_agent` method from `BaseChatClient` to create a remote agent instance and wrap it into a local `ChatAgent`.
|
||||
|
||||
.NET:
|
||||
|
||||
```csharp
|
||||
var agent1 = new AIProjectClient(...).GetAIAgent(agentInstanceFromSdkType); // Creates a local ChatClientAgent instance from Azure.AI.Projects.OpenAI.AgentReference
|
||||
var agent2 = new AIProjectClient(...).GetAIAgent(agentName); // Fetches agent data, creates a local ChatClientAgent instance
|
||||
var agent3 = new AIProjectClient(...).CreateAIAgent(...); // Creates a remote agent, returns a local ChatClientAgent instance
|
||||
```
|
||||
|
||||
### agent-framework-core (OpenAI Assistants)
|
||||
|
||||
- Add a `get_agent` method that accepts an underlying SDK agent instance and creates a local instance of `ChatAgent`.
|
||||
- Add a `get_agent` method that accepts an agent name, performs an additional HTTP request to fetch agent data, and then creates a local instance of `ChatAgent`.
|
||||
- Override the `create_agent` method from `BaseChatClient` to create a remote agent instance and wrap it into a local `ChatAgent`.
|
||||
|
||||
.NET:
|
||||
|
||||
```csharp
|
||||
var agent1 = new AssistantClient(...).GetAIAgent(agentInstanceFromSdkType); // Creates a local ChatClientAgent instance from OpenAI.Assistants.Assistant
|
||||
var agent2 = new AssistantClient(...).GetAIAgent(agentId); // Fetches agent data, creates a local ChatClientAgent instance
|
||||
var agent3 = new AssistantClient(...).CreateAIAgent(...); // Creates a remote agent, returns a local ChatClientAgent instance
|
||||
```
|
||||
|
||||
### Possible Python implementations
|
||||
|
||||
Methods like `create_agent` and `get_agent` should be implemented separately or defined on some stateless component that allows creating multiple agents from the same instance/place.
|
||||
|
||||
Possible options:
|
||||
|
||||
#### Option 1: Module-level functions
|
||||
|
||||
Implement free functions in the provider package that accept the underlying SDK client as the first argument (similar to .NET extension methods, but expressed in Python).
|
||||
|
||||
Example:
|
||||
|
||||
```python
|
||||
from agent_framework.azure import create_agent, get_agent
|
||||
|
||||
ai_project_client = AIProjectClient(...)
|
||||
|
||||
# Creates a remote agent first, then returns a local ChatAgent wrapper
|
||||
created_agent = await create_agent(
|
||||
ai_project_client,
|
||||
name="",
|
||||
instructions="",
|
||||
tools=[tool],
|
||||
)
|
||||
|
||||
# Gets an existing remote agent and returns a local ChatAgent wrapper
|
||||
first_agent = await get_agent(ai_project_client, agent_id=agent_id)
|
||||
|
||||
# Wraps an SDK agent instance (no extra HTTP call)
|
||||
second_agent = get_agent(ai_project_client, agent_reference)
|
||||
```
|
||||
|
||||
Pros:
|
||||
|
||||
- Naturally supports async `create_agent` / `get_agent`.
|
||||
- Supports multiple agents per SDK client.
|
||||
- Closest conceptual match to .NET extension methods while staying Pythonic.
|
||||
|
||||
Cons:
|
||||
|
||||
- Discoverability is lower (users need to know where the functions live).
|
||||
- Verbose when creating multiple agents (client must be passed every time):
|
||||
|
||||
```python
|
||||
agent1 = await azure_agents.create_agent(client, name="Agent1", ...)
|
||||
agent2 = await azure_agents.create_agent(client, name="Agent2", ...)
|
||||
```
|
||||
|
||||
#### Option 2: Provider object
|
||||
|
||||
Introduce a dedicated provider type that is constructed from the underlying SDK client, and exposes async `create_agent` / `get_agent` methods.
|
||||
|
||||
Example:
|
||||
|
||||
```python
|
||||
from agent_framework.azure import AzureAIAgentProvider
|
||||
|
||||
ai_project_client = AIProjectClient(...)
|
||||
provider = AzureAIAgentProvider(ai_project_client)
|
||||
|
||||
agent = await provider.create_agent(
|
||||
name="",
|
||||
instructions="",
|
||||
tools=[tool],
|
||||
)
|
||||
|
||||
agent = await provider.get_agent(agent_id=agent_id)
|
||||
agent = provider.get_agent(agent_reference=agent_reference)
|
||||
```
|
||||
|
||||
Pros:
|
||||
|
||||
- High discoverability and clear grouping of related behavior.
|
||||
- Keeps SDK clients unchanged and supports multiple agents per SDK client.
|
||||
- Concise when creating multiple agents (client passed once):
|
||||
|
||||
```python
|
||||
provider = AzureAIAgentProvider(ai_project_client)
|
||||
agent1 = await provider.create_agent(name="Agent1", ...)
|
||||
agent2 = await provider.create_agent(name="Agent2", ...)
|
||||
```
|
||||
|
||||
Cons:
|
||||
|
||||
- Adds a new public concept/type for users to learn.
|
||||
|
||||
#### Option 3: Inheritance (SDK client subclass)
|
||||
|
||||
Create a subclass of the underlying SDK client and add `create_agent` / `get_agent` methods.
|
||||
|
||||
Example:
|
||||
|
||||
```python
|
||||
class ExtendedAIProjectClient(AIProjectClient):
|
||||
async def create_agent(self, *, name: str, model: str, instructions: str, **kwargs) -> ChatAgent:
|
||||
...
|
||||
|
||||
async def get_agent(self, *, agent_id: str | None = None, sdk_agent=None, **kwargs) -> ChatAgent:
|
||||
...
|
||||
|
||||
client = ExtendedAIProjectClient(...)
|
||||
agent = await client.create_agent(name="", instructions="")
|
||||
```
|
||||
|
||||
Pros:
|
||||
|
||||
- Discoverable and ergonomic call sites.
|
||||
- Mirrors the .NET “methods on the client” feeling.
|
||||
|
||||
Cons:
|
||||
|
||||
- Many SDK clients are not designed for inheritance; SDK upgrades can break subclasses.
|
||||
- Users must opt into the subclass everywhere.
|
||||
- Typing/initialization can be tricky if the SDK client has non-trivial constructors.
|
||||
|
||||
#### Option 4: Monkey patching
|
||||
|
||||
Attach `create_agent` / `get_agent` methods to an SDK client class (or instance) at runtime.
|
||||
|
||||
Example:
|
||||
|
||||
```python
|
||||
def _create_agent(self, *, name: str, model: str, instructions: str, **kwargs) -> ChatAgent:
|
||||
...
|
||||
|
||||
AIProjectClient.create_agent = _create_agent # monkey patch
|
||||
```
|
||||
|
||||
Pros:
|
||||
|
||||
- Produces “extension method-like” call sites without wrappers or subclasses.
|
||||
|
||||
Cons:
|
||||
|
||||
- Fragile across SDK updates and difficult to type-check.
|
||||
- Surprising behavior (global side effects), potential conflicts across packages.
|
||||
- Harder to support/debug, especially in larger apps and test suites.
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Implement `create_agent`/`get_agent`/`as_agent` API via **Option 2: Provider object**.
|
||||
|
||||
### Rationale
|
||||
|
||||
| Aspect | Option 1 (Functions) | Option 2 (Provider) |
|
||||
|--------|----------------------|---------------------|
|
||||
| Multiple implementations | One package may contain V1, V2, and other agent types. Function names like `create_agent` become ambiguous - which agent type does it create? | Each provider class is explicit: `AzureAIAgentsProvider` vs `AzureAIProjectAgentProvider` |
|
||||
| Discoverability | Users must know to import specific functions from the package | IDE autocomplete on provider instance shows all available methods |
|
||||
| Client reuse | SDK client must be passed to every function call: `create_agent(client, ...)`, `get_agent(client, ...)` | SDK client passed once at construction: `provider = Provider(client)` |
|
||||
|
||||
**Option 1 example:**
|
||||
```python
|
||||
from agent_framework.azure import create_agent, get_agent
|
||||
agent1 = await create_agent(client, name="Agent1", ...) # Which agent type, V1 or V2?
|
||||
agent2 = await create_agent(client, name="Agent2", ...) # Repetitive client passing
|
||||
```
|
||||
|
||||
**Option 2 example:**
|
||||
```python
|
||||
from agent_framework.azure import AzureAIProjectAgentProvider
|
||||
provider = AzureAIProjectAgentProvider(client) # Clear which service, client passed once
|
||||
agent1 = await provider.create_agent(name="Agent1", ...)
|
||||
agent2 = await provider.create_agent(name="Agent2", ...)
|
||||
```
|
||||
|
||||
### Method Naming
|
||||
|
||||
| Operation | Python | .NET | Async |
|
||||
|-----------|--------|------|-------|
|
||||
| Create on service | `create_agent()` | `CreateAIAgent()` | Yes |
|
||||
| Get from service | `get_agent(id=...)` | `GetAIAgent(agentId)` | Yes |
|
||||
| Wrap SDK object | `as_agent(reference)` | `AsAIAgent(agentInstance)` | No |
|
||||
|
||||
The method names (`create_agent`, `get_agent`) do not explicitly mention "service" or "remote" because:
|
||||
- In Python, the provider class name explicitly identifies the service (`AzureAIAgentsProvider`, `OpenAIAssistantProvider`), making additional qualifiers in method names redundant.
|
||||
- In .NET, these are extension methods on `AIProjectClient` or `AssistantClient`, which already imply service operations.
|
||||
|
||||
### Provider Class Naming
|
||||
|
||||
| Package | Provider Class | SDK Client | Service |
|
||||
|---------|---------------|------------|---------|
|
||||
| `agent_framework.azure` | `AzureAIProjectAgentProvider` | `AIProjectClient` | Azure AI Agent Service, based on Responses API (V2) |
|
||||
| `agent_framework.azure` | `AzureAIAgentsProvider` | `AgentsClient` | Azure AI Agent Service (V1) |
|
||||
| `agent_framework.openai` | `OpenAIAssistantProvider` | `AsyncOpenAI` | OpenAI Assistants API |
|
||||
|
||||
> **Note:** Azure AI naming is temporary. Final naming will be updated according to Azure AI / Microsoft Foundry renaming decisions.
|
||||
|
||||
### Usage Examples
|
||||
|
||||
#### Azure AI Agent Service V2 (based on Responses API)
|
||||
|
||||
```python
|
||||
from agent_framework.azure import AzureAIProjectAgentProvider
|
||||
from azure.ai.projects import AIProjectClient
|
||||
|
||||
client = AIProjectClient(endpoint, credential)
|
||||
provider = AzureAIProjectAgentProvider(client)
|
||||
|
||||
# Create new agent on service
|
||||
agent = await provider.create_agent(name="MyAgent", model="gpt-4", instructions="...")
|
||||
|
||||
# Get existing agent by name
|
||||
agent = await provider.get_agent(agent_name="MyAgent")
|
||||
|
||||
# Wrap already-fetched SDK object (no HTTP calls)
|
||||
agent_ref = await client.agents.get("MyAgent")
|
||||
agent = provider.as_agent(agent_ref)
|
||||
```
|
||||
|
||||
#### Azure AI Persistent Agents V1
|
||||
|
||||
```python
|
||||
from agent_framework.azure import AzureAIAgentsProvider
|
||||
from azure.ai.agents import AgentsClient
|
||||
|
||||
client = AgentsClient(endpoint, credential)
|
||||
provider = AzureAIAgentsProvider(client)
|
||||
|
||||
agent = await provider.create_agent(name="MyAgent", model="gpt-4", instructions="...")
|
||||
agent = await provider.get_agent(agent_id="persistent-agent-456")
|
||||
agent = provider.as_agent(persistent_agent)
|
||||
```
|
||||
|
||||
#### OpenAI Assistants
|
||||
|
||||
```python
|
||||
from agent_framework.openai import OpenAIAssistantProvider
|
||||
from openai import AsyncOpenAI
|
||||
|
||||
client = AsyncOpenAI()
|
||||
provider = OpenAIAssistantProvider(client)
|
||||
|
||||
agent = await provider.create_agent(name="MyAssistant", model="gpt-4", instructions="...")
|
||||
agent = await provider.get_agent(assistant_id="asst_123")
|
||||
agent = provider.as_agent(assistant)
|
||||
```
|
||||
|
||||
#### Local-Only Agents (No Provider)
|
||||
|
||||
The current `create_agent` (Python) / `CreateAIAgent` (.NET) method can be renamed to `as_agent` (Python) / `AsAIAgent` (.NET) to emphasize conversion rather than creation/initialization logic, and to avoid colliding with the `create_agent` method used for remote calls.
|
||||
|
||||
```python
|
||||
from agent_framework import ChatAgent
|
||||
from agent_framework.openai import OpenAIChatClient
|
||||
|
||||
# Convert chat client to ChatAgent (no remote service involved)
|
||||
client = OpenAIChatClient(model="gpt-4")
|
||||
agent = client.as_agent(name="LocalAgent", instructions="...") # instead of create_agent
|
||||
```
|
||||
|
||||
### Adding New Agent Types
|
||||
|
||||
Python:
|
||||
|
||||
1. Create provider class in appropriate package.
|
||||
2. Implement `create_agent`, `get_agent`, `as_agent` as applicable.
|
||||
|
||||
.NET:
|
||||
|
||||
1. Create static class for extension methods.
|
||||
2. Implement `CreateAIAgentAsync`, `GetAIAgentAsync`, `AsAIAgent` as applicable.
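
For illustration, a minimal sketch of what a new Python provider might look like when following the steps above. The names (`FakeAgentProvider`, `FakeAgentsClient`, `to_chat_client`) are hypothetical and only stand in for whatever SDK the new provider wraps; the real constructor and return types depend on that SDK.

```python
# Hypothetical sketch only: names and signatures are illustrative, not framework APIs.
from agent_framework import ChatAgent


class FakeAgentProvider:
    """Provider for a hypothetical FakeAgents service."""

    def __init__(self, client: "FakeAgentsClient") -> None:
        self._client = client

    async def create_agent(self, *, name: str, model: str, instructions: str) -> ChatAgent:
        # Create the remote agent first, then wrap it locally.
        remote = await self._client.agents.create(name=name, model=model, instructions=instructions)
        return self.as_agent(remote)

    async def get_agent(self, *, agent_id: str) -> ChatAgent:
        # Fetch the remote agent definition, then wrap it locally.
        remote = await self._client.agents.get(agent_id)
        return self.as_agent(remote)

    def as_agent(self, remote) -> ChatAgent:
        # Wrap an already-fetched SDK object without extra HTTP calls.
        return ChatAgent(
            chat_client=self._client.to_chat_client(remote),  # hypothetical helper
            name=remote.name,
            instructions=remote.instructions,
        )
```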
|
||||
docs/decisions/0012-python-typeddict-options.md
|
||||
---
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: proposed
|
||||
contact: eavanvalkenburg
|
||||
date: 2026-01-08
|
||||
deciders: eavanvalkenburg, markwallace-microsoft, sphenry, alliscode, johanst, brettcannon
|
||||
consulted: taochenosu, moonbox3, dmytrostruk, giles17
|
||||
---
|
||||
|
||||
# Leveraging TypedDict and Generic Options in Python Chat Clients
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
The Agent Framework Python SDK provides multiple chat client implementations for different providers (OpenAI, Anthropic, Azure AI, Bedrock, Ollama, etc.). Each provider has unique configuration options beyond the common parameters defined in `ChatOptions`. Currently, developers using these clients lack type safety and IDE autocompletion for provider-specific options, leading to runtime errors and a poor developer experience.
|
||||
|
||||
How can we provide type-safe, discoverable options for each chat client while maintaining a consistent API across all implementations?
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- **Type Safety**: Developers should get compile-time/static analysis errors when using invalid options
|
||||
- **IDE Support**: Full autocompletion and inline documentation for all available options
|
||||
- **Extensibility**: Users should be able to define custom options that extend provider-specific options
|
||||
- **Consistency**: All chat clients should follow the same pattern for options handling
|
||||
- **Provider Flexibility**: Each provider can expose its unique options without affecting the common interface
|
||||
|
||||
## Considered Options
|
||||
|
||||
- **Option 1: Status Quo - Class `ChatOptions` with `**kwargs`**
|
||||
- **Option 2: TypedDict with Generic Type Parameters**
|
||||
|
||||
### Option 1: Status Quo - Class `ChatOptions` with `**kwargs`
|
||||
|
||||
The current approach uses a base `ChatOptions` class with common parameters, and provider-specific options are passed via `**kwargs` or loosely typed dictionaries.
|
||||
|
||||
```python
|
||||
# Current usage - no type safety for provider-specific options
|
||||
response = await client.get_response(
|
||||
messages=messages,
|
||||
temperature=0.7,
|
||||
top_k=40,
|
||||
random=42, # No validation
|
||||
)
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Simple implementation
|
||||
- Maximum flexibility
|
||||
|
||||
**Cons:**
|
||||
- No type checking for provider-specific options
|
||||
- No IDE autocompletion for available options
|
||||
- Runtime errors for typos or invalid options
|
||||
- Documentation must be consulted for each provider
|
||||
|
||||
### Option 2: TypedDict with Generic Type Parameters (Chosen)
|
||||
|
||||
Each chat client is parameterized with a TypeVar bound to a provider-specific `TypedDict` that extends `ChatOptions`. This enables full type safety and IDE support.
|
||||
|
||||
```python
|
||||
# Provider-specific TypedDict
|
||||
class AnthropicChatOptions(ChatOptions, total=False):
|
||||
"""Anthropic-specific chat options."""
|
||||
top_k: int
|
||||
thinking: ThinkingConfig
|
||||
# ... other Anthropic-specific options
|
||||
|
||||
# Generic chat client
|
||||
class AnthropicChatClient(ChatClientBase[TAnthropicChatOptions]):
|
||||
...
|
||||
|
||||
client = AnthropicChatClient(...)
|
||||
|
||||
# Usage with full type safety
|
||||
response = await client.get_response(
|
||||
messages=messages,
|
||||
options={
|
||||
"temperature": 0.7,
|
||||
"top_k": 40,
|
||||
"random": 42, # fails type checking and IDE would flag this
|
||||
}
|
||||
)
|
||||
|
||||
# Users can extend for custom options
|
||||
class MyAnthropicOptions(AnthropicChatOptions, total=False):
|
||||
custom_field: str
|
||||
|
||||
|
||||
client = AnthropicChatClient[MyAnthropicOptions](...)
|
||||
|
||||
# Usage of custom options with full type safety
|
||||
response = await client.get_response(
|
||||
messages=messages,
|
||||
options={
|
||||
"temperature": 0.7,
|
||||
"top_k": 40,
|
||||
"custom_field": "value",
|
||||
}
|
||||
)
|
||||
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Full type safety with static analysis
|
||||
- IDE autocompletion for all options
|
||||
- Compile-time error detection
|
||||
- Self-documenting through type hints
|
||||
- Users can extend options for their specific needs or advances in models
|
||||
|
||||
**Cons:**
|
||||
- More complex implementation
|
||||
- Some type: ignore comments needed for TypedDict field overrides
|
||||
- Minor: Requires TypeVar with default (Python 3.13+ or typing_extensions)
|
||||
|
||||
> [!NOTE]
|
||||
> In .NET this is already achieved through overloads on the `GetResponseAsync` method for each provider-specific options class, e.g., `AnthropicChatOptions`, `OpenAIChatOptions`, etc. So this does not apply to .NET.
|
||||
|
||||
### Implementation Details
|
||||
|
||||
1. **Base Protocol**: `ChatClientProtocol[TOptions]` is generic over options type, with default set to `ChatOptions` (the new TypedDict)
|
||||
2. **Provider TypedDicts**: Each provider defines its options extending `ChatOptions`
|
||||
They can even override fields with type `None` to indicate they are not supported (see the sketch after this list).
|
||||
3. **TypeVar Pattern**: `TProviderOptions = TypeVar("TProviderOptions", bound=TypedDict, default=ProviderChatOptions, contravariant=True)`
|
||||
4. **Option Translation**: Common options are kept in place, and the options class explicitly documents how they are mapped (e.g., `user` → `metadata.user_id` in `_prepare_options` for Anthropic), preserving easy use of common options.
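
A minimal sketch of the override pattern mentioned in point 2, assuming a hypothetical provider whose API has `top_k` but no `user` parameter. The `ChatOptions` stand-in and `FakeProviderChatOptions` are illustrative only, and the `type: ignore` reflects the con noted above for TypedDict field overrides.

```python
from typing import TypedDict


class ChatOptions(TypedDict, total=False):
    # Stand-in for the framework's common ChatOptions TypedDict.
    temperature: float
    user: str


class FakeProviderChatOptions(ChatOptions, total=False):
    top_k: int   # provider-specific option
    user: None   # type: ignore[misc]  # marks `user` as not supported by this provider
```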
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: **"Option 2: TypedDict with Generic Type Parameters"**, because it provides full type safety, excellent IDE support with autocompletion, and allows users to extend provider-specific options for their use cases. The same generic is also applied to `ChatAgent` so that the options used in agent construction and run methods are properly typed.
|
||||
|
||||
See [typed_options.py](../../python/samples/getting_started/chat_client/typed_options.py) for a complete example demonstrating the usage of typed options with custom extensions.
|
||||
docs/decisions/0013-python-get-response-simplification.md
|
||||
---
|
||||
status: Accepted
|
||||
contact: eavanvalkenburg
|
||||
date: 2026-01-06
|
||||
deciders: markwallace-microsoft, dmytrostruk, taochenosu, alliscode, moonbox3, sphenry
|
||||
consulted: sergeymenshykh, rbarreto, dmytrostruk, westey-m
|
||||
informed:
|
||||
---
|
||||
|
||||
# Simplify Python Get Response API into a single method
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
Currently, chat clients must implement two separate methods to get responses: one for streaming and one for non-streaming. This adds complexity to client implementations and increases the maintenance burden. The split was likely made because the .NET version cannot do proper typing with a single method; in Python this is possible, and it is also how the OpenAI Python client works. A single method would make the Python version simpler to work with, because there is only one method to learn about instead of two.
|
||||
|
||||
## Implications of this change
|
||||
|
||||
### Current Architecture Overview
|
||||
|
||||
The current design has **two separate methods** at each layer:
|
||||
|
||||
| Layer | Non-streaming | Streaming |
|
||||
|-------|---------------|-----------|
|
||||
| **Protocol** | `get_response()` → `ChatResponse` | `get_streaming_response()` → `AsyncIterable[ChatResponseUpdate]` |
|
||||
| **BaseChatClient** | `get_response()` (public) | `get_streaming_response()` (public) |
|
||||
| **Implementation** | `_inner_get_response()` (private) | `_inner_get_streaming_response()` (private) |
|
||||
|
||||
### Key Usage Areas Identified
|
||||
|
||||
#### 1. **ChatAgent** (_agents.py)
|
||||
- `run()` → calls `self.chat_client.get_response()`
|
||||
- `run_stream()` → calls `self.chat_client.get_streaming_response()`
|
||||
|
||||
These are parallel methods on the agent, so consolidating the client methods would **not break** the agent API. You could keep `agent.run()` and `agent.run_stream()` unchanged while internally calling `get_response(stream=True/False)`.
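
A minimal sketch of that internal delegation, assuming the consolidated client method described in this ADR (an awaitable response for `stream=False`, an async iterable of updates for `stream=True`); everything else is illustrative.

```python
# Illustrative only: the agent's public surface stays the same while the
# underlying client exposes a single get_response method with a stream flag.
class ChatAgent:
    def __init__(self, chat_client) -> None:
        self.chat_client = chat_client

    async def run(self, messages, **kwargs):
        # Non-streaming path: await the full response.
        return await self.chat_client.get_response(messages, stream=False, **kwargs)

    def run_stream(self, messages, **kwargs):
        # Streaming path: return the AsyncIterable of updates as-is.
        return self.chat_client.get_response(messages, stream=True, **kwargs)
```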
|
||||
|
||||
#### 2. **Function Invocation Decorator** (_tools.py)
|
||||
This is **the most impacted area**. Currently:
|
||||
- `_handle_function_calls_response()` decorates `get_response`
|
||||
- `_handle_function_calls_streaming_response()` decorates `get_streaming_response`
|
||||
- The `use_function_invocation` class decorator wraps **both methods separately**
|
||||
|
||||
**Impact**: The decorator logic is almost identical (~200 lines each) with small differences:
|
||||
- Non-streaming collects response, returns it
|
||||
- Streaming yields updates, returns async iterable
|
||||
|
||||
With a unified method, you'd need **one decorator** that:
|
||||
- Checks the `stream` parameter
|
||||
- Uses `@overload` to determine return type
|
||||
- Handles both paths with conditional logic
|
||||
- The new decorator could be applied just on the method, instead of the whole class.
|
||||
|
||||
This would **reduce code duplication** but add complexity to a single function.
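
A rough sketch of that single wrapper, assuming the unified `get_response` returns an awaitable response when `stream=False` and an `AsyncIterable` of updates when `stream=True`; the tool-handling details are elided and the names are illustrative, not the actual decorator in `_tools.py`.

```python
from functools import wraps


def handle_function_calls(get_response):
    """Single wrapper replacing the two near-duplicate decorators (illustrative only)."""

    @wraps(get_response)
    def wrapper(self, messages, *, stream: bool = False, **kwargs):
        if stream:
            async def streaming():
                async for update in get_response(self, messages, stream=True, **kwargs):
                    # Inspect updates for function calls, invoke tools, re-enter the loop...
                    yield update

            return streaming()

        async def non_streaming():
            response = await get_response(self, messages, stream=False, **kwargs)
            # Inspect the response for function calls, invoke tools, re-enter the loop...
            return response

        return non_streaming()

    return wrapper
```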
|
||||
|
||||
#### 3. **Observability/Instrumentation** (observability.py)
|
||||
Same pattern as function invocation:
|
||||
- `_trace_get_response()` wraps `get_response`
|
||||
- `_trace_get_streaming_response()` wraps `get_streaming_response`
|
||||
- `use_instrumentation` decorator applies both
|
||||
|
||||
**Impact**: Would need consolidation into a single tracing wrapper.
|
||||
|
||||
#### 4. **Chat Middleware** (_middleware.py)
|
||||
The `use_chat_middleware` decorator also wraps both methods separately with similar logic.
|
||||
|
||||
#### 5. **AG-UI Client** (_client.py)
|
||||
Wraps both methods to unwrap server function calls:
|
||||
```python
|
||||
original_get_streaming_response = chat_client.get_streaming_response
|
||||
original_get_response = chat_client.get_response
|
||||
```
|
||||
|
||||
#### 6. **Provider Implementations** (all subpackages)
|
||||
All subclasses implement both `_inner_*` methods, except:
|
||||
- The OpenAI Assistants client (and similar clients, such as Foundry Agents V1), which implements `_inner_get_response` by calling `_inner_get_streaming_response`.
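
For illustration, such a client can derive its non-streaming response by draining its own streaming path, roughly like the hedged sketch below; the class, the placeholder stream, and the string join are all hypothetical stand-ins for the real update and response types.

```python
# Hypothetical sketch: a client whose non-streaming path simply drains the streaming path.
class FakeAssistantsChatClient:
    async def _inner_get_streaming_response(self, messages, **kwargs):
        # Placeholder stream; the real client yields ChatResponseUpdate objects.
        for chunk in ("Hel", "lo"):
            yield chunk

    async def _inner_get_response(self, messages, **kwargs):
        # Collect all streamed updates, then combine them into one response.
        updates = [u async for u in self._inner_get_streaming_response(messages, **kwargs)]
        return "".join(updates)  # the real client merges updates into a full ChatResponse
```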
|
||||
|
||||
### Implications of Consolidation
|
||||
|
||||
| Aspect | Impact |
|
||||
|--------|--------|
|
||||
| **Type Safety** | Overloads work well: `@overload` with `Literal[True]` → `AsyncIterable`, `Literal[False]` → `ChatResponse`. Runtime return type based on `stream` param. |
|
||||
| **Breaking Change** | **Major breaking change** for anyone implementing custom chat clients. They'd need to update from 2 methods to 1 (or 2 inner methods to 1). |
|
||||
| **Decorator Complexity** | All 3 decorator systems (function invocation, middleware, observability) would need refactoring to handle both paths in one wrapper. |
|
||||
| **Code Reduction** | Significant reduction in _tools.py (~200 lines of near-duplicate code) and other decorators. |
|
||||
| **Samples/Tests** | Many samples call `get_streaming_response()` directly - would need updates. |
|
||||
| **Protocol Simplification** | `ChatClientProtocol` goes from 2 methods + 1 property to 1 method + 1 property. |
|
||||
|
||||
### Recommendation
|
||||
|
||||
The consolidation makes sense architecturally, but consider:
|
||||
|
||||
1. **The overload pattern with `stream: bool`** works well in Python typing:
|
||||
```python
|
||||
@overload
|
||||
async def get_response(self, messages, *, stream: Literal[True] = True, ...) -> AsyncIterable[ChatResponseUpdate]: ...
|
||||
@overload
|
||||
async def get_response(self, messages, *, stream: Literal[False] = False, ...) -> ChatResponse: ...
|
||||
```
|
||||
|
||||
2. **The decorator complexity** is the biggest concern. The current approach of separate decorators for separate methods is cleaner than conditional logic inside one wrapper.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- Reduce code needed to implement a Chat Client, simplify the public API for chat clients
|
||||
- Reduce code duplication in decorators and middleware
|
||||
- Maintain type safety and clarity in method signatures
|
||||
|
||||
## Considered Options
|
||||
|
||||
1. Status quo: Keep separate methods for streaming and non-streaming
|
||||
2. Consolidate into a single `get_response` method with a `stream` parameter
|
||||
3. Option 2 plus merging `agent.run` and `agent.run_stream` into a single method with a `stream` parameter as well
|
||||
|
||||
## Option 1: Status Quo
|
||||
- Good: Clear separation of streaming vs non-streaming logic
|
||||
- Good: Aligned with .NET design, although it is already `run` for Python and `RunAsync` for .NET
|
||||
- Bad: Code duplication in decorators and middleware
|
||||
- Bad: More complex client implementations
|
||||
|
||||
## Option 2: Consolidate into Single Method
|
||||
- Good: Simplified public API for chat clients
|
||||
- Good: Reduced code duplication in decorators
|
||||
- Good: Smaller API footprint for users to get familiar with
|
||||
- Good: People using OpenAI directly already expect this pattern
|
||||
- Bad: Increased complexity in decorators and middleware
|
||||
- Bad: Less alignment with .NET design (`get_response(stream=True)` vs `GetStreamingResponseAsync`)
|
||||
|
||||
## Option 3: Consolidate + Merge Agent and Workflow Methods
|
||||
- Good: Further simplifies agent and workflow implementation
|
||||
- Good: Single method for all chat interactions
|
||||
- Good: Smaller API footprint for users to get familiar with
|
||||
- Good: People using OpenAI directly already expect this pattern
|
||||
- Good: Workflows internally already use a single method (`_run_workflow_with_tracing`), so this would eliminate public API duplication as well, with hardly any code changes
|
||||
- Bad: More breaking changes for agent users
|
||||
- Bad: Increased complexity in agent implementation
|
||||
- Bad: More extensive misalignment with .NET design (`run(stream=True)` vs `RunStreamingAsync` in addition to `get_response` change)
|
||||
|
||||
## Misc
|
||||
|
||||
Smaller questions to consider:
|
||||
- Should default be `stream=False` or `stream=True`? (Current is False)
|
||||
- Default to `False` makes it simpler for new users, as non-streaming is easier to handle.
|
||||
- Default to `False` aligns with existing behavior.
|
||||
- Streaming tends to be faster, so defaulting to `True` could improve performance for common use cases.
|
||||
- Should this differ between ChatClient, Agent and Workflows? (e.g., Agent and Workflow defaults to streaming, ChatClient to non-streaming)
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen Option: **Option 3: Consolidate + Merge Agent and Workflow Methods**
|
||||
|
||||
Since this is the most pythonic option and it reduces the API surface and code duplication the most, we will go with this option.
|
||||
We will keep the default of `stream=False` for all methods to maintain backward compatibility and simplicity for new users.
|
||||
|
||||
# Appendix
|
||||
## Code Samples for Consolidated Method
|
||||
|
||||
### Python - Option 3: Direct ChatClient + Agent with Single Method
|
||||
|
||||
```python
|
||||
# Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
import asyncio
|
||||
from random import randint
|
||||
from typing import Annotated
|
||||
|
||||
from agent_framework import ChatAgent
|
||||
from agent_framework.openai import OpenAIChatClient
|
||||
from pydantic import Field
|
||||
|
||||
|
||||
def get_weather(
|
||||
location: Annotated[str, Field(description="The location to get the weather for.")],
|
||||
) -> str:
|
||||
"""Get the weather for a given location."""
|
||||
conditions = ["sunny", "cloudy", "rainy", "stormy"]
|
||||
return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."
|
||||
|
||||
|
||||
async def main() -> None:
|
||||
# Example 1: Direct ChatClient usage with single method
|
||||
client = OpenAIChatClient()
|
||||
message = "What's the weather in Amsterdam and in Paris?"
|
||||
|
||||
# Non-streaming usage
|
||||
print(f"User: {message}")
|
||||
response = await client.get_response(message, tools=get_weather)
|
||||
print(f"Assistant: {response.text}")
|
||||
|
||||
# Streaming usage - same method, different parameter
|
||||
print(f"\nUser: {message}")
|
||||
print("Assistant: ", end="")
|
||||
async for chunk in client.get_response(message, tools=get_weather, stream=True):
|
||||
if chunk.text:
|
||||
print(chunk.text, end="")
|
||||
print("")
|
||||
|
||||
# Example 2: Agent usage with single method
|
||||
agent = ChatAgent(
|
||||
chat_client=client,
|
||||
tools=get_weather,
|
||||
name="WeatherAgent",
|
||||
instructions="You are a weather assistant.",
|
||||
)
|
||||
thread = agent.get_new_thread()
|
||||
|
||||
# Non-streaming agent
|
||||
print(f"\nUser: {message}")
|
||||
result = await agent.run(message, thread=thread) # default would be stream=False
|
||||
print(f"{agent.name}: {result.text}")
|
||||
|
||||
# Streaming agent - same method, different parameter
|
||||
print(f"\nUser: {message}")
|
||||
print(f"{agent.name}: ", end="")
|
||||
async for update in agent.run(message, thread=thread, stream=True):
|
||||
if update.text:
|
||||
print(update.text, end="")
|
||||
print("")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
```
|
||||
|
||||
### .NET - Current pattern for comparison
|
||||
|
||||
```csharp
|
||||
// Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
using Azure.AI.OpenAI;
|
||||
using Azure.Identity;
|
||||
using Microsoft.Agents.AI;
|
||||
using OpenAI.Chat;
|
||||
|
||||
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
|
||||
?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set.");
|
||||
var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT_NAME") ?? "gpt-4o-mini";
|
||||
|
||||
AIAgent agent = new AzureOpenAIClient(
|
||||
new Uri(endpoint),
|
||||
new AzureCliCredential())
|
||||
.GetChatClient(deploymentName)
|
||||
.CreateAIAgent(
|
||||
instructions: "You are good at telling jokes about pirates.",
|
||||
name: "PirateJoker");
|
||||
|
||||
// Non-streaming: Returns a string directly
|
||||
Console.WriteLine("=== Non-streaming ===");
|
||||
string result = await agent.RunAsync("Tell me a joke about a pirate.");
|
||||
Console.WriteLine(result);
|
||||
|
||||
// Streaming: Returns IAsyncEnumerable<AgentUpdate>
|
||||
Console.WriteLine("\n=== Streaming ===");
|
||||
await foreach (AgentUpdate update in agent.RunStreamingAsync("Tell me a joke about a pirate."))
|
||||
{
|
||||
Console.Write(update);
|
||||
}
|
||||
Console.WriteLine();
|
||||
|
||||
```
|
||||
docs/decisions/0014-feature-collections.md
|
||||
---
|
||||
status: accepted
|
||||
contact: westey-m
|
||||
date: 2025-01-21
|
||||
deciders: sergeymenshykh, markwallace, rbarreto, westey-m, stephentoub
|
||||
consulted: reubenbond
|
||||
informed:
|
||||
---
|
||||
|
||||
# Feature Collections
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
When using agents, we often have cases where we want to pass some arbitrary services or data to an agent or some component in the agent execution stack.
|
||||
These services or data are not necessarily known at compile time and can vary by the agent stack that the user has built.
|
||||
E.g., there may be an agent decorator or chat client decorator that was added to the stack by the user, and an arbitrary payload needs to be passed to that decorator.
|
||||
|
||||
Since these payloads are related to components that are not integral parts of the agent framework, they cannot be added as strongly typed settings to the agent run options.
|
||||
However, the payloads could be added to the agent run options as loosely typed 'features', that can be retrieved as needed.
|
||||
|
||||
In some cases certain classes of agents may support the same capability, but not all agents do.
|
||||
Having the configuration for such a capability on the main abstraction would advertise the functionality to all users, even if their chosen agent does not support it.
|
||||
The user may type-test for specific agent types and call overloads on the appropriate agent type with the strongly typed configuration.
A feature collection, though, would be an alternative way of passing such configuration without needing to type-check the agent type.
|
||||
All agents that support the functionality would be able to check for the configuration and use it, simplifying the user code.
|
||||
If the agent does not support the capability, that configuration would be ignored.
|
||||
|
||||
### Sample Scenario 1 - Per Run ChatMessageStore Override for hosting Libraries
|
||||
|
||||
We are building an agent hosting library, that can host any agent built using the agent framework.
|
||||
Where an agent is not built on a service that uses in-service chat history storage, the hosting library wants to force the agent to use
|
||||
the hosting library's chat history storage implementation.
|
||||
This chat history storage implementation may be specifically tailored to the type of protocol that the hosting library uses, e.g. conversation id based storage or response id based storage.
|
||||
The hosting library does not know what type of agent it is hosting, so it cannot provide a strongly typed parameter on the agent.
|
||||
Instead, it adds the chat history storage implementation to a feature collection, and if the agent supports custom chat history storage, it retrieves the implementation from the feature collection and uses it.
|
||||
|
||||
```csharp
|
||||
// Pseudo-code for an agent hosting library that supports conversation id based hosting.
|
||||
public async Task<string> HandleConversationsBasedRequestAsync(AIAgent agent, string conversationId, string userInput)
|
||||
{
|
||||
var thread = await this._threadStore.GetOrCreateThread(conversationId);
|
||||
|
||||
// The hosting library can set a per-run chat message store via Features that only applies for that run.
|
||||
// This message store will load and save messages under the conversation id provided.
|
||||
ConversationsChatMessageStore messageStore = new(this._dbClient, conversationId);
|
||||
var response = await agent.RunAsync(
|
||||
userInput,
|
||||
thread,
|
||||
options: new AgentRunOptions()
|
||||
{
|
||||
Features = new AgentFeatureCollection().WithFeature<ChatMessageStore>(messageStore)
|
||||
});
|
||||
|
||||
await this._threadStore.SaveThreadAsync(conversationId, thread);
|
||||
return response.Text;
|
||||
}
|
||||
|
||||
// Pseudo-code for an agent hosting library that supports response id based hosting.
|
||||
public async Task<(string responseMessage, string responseId)> HandleResponseIdBasedRequestAsync(AIAgent agent, string previousResponseId, string userInput)
|
||||
{
|
||||
var thread = await this._threadStore.GetOrCreateThreadAsync(previousResponseId);
|
||||
|
||||
// The hosting library can set a per-run chat message store via Features that only applies for that run.
|
||||
// This message store will buffer newly added messages until explicitly saved after the run.
|
||||
ResponsesChatMessageStore messageStore = new(this._dbClient, previousResponseId);
|
||||
|
||||
var response = await agent.RunAsync(
|
||||
userInput,
|
||||
thread,
|
||||
options: new AgentRunOptions()
|
||||
{
|
||||
Features = new AgentFeatureCollection().WithFeature<ChatMessageStore>(messageStore)
|
||||
});
|
||||
|
||||
// Since the message store may not actually have been used at all (if the agent's underlying chat client requires service-based chat history storage),
|
||||
// we may not have anything to save back to the database.
|
||||
// We still want to generate a new response id though, so that we can save the updated thread state under that id.
|
||||
// We should also use the same id to save any buffered messages in the message store if there are any.
|
||||
var newResponseId = this.GenerateResponseId();
|
||||
if (messageStore.HasBufferedMessages)
|
||||
{
|
||||
await messageStore.SaveBufferedMessagesAsync(newResponseId);
|
||||
}
|
||||
|
||||
// Save the updated thread state under the new response id that was generated by the store.
|
||||
await this._threadStore.SaveThreadAsync(newResponseId, thread);
|
||||
return (response.Text, newResponseId);
|
||||
}
|
||||
```
|
||||
|
||||
### Sample Scenario 2 - Structured output
|
||||
|
||||
Currently our base abstraction does not support structured output, since the capability is not supported by all agents.
|
||||
For those agents that don't support structured output, we could add an agent decorator that takes the response from the underlying agent, and applies structured output parsing on top of it via an additional LLM call.
|
||||
|
||||
If we add structured output configuration as a feature, then any agent that supports structured output could retrieve the configuration from the feature collection and apply it, and where it is not supported, the configuration would simply be ignored.
|
||||
|
||||
We could add a simple StructuredOutputAgentFeature that can be added to the list of features and also be used to return the generated structured output.
|
||||
|
||||
```csharp
|
||||
internal class StructuredOutputAgentFeature
|
||||
{
|
||||
public Type? OutputType { get; set; }
|
||||
|
||||
public JsonSerializerOptions? SerializerOptions { get; set; }
|
||||
|
||||
public bool? UseJsonSchemaResponseFormat { get; set; }
|
||||
|
||||
// Contains the result of the structured output parsing request.
|
||||
public ChatResponse? ChatResponse { get; set; }
|
||||
}
|
||||
```
|
||||
|
||||
We can add a simple decorator class that does the chat client invocation.
|
||||
|
||||
```csharp
|
||||
public class StructuredOutputAgent : DelegatingAIAgent
|
||||
{
|
||||
private readonly IChatClient _chatClient;
|
||||
public StructuredOutputAgent(AIAgent innerAgent, IChatClient chatClient)
|
||||
: base(innerAgent)
|
||||
{
|
||||
this._chatClient = Throw.IfNull(chatClient);
|
||||
}
|
||||
|
||||
public override async Task<AgentRunResponse> RunAsync(
|
||||
IEnumerable<ChatMessage> messages,
|
||||
AgentThread? thread = null,
|
||||
AgentRunOptions? options = null,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
// Run the inner agent first, to get back the text response we want to convert.
|
||||
var response = await base.RunAsync(messages, thread, options, cancellationToken).ConfigureAwait(false);
|
||||
|
||||
if (options?.Features?.TryGet<StructuredOutputAgentFeature>(out var responseFormatFeature) is true
|
||||
&& responseFormatFeature.OutputType is not null)
|
||||
{
|
||||
// Create the chat options to request structured output.
|
||||
ChatOptions chatOptions = new()
|
||||
{
|
||||
ResponseFormat = ChatResponseFormat.ForJsonSchema(responseFormatFeature.OutputType, responseFormatFeature.SerializerOptions)
|
||||
};
|
||||
|
||||
// Invoke the chat client to transform the text output into structured data.
|
||||
// The feature is updated with the result.
|
||||
// The code can be simplified by adding a non-generic structured output GetResponseAsync
|
||||
// overload that takes Type as input.
|
||||
responseFormatFeature.ChatResponse = await this._chatClient.GetResponseAsync(
|
||||
messages: new[]
|
||||
{
|
||||
new ChatMessage(ChatRole.System, "You are a json expert and when provided with any text, will convert it to the requested json format."),
|
||||
new ChatMessage(ChatRole.User, response.Text)
|
||||
},
|
||||
options: chatOptions,
|
||||
cancellationToken: cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
return response;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Finally, we can add an extension method on `AIAgent` that adds the feature to the run options, checks the feature for the structured output result, and adds the deserialized result to the response.
|
||||
|
||||
```csharp
|
||||
public static async Task<AgentRunResponse<T>> RunAsync<T>(
|
||||
this AIAgent agent,
|
||||
IEnumerable<ChatMessage> messages,
|
||||
AgentThread? thread = null,
|
||||
JsonSerializerOptions? serializerOptions = null,
|
||||
AgentRunOptions? options = null,
|
||||
bool? useJsonSchemaResponseFormat = null,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
// Create the structured output feature.
|
||||
var structuredOutputFeature = new StructuredOutputAgentFeature();
|
||||
structuredOutputFeature.OutputType = typeof(T);
|
||||
structuredOutputFeature.UseJsonSchemaResponseFormat = useJsonSchemaResponseFormat;
|
||||
|
||||
// Run the agent.
|
||||
options ??= new AgentRunOptions();
|
||||
options.Features ??= new AgentFeatureCollection();
|
||||
options.Features.Set(structuredOutputFeature);
|
||||
|
||||
var response = await agent.RunAsync(messages, thread, options, cancellationToken).ConfigureAwait(false);
|
||||
|
||||
// Deserialize the JSON output.
|
||||
if (structuredOutputFeature.ChatResponse is not null)
|
||||
{
|
||||
var typed = new ChatResponse<T>(structuredOutputFeature.ChatResponse, serializerOptions ?? AgentJsonUtilities.DefaultOptions);
|
||||
return new AgentRunResponse<T>(response, typed.Result);
|
||||
}
|
||||
|
||||
throw new InvalidOperationException("No structured output response was generated by the agent.");
|
||||
}
|
||||
```
|
||||
|
||||
We can then use the extension method with any agent that supports structured output or that has
|
||||
been decorated with the `StructuredOutputAgent` decorator.
|
||||
|
||||
```csharp
|
||||
agent = new StructuredOutputAgent(agent, chatClient);
|
||||
|
||||
AgentRunResponse<PersonInfo> response = await agent.RunAsync<PersonInfo>([new ChatMessage(
|
||||
ChatRole.User,
|
||||
"Please provide information about John Smith, who is a 35-year-old software engineer.")]);
|
||||
```
|
||||
|
||||
## Implementation Options
|
||||
|
||||
Three options were considered for implementing feature collections:
|
||||
|
||||
- **Option 1**: FeatureCollections similar to ASP.NET Core
|
||||
- **Option 2**: AdditionalProperties Dictionary
|
||||
- **Option 3**: IServiceProvider
|
||||
|
||||
Here are some comparisons about their suitability for our use case:
|
||||
|
||||
| Criteria | Feature Collection | Additional Properties | IServiceProvider |
|
||||
|------------------|--------------------|-----------------------|------------------|
|
||||
|Ease of use |✅ Good |❌ Bad |✅ Good |
|
||||
|User familiarity |❌ Bad |✅ Good |✅ Good |
|
||||
|Type safety |✅ Good |❌ Bad |✅ Good |
|
||||
|Ability to modify registered options when progressing down the stack|✅ Supported|✅ Supported|❌ Not-Supported (IServiceProvider is read-only)|
|
||||
|Already available in MEAI stack|❌ No|✅ Yes|❌ No|
|
||||
|Ambiguity with existing AdditionalProperties|❌ Yes|✅ No|❌ Yes|
|
||||
|
||||
## IServiceProvider
|
||||
|
||||
Service Collections and Service Providers provide a very popular way to register and retrieve services by type and could be used as a way to pass features to agents and chat clients.
|
||||
|
||||
However, since IServiceProvider is read-only, it is not possible to modify the registered services when progressing down the execution stack.
|
||||
E.g. an agent decorator cannot add additional services to the IServiceProvider passed to it when calling into the inner agent.
|
||||
|
||||
IServiceProvider also does not expose a way to list all services contained in it, making it difficult to copy services from one provider to another.
|
||||
|
||||
This lack of mutability makes IServiceProvider unsuitable for our use case, since we will not be able to use it to build sample scenario 2.
|
||||
|
||||
## AdditionalProperties dictionary
|
||||
|
||||
The AdditionalProperties dictionary is already available on various options classes in the agent framework as well as in the MEAI stack and
|
||||
allows storing arbitrary key/value pairs, where the key is a string and the value is an object.
|
||||
|
||||
While FeatureCollection uses Type as a key, AdditionalProperties uses string keys.
|
||||
This means that users need to agree on string keys to use for specific features; however, it is also possible to use `Type.FullName` as a key by convention
to avoid key collisions, which is an easy convention to follow.
|
||||
|
||||
Since the value of AdditionalProperties is of type object, users need to cast the value to the expected type when retrieving it, which is also
|
||||
a drawback, but when using the convention of using Type.FullName as a key, there is at least a clear expectation of what type to cast to.
|
||||
|
||||
```csharp
|
||||
// Setting a feature
|
||||
options.AdditionalProperties[typeof(MyFeature).FullName] = new MyFeature();
|
||||
|
||||
// Retrieving a feature
|
||||
if (options.AdditionalProperties.TryGetValue(typeof(MyFeature).FullName, out var featureObj)
|
||||
&& featureObj is MyFeature myFeature)
|
||||
{
|
||||
// Use myFeature
|
||||
}
|
||||
```
|
||||
|
||||
It would also be possible to add extension methods to simplify setting and getting features from AdditionalProperties.
|
||||
Having a base class for features should help make this more feature rich.
|
||||
|
||||
```csharp
|
||||
// Setting a feature, this can use Type.FullName as the key.
|
||||
options.AdditionalProperties
|
||||
.WithFeature(new MyFeature());
|
||||
|
||||
// Retrieving a feature, this can use Type.FullName as the key.
|
||||
if (options.AdditionalProperties.TryGetFeature<MyFeature>(out var myFeature))
|
||||
{
|
||||
// Use myFeature
|
||||
}
|
||||
```
|
||||
|
||||
It would also be possible to add extension methods for a feature to simplify setting and getting features from AdditionalProperties.
|
||||
|
||||
```csharp
|
||||
// Setting a feature
|
||||
options.AdditionalProperties
|
||||
.WithMyFeature(new MyFeature());
|
||||
// Retrieving a feature
|
||||
if (options.AdditionalProperties.TryGetMyFeature(out var myFeature))
|
||||
{
|
||||
// Use myFeature
|
||||
}
|
||||
```
|
||||
|
||||
## Feature Collection
|
||||
|
||||
If we choose the feature collection option, we need to decide on the design of the feature collection itself.
|
||||
|
||||
### Feature Collections extension points
|
||||
|
||||
We need to decide the set of actions that feature collections would be supported for. Here is the suggested list of actions:
|
||||
|
||||
**MAAI.AIAgent:**
|
||||
|
||||
1. GetNewThread
|
||||
1. E.g. this would allow passing an already existing storage id for the thread to use, or an initialized custom chat message store to use.
|
||||
1. DeserializeThread
|
||||
1. E.g. this would allow passing an already existing storage id for the thread to use, or an initialized custom chat message store to use.
|
||||
1. Run / RunStreaming
|
||||
1. E.g. this would allow passing an override chat message store just for that run, or a desired schema for a structured output middleware component.
|
||||
|
||||
**MEAI.ChatClient:**
|
||||
|
||||
1. GetResponse / GetStreamingResponse
|
||||
|
||||
### Reconciling with existing AdditionalProperties
|
||||
|
||||
If we decide to add feature collections, separately from the existing AdditionalProperties dictionaries, we need to consider how to explain to users when to use each one.
|
||||
One possible approach, though, is to have one use the other under the hood.
|
||||
AdditionalProperties could be stored as a feature in the feature collection.
|
||||
|
||||
Users would be able to retrieve additional properties from the feature collection, in addition to retrieving it via a dedicated AdditionalProperties property.
|
||||
E.g. `features.Get<AdditionalPropertiesDictionary>()`
|
||||
|
||||
One challenge with this approach is that when setting a value in the AdditionalProperties dictionary, the feature collection would need to be created first if it does not already exist.
|
||||
|
||||
```csharp
|
||||
public class AgentRunOptions
|
||||
{
|
||||
public AdditionalPropertiesDictionary? AdditionalProperties { get; set; }
|
||||
public IAgentFeatureCollection? Features { get; set; }
|
||||
}
|
||||
|
||||
var options = new AgentRunOptions();
|
||||
// This would need to create the feature collection first, if it does not already exist.
|
||||
options.AdditionalProperties ??= new AdditionalPropertiesDictionary();
|
||||
```
|
||||
|
||||
Since IAgentFeatureCollection is an interface, AgentRunOptions would need to create a concrete implementation of the interface internally, meaning that the user cannot choose the implementation.
|
||||
It also means that if the user doesn't realise that AdditionalProperties is implemented using feature collections, they may set a value on AdditionalProperties, and then later overwrite the entire feature collection, losing the AdditionalProperties feature.
|
||||
|
||||
Options to avoid these issues:
|
||||
|
||||
1. Make `Features` readonly.
|
||||
1. This would prevent the user from overwriting the feature collection after setting AdditionalProperties.
|
||||
1. Since the user cannot set their own implementation of IAgentFeatureCollection, having an interface for it may not be necessary.
|
||||
|
||||
### Feature Collection Implementation
|
||||
|
||||
We have two options for implementing feature collections:
|
||||
|
||||
1. Create our own [IAgentFeatureCollection interface](https://github.com/microsoft/agent-framework/pull/2354/files#diff-9c42f3e60d70a791af9841d9214e038c6de3eebfc10e3997cb4cdffeb2f1246d) and [implementation](https://github.com/microsoft/agent-framework/pull/2354/files#diff-a435cc738baec500b8799f7f58c1538e3bb06c772a208afc2615ff90ada3f4ca).
|
||||
2. Reuse the asp.net [IFeatureCollection interface](https://github.com/dotnet/aspnetcore/blob/main/src/Extensions/Features/src/IFeatureCollection.cs) and [implementation](https://github.com/dotnet/aspnetcore/blob/main/src/Extensions/Features/src/FeatureCollection.cs).
|
||||
|
||||
#### Roll our own
|
||||
|
||||
Advantages:
|
||||
|
||||
Creating our own IAgentFeatureCollection interface and implementation has the advantage of being more clearly associated with the agent framework and allows us to
|
||||
improve on some of the design decisions made in asp.net core's IFeatureCollection.
|
||||
|
||||
Drawbacks:
|
||||
|
||||
It would mean a different implementation to maintain and test.
|
||||
|
||||
#### Reuse asp.net IFeatureCollection
|
||||
|
||||
Advantages:
|
||||
|
||||
Reusing the asp.net IFeatureCollection has the advantage of being able to reuse the well-established and tested implementation from asp.net
|
||||
core. Users who are using agents in an asp.net core application may be able to pass feature collections from asp.net core to the agent framework directly.
|
||||
|
||||
Drawbacks:
|
||||
|
||||
While the package name is `Microsoft.Extensions.Features`, the namespaces of the types are `Microsoft.AspNetCore.Http.Features`, which may create confusion for users of agent framework who are not building web applications or services.
|
||||
Users may rightly ask: Why do I need to use a class from asp.net core when I'm not building a web application / service?
|
||||
|
||||
The current design has some issues that would be good to avoid, e.g. it does not distinguish between a feature being "not set" and `null`: `Get` returns both as `null`, and there is no `TryGet` method.
|
||||
Since the [default implementation](https://github.com/dotnet/aspnetcore/blob/main/src/Extensions/Features/src/FeatureCollection.cs) also supports value types, it throws for null values of value types.
|
||||
A TryGet method would be more appropriate.
|
||||
|
||||
## Feature Layering
|
||||
|
||||
One possible scenario when adding support for feature collections is to allow layering of features by scope.
|
||||
|
||||
The following levels of scope could be supported:
|
||||
|
||||
1. Application - Application wide features that apply to all agents / chat clients
|
||||
2. Artifact (Agent / ChatClient) - Features that apply to all runs of a specific agent or chat client instance
|
||||
3. Action (GetNewThread / Run / GetResponse) - Feature that apply to a single action only
|
||||
|
||||
When retrieving a feature from the collection, the search would start from the most specific scope (Action) and progress to the least specific scope (Application), returning the first matching feature found.
|
||||
|
||||
Introducing layering adds some challenges:
|
||||
|
||||
- There may be multiple feature collections at the same scope level, e.g. an Agent that uses a ChatClient where both have their own feature collections.
|
||||
- Do we layer the agent feature collection over the chat client feature collection (Application -> ChatClient -> Agent -> Run), or only use the agent feature collection in the agent (Application -> Agent -> Run), and the chat client feature collection in the chat client (Application -> ChatClient -> Run)?
|
||||
- The appropriate base feature collection may change when progressing down the stack, e.g. when an Agent calls a ChatClient, the action feature collection stays the same, but the artifact feature collection changes.
|
||||
- Who creates the feature collection hierarchy?
|
||||
- Since the hierarchy changes as it progresses down the execution stack, and the caller can only pass in the action level feature collection, the callee needs to combine it with its own artifact level feature collection and the application level feature collection. Each action will need to build the appropriate feature collection hierarchy, at the start of its execution.
|
||||
- For Artifact level features, it seems odd to pass them in as a bag of untyped features, when we are constructing a known artifact type and therefore can have typed settings.
|
||||
- E.g. today we have a strongly typed setting on ChatClientAgentOptions to configure a ChatMessageStore for the agent.
|
||||
- To avoid global statics for application level features, the user would need to pass in the application level feature collection to each artifact that they create.
|
||||
- This would be very odd if the user also already has strongly typed settings for each feature that they want to set at the artifact level.
|
||||
|
||||
### Layering Options
|
||||
|
||||
1. No layering - only a single feature collection is supported per action (the caller can still create a layered collection if desired, but the callee does not do any layering automatically).
|
||||
1. Fallback is to any features configured on the artifact via strongly typed settings.
|
||||
1. Full layering - support layering at all levels (Application -> Artifact -> Action).
|
||||
1. Only apply applicable artifact level features when calling into that artifact.
|
||||
1. Apply upstream artifact features when calling into downstream artifacts, e.g. Feature hierarchy in ChatClientAgent would be `Application -> Agent -> Run` and in ChatClient would be `Application -> ChatClient -> Agent -> Run` or `Application -> Agent -> ChatClient -> Run`
|
||||
1. The user needs to provide the application level feature collection to each artifact that they create and artifact features are passed via strongly typed settings.
|
||||
|
||||
### Accessing application level features Options
|
||||
|
||||
We need to consider how application level features would be accessed if supported.
|
||||
|
||||
1. The user provides the application level feature collection to each artifact that the user constructs
|
||||
1. Passing the application level feature collection to each artifact is tedious for the user.
|
||||
1. There is a static application level feature collection that can be accessed globally.
|
||||
1. Statics create issues with testing and isolation.
|
||||
|
||||
## Decisions
|
||||
|
||||
- Feature Collections Container: Use AdditionalProperties
|
||||
- Feature Layering: No layering - only a single collection/dictionary is supported per action. Application layers can be added later if needed.
|
||||
docs/decisions/README.md
|
||||
# Architectural Decision Records (ADRs)
|
||||
|
||||
An Architectural Decision (AD) is a justified software design choice that addresses a functional or non-functional requirement that is architecturally significant. An Architectural Decision Record (ADR) captures a single AD and its rationale.
|
||||
|
||||
For more information, see [adr.github.io](https://adr.github.io/).
|
||||
|
||||
## How are we using ADRs to track technical decisions?
|
||||
|
||||
1. Copy docs/decisions/adr-template.md to docs/decisions/NNNN-title-with-dashes.md, where NNNN indicates the next number in sequence.
|
||||
1. Check for existing PRs to make sure you use the correct sequence number.
|
||||
2. There is also a short form template docs/decisions/adr-short-template.md
|
||||
2. Edit NNNN-title-with-dashes.md.
|
||||
1. Status must initially be `proposed`
|
||||
2. List of `deciders` must include the github ids of the people who will sign off on the decision.
|
||||
3. The relevant EM and architect must be listed as deciders or informed of all decisions.
|
||||
4. You should list the names or github ids of all partners who were consulted as part of the decision.
|
||||
5. Keep the list of `deciders` short. You can also list people who were `consulted` or `informed` about the decision.
|
||||
3. For each option list the good, neutral and bad aspects of each considered alternative.
|
||||
1. Detailed investigations can be included in the `More Information` section inline or as links to external documents.
|
||||
4. Share your PR with the deciders and other interested parties.
|
||||
1. Deciders must be listed as required reviewers.
|
||||
2. The status must be updated to `accepted` once a decision is agreed and the date must also be updated.
|
||||
3. Approval of the decision is captured using PR approval.
|
||||
5. Decisions can be changed later and superseded by a new ADR. In this case it is useful to record any negative outcomes in the original ADR.
|
||||
docs/decisions/adr-short-template.md
---
# These are optional elements. Feel free to remove any of them.
status: {proposed | rejected | accepted | deprecated | … | superseded by [ADR-0001](0001-madr-architecture-decisions.md)}
contact: {person proposing the ADR}
date: {YYYY-MM-DD when the decision was last updated}
deciders: {list everyone involved in the decision}
consulted: {list everyone whose opinions are sought (typically subject-matter experts); and with whom there is a two-way communication}
informed: {list everyone who is kept up-to-date on progress; and with whom there is a one-way communication}
---

# {short title of solved problem and solution}

## Context and Problem Statement

{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story.
You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}

<!-- This is an optional element. Feel free to remove. -->

## Decision Drivers

- {decision driver 1, e.g., a force, facing concern, …}
- {decision driver 2, e.g., a force, facing concern, …}
- … <!-- numbers of drivers can vary -->

## Considered Options

- {title of option 1}
- {title of option 2}
- {title of option 3}
- … <!-- numbers of options can vary -->

## Decision Outcome

Chosen option: "{title of option 1}", because
{justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.

87
docs/decisions/adr-template.md
Normal file
@@ -0,0 +1,87 @@
---
# These are optional elements. Feel free to remove any of them.
status: {proposed | rejected | accepted | deprecated | … | superseded by [ADR-0001](0001-madr-architecture-decisions.md)}
contact: {person proposing the ADR}
date: {YYYY-MM-DD when the decision was last updated}
deciders: {list everyone involved in the decision}
consulted: {list everyone whose opinions are sought (typically subject-matter experts); and with whom there is a two-way communication}
informed: {list everyone who is kept up-to-date on progress; and with whom there is a one-way communication}
---

# {short title of solved problem and solution}

## Context and Problem Statement

{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story.
You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}

<!-- This is an optional element. Feel free to remove. -->

## Decision Drivers

- {decision driver 1, e.g., a force, facing concern, …}
- {decision driver 2, e.g., a force, facing concern, …}
- … <!-- numbers of drivers can vary -->

## Considered Options

- {title of option 1}
- {title of option 2}
- {title of option 3}
- … <!-- numbers of options can vary -->

## Decision Outcome

Chosen option: "{title of option 1}", because
{justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.

<!-- This is an optional element. Feel free to remove. -->

### Consequences

- Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
- Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
- … <!-- numbers of consequences can vary -->

<!-- This is an optional element. Feel free to remove. -->

## Validation

{describe how the implementation of/compliance with the ADR is validated. E.g., by a review or an ArchUnit test}

<!-- This is an optional element. Feel free to remove. -->

## Pros and Cons of the Options

### {title of option 1}

<!-- This is an optional element. Feel free to remove. -->

{example | description | pointer to more information | …}

- Good, because {argument a}
- Good, because {argument b}
<!-- use "neutral" if the given argument weights neither for good nor bad -->
- Neutral, because {argument c}
- Bad, because {argument d}
- … <!-- numbers of pros and cons can vary -->

### {title of other option}

{example | description | pointer to more information | …}

- Good, because {argument a}
- Good, because {argument b}
- Neutral, because {argument c}
- Bad, because {argument d}
- …

<!-- This is an optional element. Feel free to remove. -->

## More Information

{You might want to provide additional evidence/confidence for the decision outcome here and/or
document the team agreement on the decision and/or
define when and how the decision should be realized and if/when it should be re-visited and/or
how the decision is validated.
Links to other decisions and resources might appear here as well.}

273
docs/design/python-package-setup.md
Normal file
@@ -0,0 +1,273 @@
# Python Package design for Agent Framework

## Design goals

* Developer experience is key
  * the components needed for a basic agent with tools and a runtime should be importable from `agent_framework` without having to import from subpackages. This will be referred to as _tier 0_ components.
  * for more advanced components, _tier 1_ components, such as context providers, guardrails, vector data, text search, exceptions, evaluation, utils, telemetry and workflows, they should be importable from `agent_framework.<component>`, so for instance `from agent_framework.vector_data import vectorstoremodel`.
  * parts of the package that are either additional functionality or integrations with other services (connectors) are referred to as _tier 2_; they should also be importable from `agent_framework.<component>`, so for instance `from agent_framework.openai import OpenAIClient`.
  * this means that the package structure is flat, and the components are grouped by functionality, not by type, so for instance `from agent_framework.openai import OpenAIChatClient` will import the OpenAI chat client, but also the OpenAI tools, and any other OpenAI related functionality.
  * There should not be a need for deeper imports from those packages, unless a good case is made for that. The internals of the extensions packages should always be a folder with the name of the package, an `__init__.py` and one or more `_files.py` files, where the `_files.py` files contain the implementation details, and the `__init__.py` file exposes the public interface.
  * if a single file becomes too cumbersome (files are allowed to be 1k+ lines) it should be split into a folder with an `__init__.py` that exposes the public interface and one or more `_files.py` that contain the implementation details, with an `__all__` in the init to expose the right things. If very large dependencies are being loaded, it can optionally use lazy loading to avoid loading the entire package when importing a single component.
  * as much as possible, related things are in a single file, which makes understanding the code easier.
  * simple and straightforward logging and telemetry setup, so developers can easily add logging and telemetry to their code without having to worry about the details.
* Independence of connectors
  * To allow connectors to be treated as independent packages, we will use namespace packages for connectors; in principle this only includes the packages that we will develop in our repo, since that is easy to manage and maintain.
  * further advantages are that each package can have an independent lifecycle, versioning, and dependencies.
  * and this gives us insights into the usage, through pip install statistics, especially for connectors to services outside of Microsoft.
  * the goal is to group related connectors based on vendors, not on types, so for instance doing `import agent_framework.google` will import connectors for all Google services, such as `GoogleChatClient` but also `BigQueryCollection`, etc.
  * All dependencies for a subpackage should be required dependencies in that package, and that package becomes an optional dependency in the main package as an _extra_ with the same name, so in the main `pyproject.toml` we will have:

    ```toml
    [project.optional-dependencies]
    google = [
        "agent-framework-google == 1.0.0"
    ]
    ```

  * this means developers can use `pip install agent-framework[google] --pre` to get AF with all Google connectors and dependencies, as well as manually installing the subpackage with `pip install agent-framework-google --pre`.

### Sample getting started code

```python
from typing import Annotated

from agent_framework import Agent, ai_function
from agent_framework.openai import OpenAIChatClient


@ai_function(description="Get the current weather in a given location")
async def get_weather(location: Annotated[str, "The location as a city name"]) -> str:
    """Get the current weather in a given location."""
    # Implementation of the tool to get weather
    return f"The current weather in {location} is sunny."


agent = Agent(
    name="MyAgent",
    model_client=OpenAIChatClient(),
    tools=get_weather,
    description="An agent that can get the current weather.",
)
response = await agent.run("What is the weather in Amsterdam?")
print(response)
```

## Global Package structure

Overall the following structure is proposed:

* agent-framework
  * core components, will be exposed directly from `agent_framework`:
    * (single) agents (includes threads)
    * tools (includes MCP and OpenAPI)
    * types
    * context_providers
    * logging
    * workflows (includes multi-agent orchestration)
    * middleware
    * telemetry (user_agent)
  * advanced components, will be exposed from `agent_framework.<component>`:
    * vector_data (tbd, vector stores and other MEVD-like pieces)
    * text_search (tbd)
    * exceptions
    * evaluations (tbd)
    * utils (optional)
    * observability
  * vendor folders with connectors and integrations, will be exposed from `agent_framework.<vendor>`:
    * Code can be both in folder or in subpackage with lazy import.
    * See subpackage scope below for more detail
  * tests
  * samples
* extensions
  * azure
  * ...

All the `__init__.py` files in the subpackages will use lazy loading to avoid importing the entire package when importing a single component; a minimal sketch of what that can look like is shown below.
Internal imports will be done using relative imports, so that the package can be used as a namespace package.
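
As a rough sketch of what such a lazy-loading `__init__.py` can look like, assuming a module-level `__getattr__` (PEP 562); the `_chat_client` module and `OpenAIChatClient` name are used purely as examples, not as the final layout:

```python
# agent_framework/openai/__init__.py - illustrative sketch, not the actual implementation
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type checkers see the real import without the runtime cost.
    from ._chat_client import OpenAIChatClient

__all__ = ["OpenAIChatClient"]

# Map public names to the private modules that implement them.
_LAZY_IMPORTS = {"OpenAIChatClient": "._chat_client"}


def __getattr__(name: str):
    """Import the implementation module only when the attribute is first accessed."""
    if name in _LAZY_IMPORTS:
        from importlib import import_module

        return getattr(import_module(_LAZY_IMPORTS[name], __package__), name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```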
### File structure

The resulting file structure will be as follows (not all things currently implemented, just an example):

```plaintext
packages/
    main/
        agent_framework/
            azure/
                __init__.py
                _chat_client.py
                ...
            microsoft/
                __init__.py
                _copilot_studio.py
                ...
            openai/
                __init__.py
                _chat_client.py
                _shared.py
                exceptions.py
            __init__.py
            __init__.pyi
            _agents.py
            _tools.py
            _models.py
            _logging.py
            _middleware.py
            _telemetry.py
            observability.py
            exceptions.py
            utils.py
            py.typed
            _workflow/
                __init__.py
                _workflow.py
                ...etc...
        tests/
            unit/
                test_types.py
            integration/
                test_chat_clients.py
        pyproject.toml
        README.md
        ...
    azure-ai-agents/
        agent_framework-azure-ai-agents/
            __init__.py
            _chat_client.py
            ...
        tests/
            test_azure_ai_agents.py
        samples/ (optional)
            ...
        pyproject.toml
        README.md
        ...
    redis/
        ...
    mem0/
        agent_framework-mem0/
            __init__.py
            _provider.py
            ...
        tests/
            test_mem0_provider.py
        samples/ (optional)
            ...
        pyproject.toml
        README.md
        ...
    ...
samples/
    ...
pyproject.toml
README.md
LICENSE
uv.lock
.pre-commit-config.yaml
```

We might add a template subpackage as well, to make it easy to set up new subpackages; this could be based on the first one that is added.

In the [`DEV_SETUP.md`](../../python/DEV_SETUP.md) we will add instructions for how to deal with the path depth issues, especially on Windows, where the maximum path length can be a problem.

### Subpackage scope

Sub-packages consist of two parts, the code itself and the dependencies; the choice of when to use a subpackage and when to use an extra in the main package is based on the status of the dependencies and/or the possibility of an external support mechanism. What this means is that:

- Integrations that need non-GA dependencies will be sub-packages and installed only when using an extra, so that we can avoid having non-GA dependencies in the main package.
- Integrations where the AF code is still experimental, preview or release candidate will be sub-packages, so that we can avoid having non-GA code in the main package and we can version those packages properly.
- Integrations that are outside Microsoft, and where we might not always be able to fast-follow breaking changes, will stay as sub-packages, to provide some isolation and to be able to version them properly.
- Integrations that are mature and that have released (GA) dependencies and features on the service side will be moved into the main package; the dependencies of those packages will stay installable under the same `extra` name, so that users do not have to change anything, and we then remove the subpackage itself.
- All subpackage imports in the code should be from a stable place, mostly vendor-based, so that when something moves from a subpackage to the main package, the import path does not change, so `from agent_framework.microsoft import CopilotAgent` will always work, even if it moves from the `agent-framework-microsoft-copilot` package to the main `agent-framework` package.
- The imports in those vendor namespaces (these won't be actual python namespaces, just folders with an `__init__.py` file and any code) will do lazy loading and raise a meaningful error if the subpackage or dependencies are not installed, so that users know which extra to install with ease (see the sketch after this list).
- On a case by case basis we can decide to create an additional `extra` that combines multiple sub-packages and dependencies into one, so that users who work primarily with one platform can install everything they need with a single extra. For example (not implemented), you could install with an `agent-framework[azure-purview]` extra that only implements a `PurviewMiddleware`, or with an `agent-framework[azure]` extra that includes all Azure related connectors, like `purview`, `content-safety` and others (all examples, not actual packages); regardless of where the code sits, these should always be importable from `agent_framework.azure`.
- Subpackage naming should also follow this, so in principle a package name is `<vendor/folder>-<feature/brand>`, so `google-gemini`, `azure-purview`, `microsoft-copilotstudio`, etc. For smaller vendors, where it's less likely to have a multitude of connectors, we can skip the feature/brand part, so `mem0`, `redis`, etc.
- For Microsoft services we will have two vendor folders, `azure` and `microsoft`, where `azure` contains all Azure services, while `microsoft` contains other Microsoft services, such as Copilot Studio Agents.
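
A minimal sketch of that lazy loading plus meaningful-error behavior in a vendor `__init__.py`; the `mem0` vendor, `_provider` module and `Mem0Provider` class are assumed examples, not confirmed names:

```python
# agent_framework/mem0/__init__.py - illustrative sketch, not the actual implementation
__all__ = ["Mem0Provider"]


def __getattr__(name: str):
    """Load the implementation lazily and fail with install guidance when it is missing."""
    if name in __all__:
        try:
            # The implementation lives in the optional agent-framework-mem0 subpackage.
            from ._provider import Mem0Provider

            return Mem0Provider
        except ImportError as exc:
            raise ImportError(
                "The mem0 integration is not installed. Install it with "
                "`pip install agent-framework[mem0]` or `pip install agent-framework-mem0`."
            ) from exc
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```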

This setup was discussed at length and the decision is captured in [ADR-0008](../decisions/0008-python-subpackages.md).

#### Evolving the package structure

For each of the advanced components, we have two reasons why we may split them into a folder, with an `__init__.py` and optionally one or more `_files.py`:
1. If the file becomes too large, we can split it into multiple `_files`, while still keeping the public interface in the `__init__.py` file; this is a non-breaking change.
2. If we want to partially or fully move that code into a separate package.

In this case we do need to lazy load anything that was moved from the main package to the subpackage, so that existing code still works, and if the subpackage is not installed we can raise a meaningful error.

## Coding standards

Coding standards will be maintained in the [`DEV_SETUP.md`](../../python/DEV_SETUP.md) file.

### Tooling

uv and ruff are the main tools, for package management and code formatting/linting respectively.

#### Type checking

We can currently choose between mypy, pyright, ty and pyrefly for static type checking.
I propose we run `mypy` and `pyright` in GHA, similar to what AG already does. We might explore newer tools at a later date.

#### Task runner

AG already has experience with poe the poet, so let's start there, removing the Makefile setup that SK uses.

### Unit test coverage

The goal is to have at least 80% unit test coverage for all code under both the main package and the subpackages.

### Telemetry and logging

Telemetry and logging are handled by the `agent_framework.telemetry` and `agent_framework._logging` packages.

#### Logging

Logging is considered part of the basic setup, while telemetry is an advanced concept.
The telemetry package will use OpenTelemetry to provide a consistent way to collect and export telemetry data, similar to how we do this now in SK.

The logging will be simplified; there will be one logger in the base package:
* name: `agent_framework` - used for all logging in the abstractions and base components

Each of the other subpackages for connectors will have a similar single logger.
* name: `agent_framework.openai`
* name: `agent_framework.azure`

This means that when a logger is needed, it should be created like this:
```python
from agent_framework import get_logger

logger = get_logger()
# or in a subpackage:
logger = get_logger('agent_framework.openai')
```

The implementation should be something like this:
```python
# in file _logging.py
import logging


def get_logger(name: str = "agent_framework") -> logging.Logger:
    """
    Get a logger with the specified name, defaulting to 'agent_framework'.

    Args:
        name (str): The name of the logger. Defaults to 'agent_framework'.

    Returns:
        logging.Logger: The configured logger instance.
    """
    logger = logging.getLogger(name)
    # create the specifics for the logger, such as setting the level, handlers, etc.
    return logger
```

This will ensure that the logger is created with the correct name and configuration, and it will be consistent across the package.

Further, there should be an easy way to configure the log levels, either through an environment variable or with a function similar to `get_logger`; a possible shape is sketched below.
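
A minimal sketch of such a helper, assuming an environment variable named `AGENT_FRAMEWORK_LOG_LEVEL` (the variable name and the function name are illustrative, not decided):

```python
# in file _logging.py - illustrative sketch, names are not final
import logging
import os


def setup_logging(level: int | str | None = None, name: str = "agent_framework") -> None:
    """Configure the log level for the named logger.

    Falls back to the AGENT_FRAMEWORK_LOG_LEVEL environment variable (assumed name)
    and finally to WARNING when nothing is provided.
    """
    resolved = level or os.environ.get("AGENT_FRAMEWORK_LOG_LEVEL", "WARNING")
    logging.getLogger(name).setLevel(resolved)
```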

This will not be allowed:
```python
import logging

logger = logging.getLogger(__name__)
```

The following is allowed but discouraged: if the `get_logger` function has been called at least once, this returns the same logger as `get_logger` would; however, if that has not happened, the logging experience (in terms of formats, handlers, etc.) is not consistent across the package:
```python
import logging

logger = logging.getLogger("agent_framework")
```

#### Telemetry

Telemetry will be based on OpenTelemetry (OTel), and will be implemented in the `agent_framework.telemetry` package.

We will also add headers with user-agent strings where applicable; these will include `agent-framework-python` and the version.

We should consider auto-instrumentation and provide an implementation of it to the OTel community.
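
As a rough, non-authoritative illustration of what OTel-based instrumentation around an agent run could look like (the span name `agent.run` and the attribute names below are placeholders, not agreed semantic conventions):

```python
# Illustrative sketch only; span and attribute names are placeholders.
from opentelemetry import trace

tracer = trace.get_tracer("agent_framework")


async def run_with_tracing(agent, message: str):
    """Run an agent inside a span so the call shows up in exported traces."""
    with tracer.start_as_current_span("agent.run") as span:
        span.set_attribute("agent.name", getattr(agent, "name", "unknown"))
        return await agent.run(message)
```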

### Build and release

The build step will be done in GHA, adding the package to the release; we then call into Azure DevOps to use the ESRP pipeline to publish to PyPI. This is how SK already works; we will just have to adapt it to the new package structure.

For now we will stick to semantic versioning, and all preview releases will be tagged as such.

147
docs/features/durable-agents/durable-agents-ttl.md
Normal file
@@ -0,0 +1,147 @@
# Time-To-Live (TTL) for durable agent sessions

## Overview

The durable agents automatically maintain conversation history and state for each session. Without automatic cleanup, this state can accumulate indefinitely, consuming storage resources and increasing costs. The Time-To-Live (TTL) feature provides automatic cleanup of idle agent sessions, ensuring that sessions are automatically deleted after a period of inactivity.

## What is TTL?

Time-To-Live (TTL) is a configurable duration that determines how long an agent session state will be retained after its last interaction. When an agent session is idle (no messages sent to it) for longer than the TTL period, the session state is automatically deleted. Each new interaction with an agent resets the TTL timer, extending the session's lifetime.

## Benefits

- **Automatic cleanup**: No manual intervention required to clean up idle agent sessions
- **Cost optimization**: Reduces storage costs by automatically removing unused session state
- **Resource management**: Prevents unbounded growth of agent session state in storage
- **Configurable**: Set TTL globally or per-agent type to match your application's needs

## Configuration

TTL can be configured at two levels:

1. **Global default TTL**: Applies to all agent sessions unless overridden
2. **Per-agent type TTL**: Overrides the global default for specific agent types

Additionally, you can configure a **minimum deletion delay** that controls how frequently deletion operations are scheduled. The default value is 5 minutes, and the maximum allowed value is also 5 minutes.

> [!NOTE]
> Reducing the minimum deletion delay below 5 minutes can be useful for testing or for ensuring rapid cleanup of short-lived agent sessions. However, this can also increase the load on the system and should be used with caution.

### Default values

- **Default TTL**: 14 days
- **Minimum TTL deletion delay**: 5 minutes (maximum allowed value, subject to change in future releases)

### Configuration examples

#### .NET

```csharp
// Configure global default TTL and minimum signal delay
services.ConfigureDurableAgents(
    options =>
    {
        // Set global default TTL to 7 days
        options.DefaultTimeToLive = TimeSpan.FromDays(7);

        // Add agents (will use global default TTL)
        options.AddAIAgent(myAgent);
    });

// Configure per-agent TTL
services.ConfigureDurableAgents(
    options =>
    {
        options.DefaultTimeToLive = TimeSpan.FromDays(14); // Global default

        // Agent with custom TTL of 1 day
        options.AddAIAgent(shortLivedAgent, timeToLive: TimeSpan.FromDays(1));

        // Agent with custom TTL of 90 days
        options.AddAIAgent(longLivedAgent, timeToLive: TimeSpan.FromDays(90));

        // Agent using global default (14 days)
        options.AddAIAgent(defaultAgent);
    });

// Disable TTL for specific agents by setting TTL to null
services.ConfigureDurableAgents(
    options =>
    {
        options.DefaultTimeToLive = TimeSpan.FromDays(14);

        // Agent with no TTL (never expires)
        options.AddAIAgent(permanentAgent, timeToLive: null);
    });
```

## How TTL works

The following sections describe how TTL works in detail.

### Expiration tracking

Each agent session maintains an expiration timestamp in its internally managed state that is updated whenever the session processes a message:

1. When a message is sent to an agent session, the expiration time is set to `current time + TTL`
2. The runtime schedules a delete operation for the expiration time (subject to minimum delay constraints)
3. When the delete operation runs, if the current time is past the expiration time, the session state is deleted. Otherwise, the delete operation is rescheduled for the next expiration time.
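
The following is illustrative pseudocode (written as Python) of that bookkeeping; the `session` object and its `schedule_deletion_check`/`delete_state` methods are hypothetical names, not the actual runtime API:

```python
# Illustrative pseudocode of TTL bookkeeping; not the actual runtime implementation.
from datetime import datetime, timedelta, timezone

MIN_DELETION_DELAY = timedelta(minutes=5)


def on_message(session, ttl: timedelta) -> None:
    """Steps 1-2: push the expiration forward and schedule a deletion check."""
    now = datetime.now(timezone.utc)
    session.expires_at = now + ttl
    session.schedule_deletion_check(at=now + max(ttl, MIN_DELETION_DELAY))


def on_deletion_check(session) -> None:
    """Step 3: delete if expired, otherwise reschedule for the current expiration time."""
    if datetime.now(timezone.utc) >= session.expires_at:
        session.delete_state()
    else:
        session.schedule_deletion_check(at=session.expires_at)
```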

### State deletion

When an agent session expires, its entire state is deleted, including:

- Conversation history
- Any custom state data
- Expiration timestamps

After deletion, if a message is sent to the same agent session, a new session is created with a fresh conversation history.

## Behavior examples

The following examples illustrate how TTL works in different scenarios.

### Example 1: Agent session expires after TTL

1. Agent configured with 30-day TTL
2. User sends message at Day 0 → agent session created, expiration set to Day 30
3. No further messages sent
4. At Day 30 → Agent session is deleted
5. User sends message at Day 31 → New agent session created with fresh conversation history

### Example 2: TTL reset on interaction

1. Agent configured with 30-day TTL
2. User sends message at Day 0 → agent session created, expiration set to Day 30
3. User sends message at Day 15 → Expiration reset to Day 45
4. User sends message at Day 40 → Expiration reset to Day 70
5. Agent session remains active as long as there are regular interactions

## Logging

The TTL feature includes comprehensive logging to track state changes:

- **Expiration time updated**: Logged when TTL expiration time is set or updated
- **Deletion scheduled**: Logged when a deletion check signal is scheduled
- **Deletion check**: Logged when a deletion check operation runs
- **Session expired**: Logged when an agent session is deleted due to expiration
- **TTL rescheduled**: Logged when a deletion signal is rescheduled

These logs help monitor TTL behavior and troubleshoot any issues.

## Best practices

1. **Choose appropriate TTL values**: Balance between storage costs and user experience. Too-short TTLs may delete active sessions, while too-long TTLs may accumulate unnecessary state.

2. **Use per-agent TTLs**: Different agents may have different usage patterns. Configure TTLs per-agent based on expected session lifetimes.

3. **Monitor expiration logs**: Review logs to understand TTL behavior and adjust configuration as needed.

4. **Test with short TTLs**: During development, use short TTLs (e.g., minutes) to verify TTL behavior without waiting for long periods.

## Limitations

- TTL is based on wall-clock time, not activity time. The expiration timer starts from the last message timestamp.
- Deletion checks are durably scheduled operations and may have slight delays depending on system load.
- Once an agent session is deleted, its conversation history cannot be recovered.
- TTL deletion requires at least one worker to be available to process the deletion operation message.

291
docs/specs/001-foundry-sdk-alignment.md
Normal file
@@ -0,0 +1,291 @@
---
# These are optional elements. Feel free to remove any of them.
status: accepted
contact: markwallace
date: 2025-08-06
deciders: markwallace-microsoft, westey-m, quibitron
consulted: shawnhenry, elijahstraight
informed:
---

# Agent Framework / Foundry SDK Alignment

Agent Framework and Foundry SDK have overlapping functionality but serve different audiences & scenarios.
This specification clarifies the positioning of these SDKs to customers, what goes in each and when to use what.

- **Foundry SDK** is a thin-client SDK for accessing everything available in the agent service and is autogenerated from REST APIs in multiple languages
- **Agent Framework SDK** is a general-purpose framework for agentic application development, where common agent abstractions enable creating and orchestrating heterogeneous agent systems (across local & cloud)

## What is the goal of this feature?

Goals:
- Developers can seamlessly combine the Foundry and Agent Framework SDKs and there is no friction when using both SDKs at the same time
- Developers can take advantage of the full capabilities supported by the Foundry SDK
- Developers can create multi-agent orchestrations using Foundry and other agent types

Success Metrics:
- Complexity of basic samples is comparable to other agent frameworks
- Developers can easily discover how to use Foundry Agents in Agent Framework multi-agent orchestrations

## What is the problem being solved?

- In Semantic Kernel the Foundry Agent support isn't integrated into the Foundry SDK, so there is a disjointed developer UX
- Customers are confused as to when they should use the Foundry SDK versus Semantic Kernel

## API Changes

The proposed solution is to add helper methods which allow developers to either retrieve or create an `AIAgent` using a `PersistentAgentsClient`.

- Retrieve an `AIAgent`

  ```csharp
  /// <summary>
  /// Retrieves an existing server side agent, wrapped as a <see cref="ChatClientAgent"/> using the provided <see cref="PersistentAgentsClient"/>.
  /// </summary>
  /// <param name="persistentAgentsClient">The <see cref="PersistentAgentsClient"/> to create the <see cref="ChatClientAgent"/> with.</param>
  /// <param name="agentId">The ID of the server side agent to create a <see cref="ChatClientAgent"/> for.</param>
  /// <param name="chatOptions">Options that should apply to all runs of the agent.</param>
  /// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
  /// <returns>A <see cref="ChatClientAgent"/> instance that can be used to perform operations on the persistent agent.</returns>
  public static async Task<ChatClientAgent> GetAIAgentAsync(
      this PersistentAgentsClient persistentAgentsClient,
      string agentId,
      ChatOptions? chatOptions = null,
      CancellationToken cancellationToken = default)
  ```

- Create an `AIAgent`

  ```csharp
  /// <summary>
  /// Creates a new server side agent using the provided <see cref="PersistentAgentsClient"/>.
  /// </summary>
  /// <param name="persistentAgentsClient">The <see cref="PersistentAgentsClient"/> to create the agent with.</param>
  /// <param name="model">The model to be used by the agent.</param>
  /// <param name="name">The name of the agent.</param>
  /// <param name="description">The description of the agent.</param>
  /// <param name="instructions">The instructions for the agent.</param>
  /// <param name="tools">The tools to be used by the agent.</param>
  /// <param name="toolResources">The resources for the tools.</param>
  /// <param name="temperature">The temperature setting for the agent.</param>
  /// <param name="topP">The top-p setting for the agent.</param>
  /// <param name="responseFormat">The response format for the agent.</param>
  /// <param name="metadata">The metadata for the agent.</param>
  /// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
  /// <returns>A <see cref="ChatClientAgent"/> instance that can be used to perform operations on the newly created agent.</returns>
  public static async Task<ChatClientAgent> CreateAIAgentAsync(
      this PersistentAgentsClient persistentAgentsClient,
      string model,
      string? name = null,
      string? description = null,
      string? instructions = null,
      IEnumerable<ToolDefinition>? tools = null,
      ToolResources? toolResources = null,
      float? temperature = null,
      float? topP = null,
      BinaryData? responseFormat = null,
      IReadOnlyDictionary<string, string>? metadata = null,
      CancellationToken cancellationToken = default)
  ```

- Additional overload using the M.E.AI types:

  ```csharp
  /// <summary>
  /// Creates a new server side agent using the provided <see cref="PersistentAgentsClient"/>.
  /// </summary>
  /// <param name="persistentAgentsClient">The <see cref="PersistentAgentsClient"/> to create the agent with.</param>
  /// <param name="model">The model to be used by the agent.</param>
  /// <param name="name">The name of the agent.</param>
  /// <param name="description">The description of the agent.</param>
  /// <param name="instructions">The instructions for the agent.</param>
  /// <param name="tools">The tools to be used by the agent.</param>
  /// <param name="temperature">The temperature setting for the agent.</param>
  /// <param name="topP">The top-p setting for the agent.</param>
  /// <param name="responseFormat">The response format for the agent.</param>
  /// <param name="metadata">The metadata for the agent.</param>
  /// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
  /// <returns>A <see cref="ChatClientAgent"/> instance that can be used to perform operations on the newly created agent.</returns>
  public static async Task<ChatClientAgent> CreateAIAgentAsync(
      this PersistentAgentsClient persistentAgentsClient,
      string model,
      string? name = null,
      string? description = null,
      string? instructions = null,
      IEnumerable<AITool>? tools = null,
      float? temperature = null,
      float? topP = null,
      BinaryData? responseFormat = null,
      IReadOnlyDictionary<string, string>? metadata = null,
      CancellationToken cancellationToken = default)
  ```

## E2E Code Samples

### 1. Create and retrieve with Foundry SDK, run with Agent Framework

- [Foundry SDK] Create a `PersistentAgentsClient`
- [Foundry SDK] Create a `PersistentAgent` using the `PersistentAgentsClient`
- [Foundry SDK] Retrieve an `AIAgent` using the `PersistentAgentsClient`
- [Agent Framework SDK] Invoke the `AIAgent` instance and access the response from the `AgentResponse`
- [Foundry SDK] Clean up the agent

```csharp
// Get a client to create server side agents with.
var persistentAgentsClient = new PersistentAgentsClient(
    TestConfiguration.AzureAI.Endpoint, new AzureCliCredential());

// Create a persistent agent.
var persistentAgentMetadata = await persistentAgentsClient.Administration.CreateAgentAsync(
    model: TestConfiguration.AzureAI.DeploymentName!,
    name: JokerName,
    instructions: JokerInstructions);

// Get the persistent agent we created in the previous step and expose it as an Agent Framework agent.
AIAgent agent = await persistentAgentsClient.GetAIAgentAsync(persistentAgentMetadata.Value.Id);

// Respond to user input.
var input = "Tell me a joke about a pirate.";
Console.WriteLine(input);
Console.WriteLine(await agent.RunAsync(input));

// Delete the persistent agent.
await persistentAgentsClient.Administration.DeleteAgentAsync(agent.Id);
```

### 2. Create directly with Foundry SDK, run with Agent Framework

- [Foundry SDK] Create a `PersistentAgentsClient`
- [Foundry SDK] Create an `AIAgent` using the `PersistentAgentsClient`
- [Agent Framework SDK] Invoke the `AIAgent` instance and access the response from the `AgentResponse`
- [Foundry SDK] Clean up the agent

```csharp
// Get a client to create server side agents with.
var persistentAgentsClient = new PersistentAgentsClient(
    TestConfiguration.AzureAI.Endpoint, new AzureCliCredential());

// Create a persistent agent and expose it as an Agent Framework agent.
AIAgent agent = await persistentAgentsClient.CreateAIAgentAsync(
    model: TestConfiguration.AzureAI.DeploymentName!,
    name: JokerName,
    instructions: JokerInstructions);

// Respond to user input.
var input = "Tell me a joke about a pirate.";
Console.WriteLine(input);
Console.WriteLine(await agent.RunAsync(input));

// Delete the persistent agent.
await persistentAgentsClient.Administration.DeleteAgentAsync(agent.Id);
```

### 3. Create directly with Foundry SDK, run with conversation state using Agent Framework

- [Foundry SDK] Create a `PersistentAgentsClient`
- [Foundry SDK] Create an `AIAgent` using the `PersistentAgentsClient`
- [Agent Framework SDK] Optionally create an `AgentThread` for the agent run
- [Agent Framework SDK] Invoke the `AIAgent` instance and access the response from the `AgentResponse`
- [Foundry SDK] Clean up the agent and the agent thread

```csharp
// Get a client to create server side agents with.
var persistentAgentsClient = new PersistentAgentsClient(
    TestConfiguration.AzureAI.Endpoint, new AzureCliCredential());

// Create an Agent Framework agent.
AIAgent agent = await persistentAgentsClient.CreateAIAgentAsync(
    model: TestConfiguration.AzureAI.DeploymentName!,
    name: JokerName,
    instructions: JokerInstructions);

// Start a new thread for the agent conversation.
AgentThread thread = agent.GetNewThread();

// Respond to user input.
await RunAgentAsync("Tell me a joke about a pirate.");
await RunAgentAsync("Now add some emojis to the joke.");

// Local function to run agent and display the conversation messages for the thread.
async Task RunAgentAsync(string input)
{
    Console.WriteLine(
        $"""
        User: {input}
        Assistant:
        {await agent.RunAsync(input, thread)}

        """);
}

// Cleanup
await persistentAgentsClient.Threads.DeleteThreadAsync(thread.ConversationId);
await persistentAgentsClient.Administration.DeleteAgentAsync(agent.Id);
```

### 4. Create directly with Foundry SDK, orchestrate with Agent Framework

- [Foundry SDK] Create a `PersistentAgentsClient`
- [Foundry SDK] Create multiple `AIAgent` instances using the `PersistentAgentsClient`
- [Agent Framework SDK] Create a `SequentialOrchestration` and add all of the agents to it
- [Agent Framework SDK] Invoke the `SequentialOrchestration` instance and access the response from the `AgentResponse`
- [Foundry SDK] Clean up the agents

```csharp
// Get a client to create server side agents with.
var persistentAgentsClient = new PersistentAgentsClient(
    TestConfiguration.AzureAI.Endpoint, new AzureCliCredential());
var model = TestConfiguration.OpenAI.ChatModelId;

// Define the agents
AIAgent analystAgent =
    await persistentAgentsClient.CreateAIAgentAsync(
        model,
        name: "Analyst",
        instructions:
            """
            You are a marketing analyst. Given a product description, identify:
            - Key features
            - Target audience
            - Unique selling points
            """,
        description: "An agent that extracts key concepts from a product description.");
AIAgent writerAgent =
    await persistentAgentsClient.CreateAIAgentAsync(
        model,
        name: "copywriter",
        instructions:
            """
            You are a marketing copywriter. Given a block of text describing features, audience, and USPs,
            compose a compelling marketing copy (like a newsletter section) that highlights these points.
            Output should be short (around 150 words), output just the copy as a single text block.
            """,
        description: "An agent that writes a marketing copy based on the extracted concepts.");
AIAgent editorAgent =
    await persistentAgentsClient.CreateAIAgentAsync(
        model,
        name: "editor",
        instructions:
            """
            You are an editor. Given the draft copy, correct grammar, improve clarity, ensure consistent tone,
            give format and make it polished. Output the final improved copy as a single text block.
            """,
        description: "An agent that formats and proofreads the marketing copy.");

// Define the orchestration
SequentialOrchestration orchestration =
    new(analystAgent, writerAgent, editorAgent)
    {
        LoggerFactory = this.LoggerFactory,
    };

// Run the orchestration
string input = "An eco-friendly stainless steel water bottle that keeps drinks cold for 24 hours";
Console.WriteLine($"\n# INPUT: {input}\n");
AgentResponse result = await orchestration.RunAsync(input);
Console.WriteLine($"\n# RESULT: {result}");

// Cleanup
await persistentAgentsClient.Administration.DeleteAgentAsync(analystAgent.Id);
await persistentAgentsClient.Administration.DeleteAgentAsync(writerAgent.Id);
await persistentAgentsClient.Administration.DeleteAgentAsync(editorAgent.Id);
```

75
docs/specs/spec-template.md
Normal file
@@ -0,0 +1,75 @@
---
# These are optional elements. Feel free to remove any of them.
status: {proposed | rejected | accepted | deprecated | … | superseded by [SPEC-0001](0001-spec.md)}
contact: {person proposing the ADR}
date: {YYYY-MM-DD when the decision was last updated}
deciders: {list everyone involved in the decision}
consulted: {list everyone whose opinions are sought (typically subject-matter experts); and with whom there is a two-way communication}
informed: {list everyone who is kept up-to-date on progress; and with whom there is a one-way communication}
---

# {short title of solved problem and solution}

## What is the goal of this feature?

Make sure to cover:
1. What is the value we are providing to users
1. Include one success metric
1. Implementation-free description of the outcome

Consult PM on this.

For example:

We want users to be able to refer to external Azure resources easily when consuming them in other features like indexes, agents,
and evaluations. We know we're successful when 40% of project client users are using connections.

## What is the problem being solved?

Make sure to cover:
1. Why is this hard today?
1. Customer pain points?
1. Reducing system complexity (maintenance costs, latency, etc)?

Consult PM on this.

For example:

Today, users have to understand control plane vs data plane endpoints and use multiple packages to stitch their application
code together. This makes using our product confusing and also increases the number of dependencies a customer will have
in their code.

## API Changes

List all new API changes

## E2E Code Samples

Include python or C# examples of how you expect this feature to be used with other things in our system.

For example:

This connection name is unique across the resource. Given a resource name, the system should be able to unambiguously resolve a
connection name. A connection name can be used to pass along connection details to individual features. Services will be able to parse this ID and use it to access the underlying resource. The below example shows how a connection can be used to create a dataset.

```python
client.datasets.create_dataset(
    name="evaluation_dataset",
    file="myblob/product1.pdf",
    connection="my-azure-blob-connection"
)
```

How to use a connection when creating an `AzureAISearchIndex`:

```python
from azure.ai.projects.models import AzureAISearchIndex

azure_ai_search_index = AzureAISearchIndex(
    name="azure-search-index",
    connection="my-ai-search-connection",
    index_name="my-index-in-azure-search",
)

created_index = client.indexes.create_index(azure_ai_search_index)
```