I would appreciate comments from anyone with good suggestions.

I am developing an AI agent with Semantic Kernel. Initially I was satisfied with function calling and reasoning tasks. Recently, however, I have been building features that handle multiple tasks from a single instruction, such as "information gathering → action → document creation → result reporting." In addition, as the number of plugins has grown, selecting the right function during function calling has become progressively harder. To maintain accuracy, I am therefore working on a multi-agent implementation and on nested reasoning within plugins (executing several reasoning tasks together inside a plugin). Accuracy has improved with this approach, but the following issue has emerged.
Here's one extreme example. Consider plugins in the following state: plugins that package multiple reasoning tasks in order to control a specific flow of thought. (In reality it is more complex, with PluginA containing 10–50 reasoning steps.)

```python
import asyncio
from typing import cast

from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, AzureChatPromptExecutionSettings
from semantic_kernel.contents import ChatHistory
from semantic_kernel.functions import kernel_function
from semantic_kernel.kernel import Kernel

kernel = Kernel()
kernel.add_service(service=AzureChatCompletion(service_id="default"))
service = cast(AzureChatCompletion, kernel.get_service(service_id="default"))


class PluginA:
    @kernel_function(description="dummy")
    async def run(self, args: str) -> str:
        chat_history = ChatHistory()
        chat_history.add_user_message(args)
        settings = service.get_prompt_execution_settings_class()(service_id=service.service_id)
        if isinstance(settings, AzureChatPromptExecutionSettings):
            # Exclude this plugin so the nested call cannot recurse into itself.
            settings.function_choice_behavior = FunctionChoiceBehavior.Auto(
                filters={"excluded_plugins": [PluginA.__name__]}
            )
        r = await service.get_chat_message_contents(chat_history, settings, kernel=kernel)
        # ... further nested reasoning steps
        return r[0].content


async def main():
    kernel.add_plugin(PluginA(), plugin_name=PluginA.__name__)
    # kernel.add_plugin(PluginB(), plugin_name="etc...")
    # ...
    settings = service.get_prompt_execution_settings_class()(service_id=service.service_id)
    chat_history = ChatHistory()
    chat_history.add_user_message("dummy")
    if isinstance(settings, AzureChatPromptExecutionSettings):
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto(auto_invoke=True)
    async for chunk in service.get_streaming_chat_message_contents(chat_history, settings, kernel=kernel):
        if chunk:
            print(chunk)


if __name__ == "__main__":
    asyncio.run(main())
```

The problem is that in such cases the reasoning progress inside the plugin cannot be returned as a stream. This increases user waiting time and degrades the UX. What approach would you take in this situation?
I would recommend storing the plugins in a vector search. Adding @westey-m.
Narrowing down the set of available functions for function calling using RAG might help to increase accuracy. See this sample on how to achieve that:
https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/Concepts/Optimization/PluginSelectionWithFilters.cs#L104
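A rough, framework-free Python sketch of that idea. The "embedding" below is a toy bag-of-words cosine similarity, just so the snippet runs without any model or vector store; in a real system you would use an actual embedding model and vector search, and all plugin/function names here are invented for illustration:

```python
# Toy sketch of RAG-style function selection: rank function descriptions by
# similarity to the user request and only expose the top-k to function calling.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def select_functions(request: str, descriptions: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the top_k functions most similar to the request."""
    q = embed(request)
    ranked = sorted(descriptions, key=lambda name: cosine(q, embed(descriptions[name])), reverse=True)
    return ranked[:top_k]


# Hypothetical plugin functions and descriptions, for illustration only.
functions = {
    "SearchPlugin-web_search": "search the web for up to date information",
    "DocPlugin-create_document": "create a word document report from gathered information",
    "MailPlugin-send_mail": "send an email to a recipient",
}
included = select_functions("search the web for recent news", functions, top_k=1)
print(included)  # the single most relevant function name
```

The selected names could then be fed back into the real settings, e.g. `FunctionChoiceBehavior.Auto(filters={"included_functions": included})`, so the model only ever sees a small, relevant subset of functions.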
This sounds like a complex system though, so multiple strategies may have to be considered to speed things up, e.g. using GPT-4o-mini instead of GPT-4o.
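For the perceived-latency side of the question (surfacing reasoning progress from inside a plugin), one possible pattern is to have the plugin push progress messages onto an `asyncio.Queue` that the outer loop drains concurrently. This is a minimal, Semantic-Kernel-free sketch under that assumption; the function names and the fixed three steps are invented stand-ins for the nested reasoning calls:

```python
# Sketch: a plugin-like coroutine reports intermediate steps via a queue while
# it works, so the caller can stream progress to the user instead of waiting
# silently for the final answer. In a real system each step would be an LLM
# call whose streamed chunks are forwarded into the queue.
import asyncio

_DONE = object()  # sentinel marking the end of the progress stream


async def plugin_run(args: str, progress: asyncio.Queue) -> str:
    # Stand-in for the nested reasoning steps inside a plugin like PluginA.run.
    for step in range(1, 4):
        await asyncio.sleep(0)  # simulate an awaited model call
        await progress.put(f"step {step}/3 done")
    await progress.put(_DONE)
    return f"final answer for: {args}"


async def run_with_progress(args: str) -> tuple[list[str], str]:
    progress: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(plugin_run(args, progress))
    seen: list[str] = []
    while True:
        item = await progress.get()
        if item is _DONE:
            break
        seen.append(item)  # in a UI, this is where you would stream to the user
    return seen, await task


steps, answer = asyncio.run(run_with_progress("dummy"))
print(steps)
print(answer)
```

The same queue (or a callback passed into the plugin) could carry streamed chunks from the nested chat-completion calls, so the user sees each reasoning stage as it happens rather than a single long pause.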