.Net Add Audio content capabilities to Gemini #10364

davidpene · 2025-02-01T12:05:18Z

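### Motivation and Context

**Why is this change required?**
Adds to the multi-modal capabilities of Gemini.

This change exposes the [audio capabilities](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding) of Gemini so that it can perform tasks that involve understanding the contents of the included audio.

*No open issues are linked to this change.*

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

### Demo

<img width="1475" alt="Screenshot 2025-02-02 at 12 52 22 AM" src="https://github.com/user-attachments/assets/96872595-47ac-4c9f-9457-128714790b3e" />

### Notes

1. Audio capabilities [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding)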

davidpene · 2025-02-01T12:05:48Z

@microsoft-github-policy-service agree company="Carepatron"

davidpene · 2025-02-01T12:12:06Z

@RogerBarreto / @markwallace-microsoft / @dmytrostruk 🙏

.Net Add Audio content capabilities to Gemini

4cbdb43

davidpene requested a review from a team as a code owner February 1, 2025 12:05

markwallace-microsoft added the .NET (Issue or Pull requests regarding .NET code) and kernel (Issues or pull requests impacting the core kernel) labels Feb 1, 2025

add unit test

84a0d10

re-word skips

54f926a

RogerBarreto approved these changes Feb 5, 2025

davidpene temporarily deployed to integration with GitHub Actions, February 5, 2025 13:10 (Inactive)

markwallace-microsoft approved these changes Feb 5, 2025

markwallace-microsoft added this pull request to the merge queue Feb 5, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 5, 2025

markwallace-microsoft added this pull request to the merge queue Feb 6, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 6, 2025

markwallace-microsoft added this pull request to the merge queue Feb 6, 2025

Merged via the queue into microsoft:main with commit f618205 Feb 6, 2025
17 checks passed
