Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.Net Add Audio content capabilities to Gemini #10364

Merged
merged 3 commits into from
Feb 6, 2025

Conversation

davidpene
Copy link
Contributor

Motivation and Context

Why is this change required?
Adds to the multi-modal capabilities of Gemini.

This change exposes the audio capabilities of Gemini to perform tasks that involve understanding the contents of the included audio.

No open issues are linked to this change.


Contribution Checklist


Demo

Screenshot 2025-02-02 at 12 52 22 AM

Notes

  1. Audio capabilities documentation

@davidpene davidpene requested a review from a team as a code owner February 1, 2025 12:05
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel labels Feb 1, 2025
@davidpene
Copy link
Contributor Author

@microsoft-github-policy-service agree company="Carepatron"

@davidpene
Copy link
Contributor Author

@markwallace-microsoft markwallace-microsoft added this pull request to the merge queue Feb 5, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 5, 2025
@markwallace-microsoft markwallace-microsoft added this pull request to the merge queue Feb 6, 2025
github-merge-queue bot pushed a commit that referenced this pull request Feb 6, 2025
### Motivation and Context

**Why is this change required?**  
Adds to the multi-modal capabilities of Gemini.

This change exposes the [audio
capabilities](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding)
of Gemini to perform tasks that involve understanding the contents of
the included audio.

*No open issues are linked to this change.*

---

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

---

### Demo

<img width="1475" alt="Screenshot 2025-02-02 at 12 52 22 AM"
src="https://github.com/user-attachments/assets/96872595-47ac-4c9f-9457-128714790b3e"
/>


---

### Notes

1. Audio capabilities
[documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding)
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 6, 2025
@markwallace-microsoft markwallace-microsoft added this pull request to the merge queue Feb 6, 2025
Merged via the queue into microsoft:main with commit f618205 Feb 6, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants