Memory leak and channel closure issues when reusing/dropping Model #865

Open
solaoi opened this issue Oct 19, 2024 · 0 comments
Labels
bug Something isn't working


Describe the bug

When initializing and dropping the Model repeatedly:

  1. Memory usage continuously increases because dropped GGUF models aren't cleaned up
  2. The channel is erroneously closed after the first iteration

Steps to Reproduce

  1. Create a service that initializes and drops the model multiple times
  2. Run the following code:
use anyhow::Result;
use mistralrs::{GgufModelBuilder, PagedAttentionMetaBuilder, TextMessageRole, TextMessages};
use std::time::Duration;
use tokio::time::sleep;

struct ChatService {
    model: Option<mistralrs::Model>,
}

impl ChatService {
    async fn new() -> Result<Self> {
        Ok(Self { model: None })
    }

    async fn initialize_model(&mut self) -> Result<()> {
        self.model = Some(
            GgufModelBuilder::new(
                "gguf_models/mistral_v0.1/",
                vec!["mistral-7b-instruct-v0.1.Q4_K_M.gguf"],
            )
            .with_chat_template("chat_templates/mistral.json")
            .with_paged_attn(|| PagedAttentionMetaBuilder::default().build())?
            .build()
            .await?,
        );
        Ok(())
    }

    async fn chat(&self, prompt: &str) -> Result<String> {
        let messages = TextMessages::new().add_message(TextMessageRole::User, prompt);

        let response = self
            .model
            .as_ref()
            .unwrap()
            .send_chat_request(messages)
            .await?;

        Ok(response.choices[0]
            .message
            .content
            .clone()
            .unwrap_or_default())
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    for i in 0..3 {
        println!("Iteration {}", i);

        let mut service = ChatService::new().await?;
        service.initialize_model().await?;

        let response = service.chat("Write a short greeting").await?;
        println!("Response: {}", response);

        // Model is dropped here, but GGUF remains in memory
        drop(service);

        // Wait to make memory usage observable
        sleep(Duration::from_secs(5)).await;
    }

    Ok(())
}

The Cargo.toml for the repro:

[package]
name = "memory_bug_mistral"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
tokio = { version = "1", features = ["full"] }
anyhow = "1.0"
mistralrs = { git = "https://github.com/EricLBuehler/mistral.rs.git", branch = "master", features = [
    "metal",
] }
regex = "1.10.6"
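
To quantify the growth instead of eyeballing Activity Monitor, a per-iteration resident-memory printout can be added to the loop. This is a minimal sketch, assuming the third-party memory-stats crate (add memory-stats = "1" under [dependencies]); the helper name log_rss is hypothetical:

use memory_stats::memory_stats;

// Hypothetical helper: print resident memory so per-iteration growth
// shows up in the program's own output.
fn log_rss(label: &str) {
    if let Some(usage) = memory_stats() {
        println!("{label}: physical = {} MiB", usage.physical_mem / (1024 * 1024));
    }
}

Calling log_rss(&format!("after iteration {i}")) right after the sleep in the loop above makes the increase visible without external tooling.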

Observed Behavior

  1. Memory usage increases with each iteration, even after the explicit drop
  2. After the first iteration, the next request fails with:

Error: Channel was erroneously closed!
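
For context on the second symptom, the sketch below is a generic illustration of this failure mode in tokio, not mistral.rs's actual internals: once the worker-side receiver of an mpsc channel is dropped, every later send fails, which callers commonly surface as a "channel closed" error.

use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel::<&str>(8);

    // Simulate a background worker shutting down and dropping its receiver.
    drop(rx);

    // Every subsequent send now fails with a SendError.
    assert!(tx.send("request").await.is_err());
}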

Expected Behavior

  1. Memory should be properly freed when the model is dropped
  2. The channel should remain functional for subsequent iterations
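
For comparison, a variant of main that builds the model once and reuses it avoids the per-iteration init/drop cycle entirely. This is only a sketch of the intended long-lived usage (it sidesteps the bug rather than exercising it) and reuses the ChatService from the repro above:

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Build the model once and reuse it for every request.
    let mut service = ChatService::new().await?;
    service.initialize_model().await?;

    for i in 0..3 {
        let response = service.chat("Write a short greeting").await?;
        println!("Iteration {i}: {response}");
    }
    Ok(())
}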

Latest commit or version
