Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ai-proxy plugin #11499

Merged
merged 87 commits into from
Sep 17, 2024
Merged

Conversation

shreemaan-abhishek
Copy link
Contributor

@shreemaan-abhishek shreemaan-abhishek commented Aug 15, 2024

Description

Implement the AI-Proxy Plugin.

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@shreemaan-abhishek shreemaan-abhishek marked this pull request as ready for review August 19, 2024 05:11
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request plugin labels Aug 19, 2024
@bzp2010 bzp2010 self-requested a review August 21, 2024 09:48
apisix/plugins/ai-proxy.lua Outdated Show resolved Hide resolved
apisix/plugins/ai-proxy.lua Show resolved Hide resolved
apisix/plugins/ai-proxy/drivers/openai.lua Outdated Show resolved Hide resolved
apisix/plugins/ai-proxy.lua Outdated Show resolved Hide resolved
@kayx23 kayx23 self-requested a review September 9, 2024 15:25
@bzp2010 bzp2010 self-requested a review September 10, 2024 01:13
Copy link
Contributor

@bzp2010 bzp2010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no more comments now ✨

apisix/init.lua Outdated
@@ -726,6 +726,7 @@ function _M.http_access_phase()
end



Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are no substantive changes, please do not change the code style.

if header_auth == "Bearer token" or query_auth == "apikey" then
ngx.req.read_body()
local body, err = ngx.req.get_body_data()
local esc = body:gsub('"\\\""', '\"')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still necessary? Just a confirmation that if it is then no changes are needed.

}'
```

Since `passthrough` is not enabled upstream node can be any arbitrary value because it won't be contacted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. design question: why do we need the parameter passthrough? is it meant to be a toggle?
  2. is this sentence needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

design question: why do we need the parameter passthrough? is it meant to be a toggle?

yes, kinda. when it's enabled, the response from LLM will be set as the request body to the upstream.

is this sentence needed here?

I think it's important to let users know, instead to let them be puzzled about it and then figure out later.

docs/en/latest/plugins/ai-proxy.md Outdated Show resolved Hide resolved
docs/en/latest/plugins/ai-proxy.md Outdated Show resolved Hide resolved
docs/en/latest/plugins/ai-proxy.md Outdated Show resolved Hide resolved
@@ -96,7 +96,8 @@
"plugins/fault-injection",
"plugins/mocking",
"plugins/degraphql",
"plugins/body-transformer"
"plugins/body-transformer",
"plugins/ai-proxy"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The recently added ai-* plugins should probably all go into a new category. Can be amended in a different PR.

@@ -334,6 +335,21 @@ function _M.get_body(max_size, ctx)
end


function _M.get_request_body_table()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decode body according content-type or rename the method, e.g. get_json_request_body_table?

return 400, "unsupported content-type: " .. ct
end

local request_table, err = core.request.get_request_body_table()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new naming style?

request_table, count_num, flag_boolean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although adding a suffix makes it more readable, it is not the style used in the project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I got it now 😂 . request_table makes it more readable and it can be understood that it's not a string.

return
end

if core.table.try_read_attr(conf, "model", "options", "stream") then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if core.table.try_read_attr(conf, "model", "options", "stream") then
if request_table.stream then

local res, err, httpc = ai_driver.request(conf, request_table, ctx)
if not res then
core.log.error("failed to send request to AI service: ", err)
return 500
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use the predefined Ngx constants?

ngx.HTTP_NOT_ACCEPTABLE = 406
ngx.HTTP_REQUEST_TIMEOUT = 408
ngx.HTTP_CONFLICT = 409
ngx.HTTP_GONE = 410
ngx.HTTP_UPGRADE_REQUIRED = 426
ngx.HTTP_TOO_MANY_REQUESTS = 429
ngx.HTTP_CLOSE = 444
ngx.HTTP_ILLEGAL = 451
ngx.HTTP_INTERNAL_SERVER_ERROR = 500
ngx.HTTP_METHOD_NOT_IMPLEMENTED = 501
ngx.HTTP_BAD_GATEWAY = 502
ngx.HTTP_SERVICE_UNAVAILABLE = 503
ngx.HTTP_GATEWAY_TIMEOUT = 504
ngx.HTTP_VERSION_NOT_SUPPORTED = 505
ngx.HTTP_INSUFFICIENT_STORAGE = 507

local body_reader = res.body_reader
if not body_reader then
core.log.error("LLM sent no response body")
return 500
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

end

local query_params = ""
if conf.auth.query and type(conf.auth.query) == "table" then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if conf.auth.query and type(conf.auth.query) == "table" then
if type(conf.auth.query) == "table" then

end
end

local path = (parsed_url.path or "/v1/chat/completions") .. query_params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
local path = (parsed_url.path or "/v1/chat/completions") .. query_params
local path = (parsed_url.path or DEFAULT_URI) .. query_params


local query_params = ""
if conf.auth.query and type(conf.auth.query) == "table" then
query_params = core.string.encode_args(conf.auth.query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, from lua-resty-http, params.query already support table type

local function _format_request(self, params)
    local version = params.version
    local headers = params.headers or {}

    local query = params.query or ""
    if type(query) == "table" then
        query = "?" .. ngx_encode_args(query)
    elseif query ~= "" and str_sub(query, 1, 1) ~= "?" then
        query = "?" .. query
    end

    -- Initialize request
    local req = {
        str_upper(params.method),
        " ",
        self.path_prefix or "",
        params.path,
        query,
        HTTP[version],
        -- Pre-allocate slots for minimum headers and carriage return.
        "",
        "",
        "",
    }
    local c = 7 -- req table index it's faster to do this inline vs table.insert

    -- Append headers
    for key, values in pairs(headers) do
        key = tostring(key)

        if type(values) == "table" then
            for _, value in pairs(values) do
                req[c] = key .. ": " .. tostring(value) .. "\r\n"
                c = c + 1
            end

        else
            req[c] = key .. ": " .. tostring(values) .. "\r\n"
            c = c + 1
        end
    end

    -- Close headers
    req[c] = "\r\n"

    return tbl_concat(req)
end

fmt.Fprintf(w, "data: %s\n\n", time.Now().Format(time.RFC3339))

// Flush the data immediately to the client
if f, ok := w.(http.Flusher); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do type assertion outside the loop? :D


local res, err = httpc:request(params)
if not res then
return 500, "failed to send request to LLM server: " .. err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return 500, "failed to send request to LLM server: " .. err
return internal_server_error, "failed to send request to LLM server: " .. err

fmt.Fprintf(w, "[ERROR]")
return
}
// A simple loop that sends a message every 2 seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// A simple loop that sends a message every 2 seconds
// A simple loop that sends a message every 500ms

Parameterize it?

@shreemaan-abhishek shreemaan-abhishek merged commit d46737f into apache:master Sep 17, 2024
33 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request plugin size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants