Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-flask] the operationId is not incorrectly sanitized #1263

Open
1 of 3 tasks
julien-lenormand-se opened this issue Mar 7, 2024 · 1 comment
Open
1 of 3 tasks

Comments

@julien-lenormand-se
Copy link

Hello,

Checklist from your CONTRIBUTING files :

  • I have not found a matching issue, neither in swagger-api/swagger-codegen nor swagger-codegen-generators
  • I am using the latest master
    • kinda, I'm using the Docker version of 2 weeks ago : swaggerapi/swagger-codegen-cli-v3:3.0.54
  • test locally with the most recent jar
    • I'll try to do it soon

Using the python-flask generator, I noticed that some functions generated for the controller, when containing an underscore followed by a digit, have a name different than the operationId.

Minimal reproducible example :

openapi.yml

openapi: 3.0.3
info:
  title: Minimal Reproducible Example
  description: "."
  version: 0.0.0
paths:
  /foo:
    get:
      summary: Get
      responses:
        "200":
          description: Status details
      operationId: foo_123

generate_server.bash

set -euo pipefail  # check for errors

echo '(re-)create output directory'
rm -Rf ./output && mkdir ./output
echo 'copy the OpenAPI file into it'
cp ./openapi.yml ./output
echo 'generate the server files from the OpenAPI file'
docker run \
    --rm \
    -u "$(id -u):$(id -g)" \
    -v "${PWD}/output:/local" \
    swaggerapi/swagger-codegen-cli-v3:3.0.54 \
    generate \
    -l python-flask \
    -i /local/openapi.yml \
    -o /local
echo 'check the name of the function defined in the controller'
grep 'foo' < ./output/swagger_server/controllers/default_controller.py

It produces :

check the name of the function defined in the controller
def foo123():  # noqa: E501

Here we can see that the function is named foo123. This is different to the operationId foo123.

I suspect the cause is this line : permalink

public String toOperationId(String operationId) {
    // ... snip
    return underscore(sanitizeName(operationId));
}

I don't have a Java development environment at hand, so I can't find where these 2 functions are defined (being imported by *) and so can't review them.

For reference, here is the rule for valid Python identifiers (cf Python doc on lexical analysis)) but I find it much inscrutable (with all the Unicode linguo), a dumber (but inexact) version would be (according to this StackOverflow answer) :

identifier ::=  (letter|"_") (letter | digit | "_")*

(then applies the rules of keywords and dunder methods, clearly expressed in the Python doc on lexical analysis linked above)

Expected result : the function generated should have the same name than the operationId

Actual result : the function generated have a different name than the operationId

@Lenormju
Copy link

The Swagger Codegen v3.0.57 (most recent release as of now) is using the Swagger Codegen Generators v1.0.50 (also the most recent as of now).
I reproduced the issue by adding a new automatic test in src/test/java/io/swagger/codegen/v3/service/GeneratorServiceTest.java:

    @Test
    public void testIssue1263_python_respect_operationId() throws IOException {
        String path = getTmpFolder().getAbsolutePath() + "/clientdefault";
        GenerationRequest request = new GenerationRequest();
        request
                .codegenVersion(GenerationRequest.CodegenVersion.V3)
                .type(GenerationRequest.Type.SERVER)
                .lang("python-flask")
                .spec(loadSpecAsNode("3_0_0/issue-1263/swagger.yml", true, false))
                .options(
                        new Options()
                                .outputDir(path)
                );
        List<File> files = new GeneratorService().generationRequest(request).generate();
        Assert.assertFalse(files.isEmpty());
        System.out.println("Generated client in:\n" + path);

        String default_controller_file_content = FileUtils.readFileToString(new File(path + "/swagger_server/controllers/default_controller.py"));
        System.out.println(default_controller_file_content);
        Assert.assertTrue(default_controller_file_content.contains("def foo_123"), "partial string is not contained in the given text");
    }

with a corresponding 3_0_0/issue-1263/swagger.yml file that contains exactly the OpenAPI from the issue.

Using the debugger, I found that the GeneratorService is calling the DefaultGenerator to generate, which calls its generateApis, wich calls its generatePaths, which calls its processPaths, which calls its processOperations :
DefaultGenerator.java#L965

    CodegenOperation codegenOperation = config.fromOperation(config.escapeQuotationMark(resourcePath), httpMethod, operation, schemas, openAPI);

Here, the config is a PythonFlaskConnexionCodegen coming from Swagger Codegen Generators (jar), but it is calling the non-overriden method DefaultCodegenConfig.fromOperation which does :
DefaultCodegenConfig.java#L2026

    operationId = this.removeNonNameElementToCamelCase(operationId);

DefaultCodegenConfig.java#L3298

    public String removeNonNameElementToCamelCase(String name) {
        return this.removeNonNameElementToCamelCase(name, "[-_:;#]");
    }

This makes sense for Java, but not for Python. We need to keep the underscores _ for the operationId.

I'd like to contribute a pull request for this problem. But for that I'd like to know what you think is the best way to solve the problem.
Here is what I have thought :

  1. override the removeNonNameElementToCamelCase in PythonFlaskConnexionCodegen so that it applies a Python-compatible name styling, but making the method's name incorrect to what it does (essentially snake_caseinstead of CamelCase)
  2. rename the method to something more coding-style-neutral, for example something like:
    -removeNonNameElementToCamelCase
    +computeValidOperationName
    so that when it is overriden (like suggested in ①) the name stays coherent
  3. the only other place where removeNonNameElementToCamelCase is used in the DefaultCodegenConfig.java file is for its method toParamName:
    DefaultCodegenConfig.java#L3298
        public String toParamName(String name) {
            name = removeNonNameElementToCamelCase(name); // FIXME: a parameter should not be assigned. Also declare the methods parameters as 'final'.
            if (reservedWords.contains(name)) {
                return escapeReservedWord(name);
            }
            return name;
        }
    The FIXME is not relevant to the current issue, not the if after.
  4. the toParamName method in ③ causes problems downstream in the (non-default) language generators :
    • PythonFlaskConnexionCodegen.java#L431
          @Override
          public String toParamName(String name) {
              // don't do name =removeNonNameElementToCamelCase(name); // this breaks connexion, which does not modify param names before sending them
              if (reservedWords.contains(name)) {
                  name = escapeReservedWord(name);
              }
              // Param name is already sanitized in swagger spec processing
              return toVarName(name);
          }
      This seems to already cause problems with the PythonFlaskConnexionCodegen.
    • AbstractKotlinCodegen.java#L550 does a lot of other checks AFTER removeNonNameElementToCamelCase to ensure the name is actually correct in Groovy.
  5. Considering that the method is not used at many places (③), I think it could be possible to replace it completely by another one (②), which would be expected to be adapted in non-default generators to match their respective language style/rules (④).
  6. So here is a rough recap of what I am proposing as solution :
    • rename DefaultCodegenConfig.removeNonNameElementToCamelCase to something else, possibly computeValidOperationName, but keep its current behavior (to prevent having unexpected impacts on other code generators !)
    • rename removeNonNameElementToCamelCase in the classes PythonFlaskConnexionCodegen and AbstractKotlinCodegen, still without altering their behavior
    • override computeValidOperationName in PythonFlaskConnexionCodegen so that it does not remove the underscore in the name (the original issue)
    • add tests in swagger-codegen/src/test/java/io/swagger/codegen/v3/service/GeneratorServiceTest.java to ensure that this works as expected

What do you think ? @micryc @frantuma

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants