Roadmap for v1? #13

Open
MathMesquita opened this issue Feb 3, 2019 · 2 comments

Comments

MathMesquita commented Feb 3, 2019

First of all, congratulations @timqian, the project is amazing, and that is why I'm here contributing.

I created this issue as a place to discuss a roadmap for version 1 of this project. The goal is to build a list of things we should do before we launch v1 (such as starting to version releases).

As we discuss, I'll update this comment with the list.

Since I brought up this topic, I'll start with a list of things I think we should do.

Roadmap for v1 🎊

  • Start versioning
  • Add Tests (to discuss)
  • Define a code quality tool (to discuss)
  • Define a better contract (to discuss)
  • Split cloudquery-core(backend) from cloudquery-ui(frontend) (to discuss)

MathMesquita commented Feb 3, 2019

About adding tests

Adding tests to software is important for keeping it predictable. They will help us accept new PRs with enough confidence that they do not break any working feature.

About defining a code quality tool

To keep our codebase easier to understand and safer from bugs, we should adopt coding conventions and code quality tools. If we accomplish this, people will feel more comfortable contributing and we will be more comfortable reviewing their contributions. Here is a post that says more about it: why to use eslint.

About defining a better contract

Our current contract

{
  requestId: "string", // self explanatory
  url: "string", // the requested url
  selectors: "string", // the selectors used to scrape data, separated by commas
  contents: [{ // list of scraped data
    href: "string", // href attribute from element
    innerText: "string", // innerText attribute from element
    imgSrc: "string" // src attribute from element
  }]
}

What I have in mind for a better contract

{
  requestedUrl: "string", // self-explanatory name, avoids mental mapping
  selectors: [{ // since we allow multiple selectors, return a list instead of a comma-separated string
    id: "string", // a hash key based on the selector's string, which prevents duplicated selectors
    alias: "string", // optional (may be undefined): a new feature where consumers could define an alias for a selector, making it easier to find in the elements list
    selector: "string" // self explanatory
  }],
  elements: [{ // list of scraped elements
    id: "string", // the id, or the alias if present, of its selector
    type: "string", // an enum of element types such as "IMAGE", "TEXT", "LINK"; each type returns different attributes (an "IMAGE" has no innerText but has a title)
    attributes: [{
      // the raw attributes for each element type; "IMAGE" elements will have src, width, height, for example
    }]
  }]
}
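
To make the selectors part more concrete, here is a minimal sketch of how the current comma-separated string could be turned into the proposed list. Everything in it is an assumption for discussion: the function names `selectorId` and `buildSelectors`, the choice of SHA-1 for the hash key, and the `aliases` argument (a plain selector-to-alias map) are hypothetical and not part of the existing code.

```javascript
const crypto = require('crypto');

// Hypothetical: derive a stable id from the selector text, so the same
// selector always maps to the same id.
function selectorId(selector) {
  return crypto.createHash('sha1').update(selector).digest('hex');
}

// Hypothetical: turn the current comma-separated string into the proposed
// list, keeping only the first occurrence of each selector.
// `aliases` is an optional map of selector -> alias.
function buildSelectors(selectorString, aliases = {}) {
  const byId = new Map();
  for (const raw of selectorString.split(',')) {
    const selector = raw.trim();
    if (selector === '') continue;
    const id = selectorId(selector);
    if (!byId.has(id)) {
      byId.set(id, { id, alias: aliases[selector], selector });
    }
  }
  return Array.from(byId.values());
}

const selectors = buildSelectors('h1, img.logo, h1', { 'img.logo': 'logo' });
console.log(selectors.length); // 2
console.log(selectors[1].alias); // logo
```

Selectors without an alias simply get `alias: undefined`, which matches the "not mandatory" idea above.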

I removed the requestId from the original contract because I don't see any reason to return it to the user. Tell me if I'm mistaken.

About splitting core from ui

If we have separate projects for them, they can evolve independently (features and bugs). Detaching them will also let people with different skills (backend and frontend) help in their own areas without worrying about breaking the wrong thing.

Thank you

Everything I mentioned here is what I think makes a good roadmap. I'm open to discussion and would love to hear what you think about all of it.


timqian commented Feb 5, 2019

@MathMesquita Thanks for your suggestions and happy Chinese New Year!

  1. About eslint: it is definitely necessary and I will bring it in soon. After that, the deployment process will need to change, as we don't want to bundle the dev-dependency modules and push them to lambda.
  2. About defining a better contract: the protocol you suggest might be too complicated for current usage. I think it is fine to keep the current one until we have more complicated features to implement.
  3. About splitting core from ui: I am a fan of the monorepo approach. Storing related code in one repo makes it easier to understand, and the frontend code is very simple for now.

By the way, I am developing another simple serverless tool and have found some ways to improve the dev experience, including using docker-compose to spin up the whole dev stack, updating the deployment process, and introducing eslint for the server. After I finish that, I will do a simple refactoring of this tool.
