Speed Optimizations #586
Comments
This is what I'm focusing on in accord-watcher and accord-parallel, btw. Oh, and jade does return a dep tree (except for when you compile into a function). Oh, and not exactly all of our operations are async... inside the Jade compiler there are actually some synchronous file reads. I started looking at that in yade, but it's gonna take some time to fix.
So, an update now that sourcemap support has been added. It's a requirement for source maps to output a dependency tree, which means that we do have trees for coffee and stylus. If you are saying that jade also gives a dep tree, this is pretty good stuff, and seems like we might be a little closer to the dream!
Compiling Jade / Pug returns a dep tree (same as sass, which we use; I don't know about Stylus). We use it on a massive project and we just store the entire thing in memory. This of course needs to be updated as files change, and it requires a full compile at startup to build the entire tree. You could obviously store this to disk as well and take the hit that it possibly does not match 100% (maybe due to a branch switch or something). This brought our dependency graphing from 10s ~ 30s to almost instant for a project containing thousands of files. The issue is pretty old, is it still being looked into?
@kevin-smets Yes, it is. The issue is that roots supports many more languages than just jade. But it's in our plans to slim down the number and force jade by default in order to take advantage of this. The next major step for us is to force jade, postcss, and babel as the stack, and fully support dep trees. I'd love to see what you guys have done with your project! That's a really nice optimization.
At the moment, roots is not the best fit for absolutely massive static sites (like hundreds/thousands of files). This is because of a couple of technical limitations we have run up against, which I'll detail here for anyone interested.
The Dependency Graph
Right now, roots recompiles every file every time you make a change to the site. You might say "that is stupid, why don't you just recompile the file that changed?" Well, the answer is because any file that changed might have dependencies.
Let's say, for example, you change a jade partial, which is included on the home and about pages of your site. Recompiling the jade partial itself would do nothing; we actually need to recompile the home and about pages. In order to do this, we would need to be aware of the fact that the home and about pages include the partial. This means that we would need a dependency tree of the entire project (which files depend on which other files), and when any file changes, roots would need to recompile anything that depends on it.
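To make the partial example concrete, here is a minimal sketch of a reverse dependency lookup. It is not roots' code: the file names, the `includes` map, and the `invert` and `affectedBy` helpers are all made up for illustration, and it assumes the forward dependencies are already known.

```javascript
// Forward dependencies: file -> files it includes (hypothetical project data).
const includes = {
  'views/index.jade': ['views/_header.jade'],
  'views/about.jade': ['views/_header.jade'],
  'views/_header.jade': ['views/_logo.jade'],
  'views/contact.jade': [],
};

// Invert the map: file -> set of files that include it directly.
function invert(graph) {
  const dependents = new Map();
  for (const [file, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      if (!dependents.has(dep)) dependents.set(dep, new Set());
      dependents.get(dep).add(file);
    }
  }
  return dependents;
}

// Breadth-first walk up the inverted graph, collecting every file that
// directly or indirectly includes the changed one.
function affectedBy(changed, dependents) {
  const seen = new Set([changed]);
  const queue = [changed];
  while (queue.length > 0) {
    const file = queue.shift();
    for (const parent of dependents.get(file) || []) {
      if (!seen.has(parent)) {
        seen.add(parent);
        queue.push(parent);
      }
    }
  }
  seen.delete(changed); // the changed file itself is not its own dependent
  return [...seen].sort();
}

const dependents = invert(includes);
console.log(affectedBy('views/_header.jade', dependents));
```

With this data, changing the header partial selects only the home and about pages, and the contact page is left alone. The hard part described in this issue is producing the `includes` map in the first place, which this sketch simply assumes.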
This applies for every language we support. Jade, stylus, ejs, handlebars, etc. All of these languages have ways of doing includes and partials.
So the hard question here is "how do we get dependency graphs out of each of these languages and normalize them into a single project-wide dependency graph?" The answer, at the moment, is one of two things. Either a very clever method needs to be invented that will allow us to determine the dependencies of any given file, whatever language it is written in, and we need to run that efficiently every time changes are made to the project (your change could have added another dependency). Or each language needs to return a dependency tree to us, which currently not a single one does.
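As a rough illustration of the "normalize" half of that question, the sketch below merges per-file reports into one graph keyed by project-relative paths. The shape of `reports` is an assumption made up for this example (no compiler is claimed to return exactly this), and `mergeReports` is a hypothetical helper.

```javascript
const path = require('path');

// Hypothetical per-language results: each compiler reports the files it
// pulled in, written relative to the file that was compiled.
const reports = [
  { file: 'views/index.jade', deps: ['_header.jade'] },
  { file: 'css/main.styl', deps: ['./mixins.styl', '../shared/colors.styl'] },
];

// Resolve every dependency against the directory of its parent file so that
// all languages end up sharing one project-relative key space.
function mergeReports(reports) {
  const graph = {};
  for (const { file, deps } of reports) {
    const dir = path.posix.dirname(file);
    graph[file] = deps.map((dep) => path.posix.join(dir, dep));
  }
  return graph;
}

console.log(mergeReports(reports));
```

Once every language reports into a shared shape like this, a single project-wide lookup can answer "what needs to recompile?" regardless of which compiler produced each edge.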
Further discussion on this issue can be found here.
Threading & Shared Context
Currently, roots compiles all files synchronously. That is, it compiles them one after another after another, on a single thread. This means that no matter how many cores your computer has, it will run all the compiles on a single one, which makes large projects with a lot of files very slow.
Why is this the case, you might ask? Isn't node asynchronous by nature? Well, yes, it is, but only in places where it has been designed to hand work off somewhere other than the main thread. For example, the `fs.readFile` method in node will asynchronously read and return the contents of a file. Internally it does this by handing the read to a worker thread in node's internal (libuv) thread pool, then passing the results back to the main thread when done and hitting a callback.
Compiles are CPU-heavy operations that do not run on another process for any compiler that is currently supported. So even though every part of roots is so thoroughly async that every file is read and starts compiling within less than a second of you running roots watch/compile, once they begin compiling, the compiles lock up the main thread and force everything to run synchronously until the last file is done. Once the main thread is clear of the heavy compile tasks, it writes all the results almost instantly through node's async file write I/O. Even for large projects, everything else in the process runs incredibly quickly; it is the compilers running in the middle that bog it down and lock everything up.
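The blocking behavior is easy to reproduce in isolation. In the sketch below, `fakeCompile` is a made-up stand-in for a CPU-heavy compiler call (it just spins), and the zero-delay timer stands in for any pending async callback.

```javascript
// Record the order in which things happen.
const events = [];

// Schedule a callback to run as soon as possible. It stands in for any
// async callback, such as a finished file read.
setTimeout(() => events.push('timer fired'), 0);

// Hypothetical stand-in for a CPU-heavy compile: spin for `ms` milliseconds.
function fakeCompile(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait: nothing else can run on the main thread meanwhile
  }
}

events.push('compile start');
fakeCompile(50);
events.push('compile end');

// Print the recorded order once the event loop has had a chance to run.
setTimeout(() => console.log(events.join(' -> ')), 10);
```

The timer was due almost immediately, but it can only fire after the synchronous work has finished, so 'timer fired' is always recorded last. Scale the 50ms up to a few hundred template compiles and you get the stall described above.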
The solution to this is to run compiles across multiple threads, or as they are called in node, processes. The limitation here is that when passing information between processes, you can only pass serialized data, which effectively means strings. That means that no user-provided function can reliably be passed into a process, because when you stringify a function it loses its context (the variables it closes over). So this makes things like passing locals that contain functions into a compiler impossible, which is a huge limitation.
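Here is a small sketch of what goes wrong, using a JSON round trip as a stand-in for the message channel between processes (node's child process messaging uses JSON-style serialization by default). The `makeLocals` function and its contents are made up for the example.

```javascript
// Locals as a user might define them: plain data plus a helper function
// that closes over a variable (`prefix`).
function makeLocals() {
  const prefix = '/blog';
  return {
    title: 'Home',
    url: (slug) => `${prefix}/${slug}`,
  };
}

const locals = makeLocals();

// Simulate sending the locals to another process and reading them back.
const received = JSON.parse(JSON.stringify(locals));
console.log(Object.keys(received)); // the `url` helper did not survive

// Stringifying the function by hand keeps its source text but not its closure.
const source = locals.url.toString();
const revived = new Function(`return (${source})`)();

try {
  revived('first-post');
} catch (err) {
  console.log(err.name); // `prefix` no longer exists where the function runs
}
```

The data survives, the function silently disappears, and rebuilding it from its source only works for functions that reference nothing outside themselves, which is rarely true of real helpers.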
A workaround for this could be to require `app.coffee`, and any other places where functions can be injected into the compiler, from within each process rather than passing them in. This slows everything down a lot at first, as a bunch of large requires need to be made for each process, but after the initial slowdown, the following compiles are quicker. The issue here is with extensions, though. Some extensions, such as dynamic content, need to be able to pass information between views. In the case of dynamic content, it runs and pulls front matter from each file, making this available to any other view. So now we need communication in and out of each process, plus central synchronization between all processes, plus a guarantee that the order of operations within each process is still correct.
On top of that, it's possible that one process might have a bunch of quick small files to compile and another one might have a bunch of slow large ones, just by the random way the compile tasks are sorted into processes. Because of this, it's important to "rebalance" the compile queue for each process to make sure they are even and running as quickly as possible. Add to this that some extensions require that one thing be done before another, and that shared data needs to be present and synced across all processes, and this becomes an incredibly complex and difficult issue.
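For the rebalancing part alone, one simple option is a greedy "largest first" split. This is only a sketch of that one technique, not what the branch mentioned in this issue does: the `tasks` list, the cost numbers, and the `balance` helper are hypothetical, and it ignores the ordering and shared-context constraints entirely.

```javascript
// Hypothetical compile tasks with an estimated cost (for example, file size in KB).
const tasks = [
  { file: 'a.jade', cost: 90 },
  { file: 'b.jade', cost: 40 },
  { file: 'c.styl', cost: 30 },
  { file: 'd.styl', cost: 20 },
  { file: 'e.coffee', cost: 10 },
  { file: 'f.coffee', cost: 10 },
];

// Greedy assignment: take tasks from most to least expensive and always give
// the next one to the worker with the smallest total so far.
function balance(tasks, workerCount) {
  const workers = Array.from({ length: workerCount }, () => ({ total: 0, files: [] }));
  const sorted = [...tasks].sort((a, b) => b.cost - a.cost);
  for (const task of sorted) {
    const lightest = workers.reduce((min, w) => (w.total < min.total ? w : min));
    lightest.files.push(task.file);
    lightest.total += task.cost;
  }
  return workers;
}

const plan = balance(tasks, 2);
console.log(plan.map((w) => `${w.total}: ${w.files.join(', ')}`));
```

With these numbers both workers end up with a total cost of 100, even though one gets two files and the other gets four. A real implementation would also need a reasonable cost estimate per file, which is its own problem.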
I personally spent more than a month trying to implement this. I got to a place where compiles were working without any extensions. There was a very long startup time to get all the processes created and loaded in with roots' logic (like 3+ seconds), but subsequent compiles were very quick. Once I hit the ordering and shared extension context issue, I burned out completely, and had to move on to something else to avoid going crazy. You can find my work on this branch, if you are interested. You can also find discussions around this issue here, here, and here.
Conclusion
If either one of these issues could be solved, roots' speed would increase by orders of magnitude immediately. If both of them could be solved, roots would be the fastest and most efficient static generator that exists, by a huge amount. I have personally poured months of work into both of these problems, and unfortunately came out frustrated with nothing to show for either one. Needless to say, these are very difficult problems.
If anyone is interested in trying to tackle one or both of these problems, that would be amazing. Leave a comment here or reach out to me personally.