---
layout: single
title: "Luau Recap: May 2020"
---
Luau (lowercase u, “l-wow”) is an umbrella initiative to improve our language stack: the syntax, compiler, virtual machine, builtin Lua libraries, type checker, linter (known as Script Analysis in Studio), and more related components. We continuously develop the language and runtime to improve performance, robustness and quality of life. Here we will talk about everything that has happened since the update in March!
[Originally posted on the [Roblox Developer Forum](https://devforum.roblox.com/t/luau-recap-may-2020/).]
## New function type annotation syntax
As noted in the previous update, the function type annotation syntax now uses `:` on function definitions and `->` on standalone function types:
```
type FooFunction = (number, number) -> number
function foo(a: number, b: number): number
    return a + b
end
```
This was done to make our syntax more consistent with other modern languages, and the new syntax is easier to read in type context than our old `=>`.
This change is now live; the old syntax is still accepted but it will start producing warnings at some point and will be removed eventually.
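As a small illustration of the two forms working together, a standalone function type can annotate a callback parameter. The names `BinaryOp`, `apply`, and `add` are made up for this sketch and are not part of any Roblox API:

```luau
-- Hypothetical alias: a standalone function type uses `->`.
type BinaryOp = (number, number) -> number

-- Function definitions use `:` for parameter and return types.
local function apply(op: BinaryOp, a: number, b: number): number
    return op(a, b)
end

local function add(a: number, b: number): number
    return a + b
end

print(apply(add, 2, 3)) --> 5
```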
## Number of locals in each function is now limited to 200
As detailed in [Upcoming change to (correctly) limit the local count to 200](https://devforum.roblox.com/t/upcoming-change-to-correctly-limit-the-local-count-to-200/528417) (which is now live), when we first shipped Luau we accidentally set the local limit to 255 instead of 200. This resulted in confusing error messages, and code that used close to 250 locals was very fragile because it could easily break due to minor codegen changes in our compiler.
This was fixed, and now we're correctly applying limits of 200 locals, 200 upvalues and 255 registers (per function), and we emit proper error messages pointing to the right place in the code when any of these limits is exceeded.
This is technically a breaking change, but scripts with >200 locals didn't work in our old VM, and we felt we had to make this change to ensure long-term stability.
## Require handling improvements in type checker + export type
We're continuing to flesh out the type checker support for modules. As part of this, we overhauled the require path tracing: the type checker is now much better at correctly recognizing (statically) which module you're trying to require, including support for `game:GetService`.
Additionally, up until now we have been automatically exporting all type aliases declared in the module (via `type X = Y`); requiring the module via `local Foo = require(path)` made these types available under the `Foo.` namespace.
This is different from the explicit handling of module entries, which must be added to the table returned from the `ModuleScript`. This was highlighted as a concern, and to fix it we've introduced the `export type` syntax.
Now the only types that are available after require are types that are declared with `export type X = Y`. If you declare a type without exporting it, it's available inside the module, but the type alias can't be used outside of the module. This makes it possible to cleanly separate the public API (types and functions exposed through the module interface) from implementation details (local functions etc.).
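A sketch of how this might look in practice, assuming a hypothetical `ModuleScript` named `Inventory` (the module, its types, and its `add` function are invented for illustration):

```luau
-- Hypothetical ModuleScript "Inventory".

-- Exported: visible to requiring scripts as `Inventory.Item`.
export type Item = { name: string, count: number }

-- Not exported: usable only inside this module.
type Cache = { [string]: Item }

local cache: Cache = {}

local Inventory = {}

-- Adds `count` units of the named item and returns its record.
function Inventory.add(name: string, count: number): Item
    local item: Item = cache[name] or { name = name, count = 0 }
    item.count = item.count + count
    cache[name] = item
    return item
end

return Inventory
```

A requiring script could then write `local Inventory = require(script.Parent.Inventory)` and annotate a variable as `Inventory.Item`, while `Inventory.Cache` would not be available because that alias is not exported. The `script.Parent.Inventory` path is likewise only an assumption about where the module lives.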
## Improve type checker robustness
As we're moving closer to enabling type checking for everyone to use (no ETA at the moment), we're making sure that the type checker is as robust as possible.
This includes never crashing and always computing the type information in a reasonable time frame, even on obscure scripts like this one:
```
type ( ... ) ( ) ;
( ... ) ( - - ... ) ( - ... )
type = ( ... ) ;
( ... ) ( ) ( ... ) ;
( ... ) ""
```
To that end we've implemented a few changes, most of which are live, that fix crashes and unbounded recursion/iteration issues. This work is ongoing, as we're fixing issues we encounter in the testing process.
## Better types for Lua and Roblox builtin APIs
In addition to improving the internals of the type checker, we're still working on making sure that the builtin APIs have correct type information exposed to the type checker.
In the last few weeks we've done a major audit and overhaul of that type information. We used to have many builtin methods “stubbed” to have a very generic type like `any` or `(...) -> any`, and while we still have a few omissions we're much closer to full type coverage.
One notable exception here is the `coroutine.` library, which we didn't get to cover fully, so the types for many of the functions there are imprecise.
If you find cases where builtin Roblox APIs have omitted or imprecise type information, please let us know by commenting on this thread or filing a bug report.
The full set of types we expose as of today is listed here for inquisitive minds: [https://gist.github.com/zeux/d169c1416c0c65bb88d3a3248582cd13](https://gist.github.com/zeux/d169c1416c0c65bb88d3a3248582cd13)
## Removal of __gc from the VM
A bug with `continue` and local variables was reported to us a few weeks ago; the bug was initially believed to be benign, but it was possible to turn it into a security vulnerability by getting access to the `__gc` implementation for builtin Roblox objects. After fixing the bug itself (the turnaround time on the bug fix was about 20 hours from the bug report), we decided to make sure that future bugs like this don't compromise the security of the VM by removing `__gc`.
`__gc` is a metamethod that Lua 5.1 supports on userdata, and that later versions of Lua extend to all tables; it runs when the object is ready to be garbage collected, and its primary use is to let userdata objects implemented in C do memory cleanup. This mechanism has several problems:
* `__gc` is invoked by the garbage collector without the context of the original thread. Because of how our sandboxing works, this means that this code runs at the highest permission level, which is why `__gc` for newproxy-created userdata was disabled in Roblox a long time ago (10 years?).
* `__gc` for builtin userdata objects puts the object into a non-determinate state; due to how Lua handles `__gc` in weak keys (see [https://www.lua.org/manual/5.2/manual.html#2.5.2](https://www.lua.org/manual/5.2/manual.html#2.5.2)), these objects can be observed by external code. This has caused crashes in some Roblox code in the past; we changed this behavior at some point last year.
* Because `__gc` for builtin objects puts the object into a non-determinate state, calling it on the same object again, or calling any other methods on the object, can result in crashes or vulnerabilities where the attacker gains the ability to arbitrarily mutate the process memory from a Lua script. We normally don't expose `__gc` because the metatables of builtin objects are locked, but if it accidentally gets exposed the results are pretty catastrophic.
* Because `__gc` can result in object resurrection (if a custom Lua method adds the object back to the reachable set), during garbage collection the collector has to traverse the set of userdatas twice: once to run `__gc` and a second time to mark the survivors.
For all these reasons, we decided that the `__gc` mechanism just doesn't pull its weight, and completely removed it from the VM. Builtin userdata objects don't use it for memory reclamation anymore, and naturally declaring `__gc` on custom userdata objects still does nothing.
Aside from making sure we're protected against these kinds of vulnerabilities in the future, this makes garbage collection ~25% faster.
## Memory and performance improvements
It's probably not a surprise at this point, but we're never fully satisfied with the level of performance we get. From a language implementation point of view, any performance improvements we can make without changing the semantics are great, since they automatically result in Lua code running faster. To that end, here are a few changes we've implemented recently:
* ~~A few `string.` methods, notably `string.byte` and `string.char`, were optimized to make it easier to write performant deserialization code. `string.byte` is now ~4x faster than before for small numbers of returned characters. For the optimization to be effective, it's important to call the function directly (`string.byte(foo, 5)`) instead of using method calls (`foo:byte(5)`).~~ This had to be disabled due to a rare bug in some cases; the optimization will come back in a couple of weeks.
* `table.unpack` was carefully tuned for a few common cases, making it ~15% faster; `unpack` and `table.unpack` now share implementations (and the function objects are equal to each other).
* While we already had a very efficient parser, one long standing bottleneck in identifier parsing was fixed, making script compilation ~5% faster across the board, which can slightly benefit server startup times.
* Some builtin APIs that use floating point numbers as arguments, such as various `Vector3` constructors and operators, are now a tiny bit faster.
* All string objects are now 8 bytes smaller on 64-bit platforms, which isn't a huge deal but can save a few megabytes of Lua heap in some games.
* Debug information is using a special compact format that results in ~3.2x smaller line tables, which ends up making function bytecode up to ~1.5x smaller overall. This can be important for games with a lot of scripts.
* Garbage collector heap size accounting was cleaned up and made more accurate, which in some cases makes Lua heap ~10% smaller; the gains highly depend on the workload.
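The `table.unpack` item above can be checked directly. This is only a sketch with an arbitrary example table:

```luau
local values = { 10, 20, 30 }

-- Both names unpack the array part of the table into separate values.
local a, b, c = table.unpack(values)
local x, y, z = unpack(values)

assert(a == x and b == y and c == z)

-- Per the change described above, the two globals refer to the same function object.
print(unpack == table.unpack) --> true
```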
## Library changes
The standard library doesn't see a lot of changes at this point, but we did have a couple of small fixes here:
* `coroutine.wrap` and `coroutine.create` now support C functions. This was the only API that treated Lua and C functions differently, and now it doesn't.
* `require` silently skipped errors in module scripts that occurred after the module script had yielded at least once; this was a regression from earlier work on yieldable pcall and has been fixed.
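To illustrate the `coroutine` change, here is a sketch that passes builtin functions directly to `coroutine.create` and `coroutine.wrap`. It assumes `math.abs` and `string.upper` are implemented in C, as they are in the reference Lua implementation:

```luau
-- Previously only Lua functions were accepted here.
local co = coroutine.create(math.abs)
local ok, result = coroutine.resume(co, -5)
print(ok, result) --> true 5

-- coroutine.wrap returns a function that resumes the coroutine when called.
local upper = coroutine.wrap(string.upper)
print(upper("luau")) --> LUAU
```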
As usual, if you have questions, comments, or any other feedback on these changes, feel free to share it in this thread or create separate posts for bug reports.