Sandboxing JavaScript Code

simonw · on April 20, 2023

This is what I want to be able to use server-side WASM for.

It feels to me like this kind of sandboxing should be the "Hello World" of server-side WebAssembly - using wasmer or wasmtime or similar.

To my surprise, actually achieving this with any of those seems to be very poorly documented.

Last time I complained about this on Hacker News, Tim Bart figured out a Python recipe for me! https://til.simonwillison.net/webassembly/python-in-a-wasm-s...

I don't have a great recipe for JavaScript yet though. Something based around QuickJS feels like it could work.

The characteristics I'm looking for are:

1. Runs a string of code I give it - pretty much like an eval() function - ideally without needing to write that code to disk

2. Code has a limited amount of memory available to it

3. Code has limited CPU available as well - if it tries to use too many cycles it quits with an error

4. Same with amount of time to execute - cut it off after a specified number of ms and return an error

Bonus: it would be nice to be able to selectively expose additional functions to it (like fetch() via a controlled HTTP proxy) but even without that I could do some very useful things.
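As a rough illustration of characteristics 1 and 4 plus the bonus, here is a minimal sketch using Node's built-in `node:vm` module. This is an assumption-laden stand-in, not the WASM recipe being asked for: the Node documentation states that `node:vm` is not a security mechanism, it does not cap memory or CPU cycles, and any host function passed in gives the guest a route back to the host realm. The helper name `runWithTimeout` and its options are hypothetical.

```javascript
const vm = require('node:vm');

// Hypothetical helper: evaluate a string of code with a wall-clock limit and
// an explicit allow-list of host functions. NOT a real security boundary.
function runWithTimeout(code, { timeoutMs = 100, expose = {} } = {}) {
  // Only the names in `expose` are added as globals of the new context.
  const context = vm.createContext({ ...expose });
  try {
    // `timeout` is in milliseconds; the code string never touches disk.
    const value = vm.runInContext(code, context, { timeout: timeoutMs });
    return { ok: true, value };
  } catch (err) {
    // On timeout Node throws an error whose code is ERR_SCRIPT_EXECUTION_TIMEOUT.
    const reason = err && (err.code || err.message);
    return { ok: false, error: reason || String(err) };
  }
}

console.log(runWithTimeout('1 + 2'));
console.log(runWithTimeout('while (true) {}', { timeoutMs: 50 }));
console.log(runWithTimeout('double(21)', { expose: { double: (x) => x * 2 } }));
```

The second call comes back with an error instead of hanging, which is the "cut it off after a specified number of ms" behaviour. Memory and cycle limits would still need something like a WASM runtime or a separate process.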

I remain frustrated that this is so hard to figure out!

tyingq · on April 20, 2023

This maybe, as a start?

https://github.com/justjake/quickjs-emscripten

See https://github.com/justjake/quickjs-emscripten/blob/main/c/i... for the mem/CPU limits.

simonw · on April 20, 2023

That does look good. I'm ideally hoping for something I can call from Python (via wasmtime-py etc).

phickey · on April 20, 2023

`jco componentize` turns a JavaScript module into a WebAssembly component: https://github.com/bytecodealliance/jco

bhelx · on April 20, 2023

I've done this with Extism before. You can now write JavaScript plug-ins that run both on the server and the browser: https://github.com/extism/js-pdk This uses QuickJS compiled to Wasm. You can also embed these into any of the host languages we support, python included. We do have some rudimentary support for HTTP requests as well.

In order to create a plug-in that allows the host to inject and run code, I've done this:

1. Create some kind of export function like `set_code` that takes the string input of the code in CJS format. Example:

```
function myFunc() { return "hello world" }

module.exports = myFunc
```

2. `set_code` can store that text into a plug-in variable with `Var.set('my-function', code)`

3. Create a second export function that can eval the code. Would look something like this:

```
let result = eval(Var.getString('my-function'))()
Host.outputString(result)
```
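To make the three steps concrete outside of a real plug-in, here is a plain JavaScript sketch of the same pattern. The `Var` and `Host` objects below are hypothetical mocks standing in for whatever the Extism JS PDK provides, and `run_code` is an assumed name for the second export. It also swaps the bare `eval` for `new Function` with an explicit `module` parameter, so the CJS-style source does not depend on a `module` object happening to be in scope.

```javascript
// Hypothetical stand-ins for the plug-in variable store and output channel.
const store = new Map();
const Var = {
  set: (key, value) => store.set(key, value),
  getString: (key) => store.get(key),
};
const outputs = [];
const Host = { outputString: (text) => outputs.push(text) };

// Export 1: remember the CJS-formatted source the host hands over.
function set_code(code) {
  Var.set('my-function', code);
}

// Export 2 (assumed name): load the stored source as a CommonJS-style
// module, then call whatever it assigned to module.exports.
function run_code() {
  const module = { exports: {} };
  const load = new Function('module', 'exports', Var.getString('my-function'));
  load(module, module.exports);
  Host.outputString(module.exports());
}

set_code('function myFunc() { return "hello world" }\nmodule.exports = myFunc');
run_code();
console.log(outputs[0]); // prints "hello world"
```

Nothing here sandboxes anything by itself; in the approach described above the isolation comes from the whole thing running inside QuickJS compiled to Wasm.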

If you're interested I can compile one of these for you. I've been working on a demo that uses a similar technique.

update: fixed some bad javascript

curryhoward · on April 20, 2023

> Code has limited CPU available as well - if it tries to use too many cycles it quits with an error

That's a pretty strange failure mode. Usually you'd just throttle the sandboxed application (e.g., give it a CFS quota with a CPU cgroup) or give it a limited number of vCPUs.

circuit10 · on April 20, 2023

But then it could just keep running forever, which isn’t something you’d want.

curryhoward · on April 20, 2023

That's what (4) is for. Time out after some duration.

1letterunixname · on April 19, 2023

What I want is client-side sandboxing of code per-component and per page area to be able to swap them and limit the damage of any particular failure (PWA-/SPA-style but combined stateful URL and efficient reloading like Turbo Frames). Server-side would be nice too. Both FE and BE should be reducible to respective plugins that have hard API "contract" integration points but can't escape like monolithic apps.

https://turbo.hotwired.dev/handbook/introduction

esprehn · on April 20, 2023

What you're describing is basically SES: https://medium.com/agoric/ses-securing-javascript-in-the-rea...

There's pros and cons to that, most folks don't enjoy being inside such a restricted sandbox.

Legogris · on April 20, 2023

Your linked article mentions it as well: For people looking to get these benefits today, LavaMoat[0] builds on endo (Agoric) and SES to expose a user-friendly interface and make it easy to integrate into your build and dev processes. Just curious as to what you perceive as the major cons.

Intro video[1].

[0]: https://github.com/LavaMoat/LavaMoat

[1]: https://youtube.com/watch?v=Z5Bz0DYga1k

remram · on April 20, 2023

Is this usable today? I was about to go down the path of compiling quickjs to WASM to evaluate untrusted code in the browser, but I'd be happier without that. Another concern is that untrusted code might loop forever, I am not sure I can deal with that with either method.

phickey · on April 20, 2023

You can use jco to both compile a JavaScript module into a WebAssembly Component (with spidermonkey running as wasm inside), and to generate a JavaScript embedding for that component. This solves all of your problems except for untrusted code looping forever, and that’s something which could be added to spidermonkey.wasm. https://github.com/bytecodealliance/jco

You can also run these same components server-side with Wasmtime.

spankalee · on April 20, 2023

A new proposal sketch for isolated web components: https://github.com/WICG/webcomponents/issues/1002

spankalee · on April 20, 2023

Why oh why are people so eager to fork standard JavaScript syntax?

They're using `@foo` to represent a username so you can reference other "Vals". That's going to break with upcoming decorators and they could have done a completely normal and safe thing and used usernames in import specifiers.

olliej · on April 20, 2023

Even JS engines do it :D

Specifically the JS engines implement a few JS builtins in javascript, but as they need guaranteed values (including functions) in a bunch of cases they use magic syntax to ensure that the JS engine will use internal values and properties instead of performing generic (and hence user modifiable) lookups. In JSC I think I did this with the @ prefix, and I _think_ when I did that there was already literature that used @SomeThing syntax to mean builtins. The JSC mode for builtins also disallows capturing any normal properties from the containing scope. This provides security in two directions: the @ shenanigans and capture restrictions all serve to prevent the builtin providing access to privileged operations or information to malicious JS, and the syntax break means people can't just blindly use the implementations of builtins from the engine itself and assume it will be safe (safety inside the JSVM is ensured because the builtins are parsed and built in a slightly different mode).

tentacleuno · on April 20, 2023

People just love making custom languages and writing their own parser, lexer etc. stack when they could just use JavaScript. Waste of time IMO.

stevekrouse · on April 20, 2023

Yeah I regret this decision. [Founder of Val Town here]

We hope to move to a more web-standard import syntax soon. I learned my lesson!

yarg · on April 20, 2023

TypeScript did a good enough job that no-one need bother again.