Writing a Lua sandbox using sol3

Sandboxes can protect the user’s computer from malicious or buggy scripts. But sandboxes are difficult to get right: you need to be very careful about what you expose, and you must test for vulnerabilities. The Sandboxes page on the Lua wiki is required reading, as it contains very helpful advice.

Environments

An environment is a table that stores the global variables available to a function. Each function will have an environment assigned to it, and we can use this to sandbox code. To manage environments in sol3, you will use sol::environment.

// lua is a `sol::state` or `sol::state_view`

// Create new blank environment
auto env = sol::environment(lua, sol::create);

// Set global variable for globals
env["_G"] = env;

To sandbox, you will want to create an environment with only whitelisted, safe functions. Let’s list all the safe global functions:

const std::vector<std::string> whitelisted = {
    "assert",
    "error",
    "ipairs",
    "next",
    "pairs",
    "pcall",
    "print",
    "select",
    "tonumber",
    "tostring",
    "type",
    "unpack",
    "_VERSION",
    "xpcall",

    // These functions are unsafe as they can bypass or change metatables,
    // but they are required to implement classes.
    "rawequal",
    "rawget",
    "rawset",
    "setmetatable",
};

Now, let’s copy the whitelisted globals into the environment:

for (const auto &name : whitelisted) {
    env[name] = lua[name];
}

Next, you’ll want to define and copy whitelisted modules. We didn’t include these in the above list, as we want to copy the tables themselves. This prevents changes that untrusted code makes to modules from affecting trusted code.

const std::vector<std::string> safeLibraries = {
        "coroutine", "string", "table", "math"};

for (const auto &name : safeLibraries) {
    // Convert the proxy to a table so that it can be iterated
    sol::table library = lua[name];
    sol::table copy(lua, sol::create);
    for (const auto &pair : library) {
        // first is the name of a function in module, second is the function
        copy[pair.first] = pair.second;
    }
    env[name] = copy;
}

Finally, you’ll want to partially copy modules that contain unsafe functions:

sol::table os(lua, sol::create);
os["clock"] = lua["os"]["clock"];
os["date"] = lua["os"]["date"];
os["difftime"] = lua["os"]["difftime"];
os["time"] = lua["os"]["time"];
env["os"] = os;

Safe loadstring, loadfile, and dofile

What?

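Let’s first review what each function does: loadstring compiles a string into a function, loadfile does the same for the contents of a file, and dofile loads a file and runs it immediately.

We need to provide safe implementations of each of these functions. We will do this by making sure that the following things are checked: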

  1. The environment is set on any functions. The default loadstring will set the global environment on the returned function, which allows escaping the sandbox.
  2. Bytecode cannot be loaded, as it can be used to escape the sandbox.
  3. In loadfile, we check that the file path is within the sandbox path.

loadstring

std::tuple<sol::object, sol::object> LuaSecurity::loadstring(
        const std::string &str, const std::string &chunkname) {
    if (!str.empty() && str[0] == LUA_SIGNATURE[0]) {
        return std::make_tuple(sol::nil,
                sol::make_object(lua, "Bytecode prohibited by Lua sandbox"));
    }

    sol::load_result result = lua.load(str, chunkname, sol::load_mode::text);
    if (result.valid()) {
        sol::function func = result;
        env.set_on(func);
        return std::make_tuple(func, sol::nil);
    } else {
        return std::make_tuple(
                sol::nil, sol::make_object(lua, ((sol::error)result).what()));
    }
}

LUA_SIGNATURE is the string that begins every precompiled chunk. Its first character (ESC, 0x1B) is what the check above uses to detect bytecode.
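As a standalone illustration of that first-byte check, here is a minimal sketch. The names kLuaSignatureFirstByte and looksLikeBytecode are hypothetical, and the constant hard-codes the ESC byte only so that the snippet builds without the Lua headers; real code should keep using LUA_SIGNATURE[0] from lua.h.

```cpp
#include <string_view>

// Assumption: mirrors LUA_SIGNATURE[0] from lua.h, hard-coded here so the
// sketch compiles without the Lua headers.
constexpr char kLuaSignatureFirstByte = '\x1b';

// Hypothetical helper: true if the chunk starts like precompiled bytecode.
// An empty chunk is treated as source text.
constexpr bool looksLikeBytecode(std::string_view chunk) {
    return !chunk.empty() && chunk.front() == kLuaSignatureFirstByte;
}

// Compile-time sanity checks
static_assert(looksLikeBytecode("\x1bLua"));
static_assert(!looksLikeBytecode("print('hi')"));
```

Checking only the first byte is deliberately conservative: no valid Lua source file starts with ESC, so rejecting on that byte alone should not refuse legitimate scripts.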

env.set_on(func) is used to set the environment.

loadfile

std::tuple<sol::object, sol::object> LuaSecurity::loadfile(
        const std::string &path) {
    if (!checkPath(path)) {
        return std::make_tuple(sol::nil,
                sol::make_object(
                        lua, "Path is not allowed by the Lua sandbox"));
    }

    // Report unreadable files instead of silently loading an empty chunk
    std::ifstream t(path);
    if (!t) {
        return std::make_tuple(
                sol::nil, sol::make_object(lua, "Unable to open file"));
    }

    std::string str((std::istreambuf_iterator<char>(t)),
            std::istreambuf_iterator<char>());
    return loadstring(str, "@" + path);
}

checkPath is a method that will be used later to verify that the path is allowed. For now, it can be defined to always return true.

dofile

sol::object LuaSecurity::dofile(const std::string &path) {
    std::tuple<sol::object, sol::object> ret = loadfile(path);
    if (std::get<0>(ret) == sol::nil) {
        throw sol::error(std::get<1>(ret).as<std::string>());
    }

    sol::unsafe_function func = std::get<0>(ret);
    return func();
}

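dofile needs to check the load result and raise an error if loading failed. It then calls the function through sol::unsafe_function, which doesn’t catch Lua errors, so they propagate to the caller just as they would with the built-in dofile.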

Adding to the environment

Don’t forget to actually set them on the environment!

env.set_function("loadstring", &LuaSecurity::loadstring, this);
env.set_function("loadfile", &LuaSecurity::loadfile, this);
env.set_function("dofile", &LuaSecurity::dofile, this);

Running scripts safely

The easiest way to run a script safely is to pass the environment into script_file:

lua.script_file("mods/mymod/init.lua", security->getEnvironment());

Setting the global environment

In order to safely execute our scripts, we need to remember to set the safe environment. Wouldn’t it be nicer to change the default environment in Lua?

The default environment in Lua is stored in a registry value, and so can be assigned like so:

#if LUA_VERSION_NUM >= 502
    // Get environment registry index
    lua_rawgeti(lua, LUA_REGISTRYINDEX, env.registry_index());

    // Set the global environment
    lua_rawseti(lua, LUA_REGISTRYINDEX, LUA_RIDX_GLOBALS);
#else
    // Get main thread
    int is_main = lua_pushthread(lua);
    assert(is_main);
    int thread = lua_gettop(lua);

    // Get environment registry index
    lua_rawgeti(lua, LUA_REGISTRYINDEX, env.registry_index());

    // Set the global environment
    if (!lua_setfenv(lua, thread)) {
        throw ModException(
                "Security: Unable to set environment of the main Lua thread!");
    }
    lua_pop(lua, 1); // Pop thread
#endif

Unfortunately, the preprocessor conditionals needed to support multiple versions of Lua make this ugly. If you’re only targeting a specific version, you can remove the unused branch.

We can now safely load scripts directly, without specifying the environment:

lua.script_file("mods/mymod/init.lua");

Checking file paths

Ideally, you’d not allow any file system access to untrusted scripts. You can use virtual file systems to load all allowed resources into memory, and then only read from memory.

However, sometimes the scripts aren’t totally untrusted, and you would like to allow some access to the file system. To do this, you can check the path to make sure it’s in an allowed location. Note that this isn’t completely safe: symlinks can be used to escape the allowed path. However, if scripts can’t create symlinks, then any symlink that leads out of the sandbox was put there by the user.

C++17’s filesystem provides useful path-parsing methods:

bool LuaSecurity::checkPath(const std::string &filepath) {
    if (basePath.empty()) {
        return false;
    }

    auto base = std::filesystem::absolute(basePath).lexically_normal();
    auto path = std::filesystem::absolute(filepath).lexically_normal();

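    // Use the four-iterator overload so that a path shorter than the base
    // is never read past its end
    auto [rootEnd, nothing] = std::mismatch(
            base.begin(), base.end(), path.begin(), path.end());

    return rootEnd == base.end();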
}
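The check above is purely lexical, so it doesn’t see through symlinks. If that matters for your use case, one possible variation is to resolve both paths with std::filesystem::weakly_canonical before comparing them. This is only a sketch: isWithinResolved is a hypothetical free function rather than part of LuaSecurity, and it assumes the base directory already exists on disk.

```cpp
#include <algorithm>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns true if `candidate`, after resolving symlinks
// in the part of the path that exists, is `base` itself or lies underneath it.
bool isWithinResolved(const fs::path &base, const fs::path &candidate) {
    if (base.empty() || candidate.empty()) {
        return false;
    }

    std::error_code ec;
    // Make the path absolute first so that resolution starts from a directory
    // that exists, then resolve symlinks and ".." components.
    auto resolve = [&ec](const fs::path &p) {
        fs::path full = fs::absolute(p, ec);
        return ec ? fs::path() : fs::weakly_canonical(full, ec);
    };

    fs::path b = resolve(base);
    if (ec) {
        return false;
    }
    fs::path c = resolve(candidate);
    if (ec) {
        return false;
    }

    // A base written with a trailing slash ends in an empty element; drop it
    if (!b.has_filename()) {
        b = b.parent_path();
    }

    // Compare component by component, so "mods" does not match "modsextra"
    return std::mismatch(b.begin(), b.end(), c.begin(), c.end()).first ==
            b.end();
}
```

This still leaves a window between checking the path and opening the file, during which a symlink could be swapped in, so treat it as a hardening step rather than a guarantee.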

You may wish to extend this to add multiple base paths, and also add a way to restrict writing to a subset of the paths.
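One way such an extension might look is sketched below. PathPolicy, its readable and writable lists, and the isWithin helper are all hypothetical names introduced for illustration. The sketch assumes that writable locations are also readable, and it uses the same lexical comparison as checkPath.

```cpp
#include <algorithm>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: lexical check that `candidate` is `base` or below it.
// Symlinks are not resolved.
bool isWithin(const fs::path &base, const fs::path &candidate) {
    if (base.empty() || candidate.empty()) {
        return false;
    }
    fs::path b = fs::absolute(base).lexically_normal();
    fs::path c = fs::absolute(candidate).lexically_normal();

    // A base written as "mods/" ends in an empty element; drop it
    if (!b.has_filename()) {
        b = b.parent_path();
    }
    return std::mismatch(b.begin(), b.end(), c.begin(), c.end()).first ==
            b.end();
}

// Hypothetical policy object holding several allowed base paths
struct PathPolicy {
    std::vector<fs::path> readable; // scripts may read below these
    std::vector<fs::path> writable; // scripts may read and write below these

    bool check(const fs::path &path, bool forWrite) const {
        auto inAny = [&path](const std::vector<fs::path> &bases) {
            return std::any_of(bases.begin(), bases.end(),
                    [&path](const fs::path &b) { return isWithin(b, path); });
        };
        // Writable locations are assumed to be readable as well
        return inAny(writable) || (!forWrite && inAny(readable));
    }
};
```

A sandboxed io.open replacement could then call check(path, true) for write modes and check(path, false) otherwise, while loadfile would only ever ask for read access.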

Preventing infinite loops

You can use lua_sethook with LUA_MASKCOUNT to run a callback after every N VM instructions, and raise a Lua error from that callback once a script has exceeded its budget.

https://stackoverflow.com/questions/2777527/stopping-a-runaway-lua-subprocess

Summary

I hope you found this article useful. This doesn’t cover all possible exploits - untrusted code may still crash or freeze the program - but it aims to at least protect the host from the code.

I’d like to finish by reminding you to add sandbox unit tests, both to make sure that the sandbox is working correctly and to catch it if you accidentally break it later. This can be as simple as some asserts in a built-in Lua file somewhere.