Diffstat:
 README.md | 92
 1 file changed, 92 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..256ecd2
--- /dev/null
+++ b/README.md
@@ -0,0 +1,92 @@
+# Distributed computing
+
+A framework for distributed computing -- that is, running jobs across multiple
+machines.
+
+## Compute core API version 1
+
+Functions that should be exposed, without mangling, by the compute core `.so`
+library:
+
+- `int32_t worker_init(int32_t version)`
+ - Will be called when initialising the library. The integer argument is the
+ version of this API specification; the library should check that it is
+ equal to the expected value.
+
+ Should return 0 on successful initialisation, or nonzero if an error
+ occurred.
+
+- `int32_t worker_run_job(uint64_t size, void *data, uint64_t *outsize, void **outdata)`
+ - Run a job. The job data is the blob pointed to by `data`, which is `size`
+ bytes long. The worker may modify the memory behind `data` while this
+ function executes, but must not keep pointers into it: the memory is
+ deallocated as soon as `worker_run_job` returns.
+
+ A pointer to memory containing the computed results should be stored in
+ `*outdata`, and its size in bytes in `*outsize`. Once the output data is
+ no longer needed, `worker_free_outdata` will be called to free it. If
+ there is no data to return, for example because an error occurred, store
+ 0 in `*outsize` and a null pointer in `*outdata`.
+
+ Should return 0 on successful execution, or nonzero if an error occurred.
+
+- `void worker_free_outdata(uint64_t size, void *outdata)`
+ - Free memory allocated for the output data of a job. This is called when the
+ memory for the output data of the last job is no longer needed, and will
+ always be called before the next job is started.
+
+Note that no function is called before the library is unloaded. If you need
+such a hook, use [destructors][1], or release the relevant resources in
+`worker_free_outdata`.
+
+## Worker socket protocol version 1
+
+All integers in the description below are little-endian.
+
+Common data types used in the message descriptions below:
+- **String**/**Blob**: 8-byte unsigned integer indicating the length of
+ the data, then that many bytes making up the string or blob. A string is
+ valid UTF-8, while a blob can contain arbitrary data.
+
+A **message from controller to worker** has the following format:
+- Message type [1 byte]
+- ID [8 bytes]
+- Payload length [8-byte unsigned integer]
+- Payload [variable length and contents]
+
+A **response from worker to controller** has the following format:
+- Response type [1 byte]
+- ID of message replied to [8 bytes]
+- Payload length [8-byte unsigned integer]
+- Payload [variable length and contents]
+
+The possible **response types** are the following:
+- `0x01`: Successful response to a message, as described in the list of
+ message types.
+- `0xff`: An error response; something went wrong. The entire payload is a
+ UTF-8 error message.
+
+The possible **message types** are the following:
+
+- `0x01`: Version exchange
+ - Payload: 4-byte unsigned integer, the protocol version of the
+ controller. In this version, this is 1.
+ - Successful response: 1 byte, 1 if the version is accepted by the worker, 0
+ if not. If the version is not accepted, the connection is closed by both
+ sides.
+
+- `0x02`: New compute core
+ - Payload: A string giving the name of the compute core, then a blob giving
+ the contents of a dynamic library file that can be loaded at runtime, e.g.
+ a `.so` file. This library will be loaded as the compute core for the
+ worker.
+ - Successful response: Empty.
+
+- `0x03`: New job
+ - Payload: An 8-byte unsigned integer giving the ID of the job, then a blob
+ giving the input data for the compute core.
+ - Successful response: A 4-byte signed integer giving the exit code of the
+ job as returned by the compute core, then a blob giving the output data.
+
+
+[1]: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-destructor-function-attribute
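
The compute core API added by this README can be illustrated with a minimal sketch. It is not taken from the repository: the job itself (returning the input bytes in reverse order), the `EXPECTED_API_VERSION` macro, and the use of `malloc`/`free` for the output buffer are assumptions chosen only to show how the three functions fit together. Compiled as C, the symbols are exported without mangling; a C++ core would need `extern "C"`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name: the API version this core was written against
 * (version 1, per the "Compute core API version 1" section). */
#define EXPECTED_API_VERSION 1

int32_t worker_init(int32_t version) {
    /* Refuse to initialise if the worker uses a different API version. */
    return version == EXPECTED_API_VERSION ? 0 : 1;
}

int32_t worker_run_job(uint64_t size, void *data,
                       uint64_t *outsize, void **outdata) {
    /* Start from "no output", which is also what must be reported
     * if an error occurs. */
    *outsize = 0;
    *outdata = NULL;

    if (size == 0)
        return 0; /* empty input, empty result */

    unsigned char *out = malloc((size_t)size);
    if (out == NULL)
        return 1; /* nonzero signals an error to the worker */

    /* Illustrative job (an assumption): copy the input reversed.
     * The input is only read while this function runs, since it is
     * deallocated as soon as we return. */
    const unsigned char *in = data;
    for (uint64_t i = 0; i < size; i++)
        out[i] = in[size - 1 - i];

    *outsize = size;
    *outdata = out;
    return 0;
}

void worker_free_outdata(uint64_t size, void *outdata) {
    (void)size;    /* not needed when the buffer came from malloc */
    free(outdata); /* free(NULL) is a no-op, so the empty case is safe */
}
```

Built as a shared library (for example with `cc -shared -fPIC`), a file like this would be the kind of `.so` blob that a "New compute core" message carries. Because the output buffer is owned by the core until `worker_free_outdata` is called, any allocator can be used as long as allocation and release match.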