# Distributed computing
A framework for distributed computing -- that is, running jobs across multiple
machines.
## Compute core API version 1
Functions that should be exposed, without mangling, by the compute core `.so`
library:
- `int32_t worker_init(int32_t version)`
  - Will be called when initialising the library. The integer argument is the
    version of this API specification; the library should check that it is
    equal to the expected value.

    Should return 0 on successful initialisation, or nonzero if an error
    occurred.
- `int32_t worker_run_job(uint64_t size, void *data, uint64_t *outsize, void **outdata)`
  - Run a job. The job data is specified in the data blob pointed to by `data`,
    which is `size` bytes in size. The worker is allowed to modify the memory
    behind `data` during execution of this function, but the memory will be
    deallocated as soon as `worker_run_job` returns.

    A pointer to memory containing the computed results should be stored in
    `outdata`, and its size in `outsize`. When the memory allocated for the
    output data may be freed, `worker_free_outdata` will be called. If there
    is no data to return, for example because an error occurred, store 0 in
    `outsize` and a null pointer in `outdata`.

    Should return 0 on successful execution, or nonzero if an error occurred.
- `void worker_free_outdata(uint64_t size, void *outdata)`
  - Free memory allocated for the output data of a job. This is called when the
    memory for the output data of the last job is no longer needed, and will
    always be called before the next job is started.

Note that no function is called before the library is unloaded. If you need
cleanup at that point, use [destructors][1], or release what you can in
`worker_free_outdata`.
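
The three entry points above can be sketched as a minimal compute core. This is an illustration under stated assumptions, not a reference implementation: the job itself (reversing the input bytes), the `EXPECTED_API_VERSION` macro, and the use of `malloc`/`free` for the output buffer are all hypothetical choices that the specification does not prescribe.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: this core was written against API version 1. */
#define EXPECTED_API_VERSION 1

/* Accept only the API version this core was built for. */
int32_t worker_init(int32_t version) {
    return version == EXPECTED_API_VERSION ? 0 : 1;
}

/* Hypothetical job: return the input bytes in reverse order. */
int32_t worker_run_job(uint64_t size, void *data,
                       uint64_t *outsize, void **outdata) {
    /* Default to "no data", as required when nothing is returned. */
    *outsize = 0;
    *outdata = NULL;
    if (size == 0) return 0;

    uint8_t *out = malloc((size_t)size);
    if (out == NULL) return 1; /* error: no output data */

    /* Copy out of `data`, since it is deallocated after we return. */
    const uint8_t *in = data;
    for (uint64_t i = 0; i < size; i++) out[i] = in[size - 1 - i];

    *outsize = size;
    *outdata = out;
    return 0;
}

/* Release the buffer handed out by the last worker_run_job call. */
void worker_free_outdata(uint64_t size, void *outdata) {
    (void)size;
    free(outdata); /* free(NULL) is a no-op */
}
```

With GCC or Clang, something like `cc -shared -fPIC core.c -o core.so` (file names are placeholders) should produce a loadable library. If the core is written in C++ instead, the three functions need `extern "C"` linkage so that their names are not mangled.
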
## Worker socket protocol version 1
All integers in the description below are little-endian.
Common data types used in the message descriptions below:
- **String**/**Blob**: 8-byte unsigned integer indicating the length of
the data, then that many bytes making up the string or blob. A string is
valid UTF-8, while a blob can contain arbitrary data.
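
As a rough sketch of the String/Blob encoding just described, the helpers below write and read the 8-byte little-endian length prefix byte by byte, so they do not depend on the host's byte order. The function names `put_u64_le`, `get_u64_le`, and `encode_blob` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write v as 8 little-endian bytes. */
static void put_u64_le(uint8_t *dst, uint64_t v) {
    for (int i = 0; i < 8; i++) dst[i] = (uint8_t)(v >> (8 * i));
}

/* Read 8 little-endian bytes as an unsigned integer. */
static uint64_t get_u64_le(const uint8_t *src) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v |= (uint64_t)src[i] << (8 * i);
    return v;
}

/* Encode `len` bytes as a blob (or string) into dst.
 * dst must have room for 8 + len bytes. Returns the bytes written. */
static size_t encode_blob(uint8_t *dst, const void *bytes, uint64_t len) {
    put_u64_le(dst, len);
    if (len > 0) memcpy(dst + 8, bytes, (size_t)len);
    return 8 + (size_t)len;
}
```

Encoding the three-byte string `abc` this way yields 11 bytes: the length 3 in the first byte, seven zero bytes, then the characters themselves. Checking that a string is valid UTF-8 is left out of this sketch.
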
A **message from controller to worker** has the following format:
- Message type [1 byte]
- ID [8 bytes]
- Payload length [8-byte unsigned integer]
- Payload [variable length and contents]
A **response from worker to controller** has the following format:
- Response type [1 byte]
- ID of message replied to [8 bytes]
- Payload length [8-byte unsigned integer]
- Payload [variable length and contents]
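
Both message and response share the same 17-byte header layout (1 + 8 + 8 bytes), which could be encoded and decoded roughly as follows. The `struct header` type and the function names are hypothetical. The ID is kept as eight opaque bytes here because the format only says "8 bytes"; treating it as an integer would be an extra assumption.

```c
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 17 /* 1 (type) + 8 (ID) + 8 (payload length) */

struct header {
    uint8_t type;         /* message type or response type */
    uint8_t id[8];        /* opaque ID, copied verbatim */
    uint64_t payload_len; /* number of payload bytes that follow */
};

/* Serialise a header into exactly HEADER_SIZE bytes. */
static void encode_header(uint8_t out[HEADER_SIZE], const struct header *h) {
    out[0] = h->type;
    memcpy(out + 1, h->id, 8);
    for (int i = 0; i < 8; i++)
        out[9 + i] = (uint8_t)(h->payload_len >> (8 * i)); /* little-endian */
}

/* Parse HEADER_SIZE bytes back into a header. */
static void decode_header(struct header *h, const uint8_t in[HEADER_SIZE]) {
    h->type = in[0];
    memcpy(h->id, in + 1, 8);
    h->payload_len = 0;
    for (int i = 0; i < 8; i++)
        h->payload_len |= (uint64_t)in[9 + i] << (8 * i);
}
```

A receiver would typically read `HEADER_SIZE` bytes, decode them, and then read exactly `payload_len` further bytes before interpreting the payload.
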
The possible **response types** are the following:
- `0x01`: Successful response to a message, as described in the table of
message types.
- `0xff`: An error response; something went wrong. The entire payload is a
UTF-8 error message.
The possible **message types** are the following:
- `0x01`: Version exchange
  - Payload: 4-byte unsigned integer, the protocol version of the
    server. In this version, this is 1.
  - Successful response: 1 byte, 1 if the version is accepted by the worker, 0
    if not. If the version is not accepted, the connection is closed by both
    sides.
- `0x02`: New compute core
  - Payload: A string giving the name of the compute core, then a blob giving
    the contents of a dynamic library file that can be loaded at runtime, e.g.
    a `.so` file. This library will be loaded as the compute core for the
    worker.
  - Successful response: Empty.
- `0x03`: New job
  - Payload: An 8-byte unsigned integer giving the ID of the job, then a blob
    giving the input data for the compute core.
  - Successful response: A 4-byte signed integer giving the exit code of the
    job as returned by the compute core, then a blob giving the output data.

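
To illustrate how a controller might read one of these payloads, here is a sketch that parses the successful response to a `0x03` (New job) message: a 4-byte signed exit code followed by a blob. The `struct job_result` type and `parse_job_response` are hypothetical names, and rejecting payloads whose blob does not fill the remaining bytes exactly is a strictness choice made for this example, not something the protocol text spells out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct job_result {
    int32_t exit_code;  /* exit code returned by the compute core */
    uint64_t out_len;   /* length of the output blob */
    const uint8_t *out; /* points into the payload buffer; not copied */
};

/* Parse a "New job" success payload of `len` bytes.
 * Returns 0 on success, -1 if the payload is malformed. */
static int parse_job_response(const uint8_t *payload, uint64_t len,
                              struct job_result *r) {
    /* Need at least the 4-byte exit code and the 8-byte blob length. */
    if (len < 12) return -1;

    uint32_t raw = 0;
    for (int i = 0; i < 4; i++) raw |= (uint32_t)payload[i] << (8 * i);

    uint64_t blob_len = 0;
    for (int i = 0; i < 8; i++)
        blob_len |= (uint64_t)payload[4 + i] << (8 * i);

    /* Assumption: the blob must account for all remaining bytes. */
    if (blob_len != len - 12) return -1;

    /* Reinterpret the 32 bits as signed without relying on a cast. */
    memcpy(&r->exit_code, &raw, sizeof raw);
    r->out_len = blob_len;
    r->out = payload + 12;
    return 0;
}
```

The returned `out` pointer aliases the payload buffer, so the caller has to keep that buffer alive for as long as the result is in use.
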
[1]: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-destructor-function-attribute