ObjectSpace: An Intuitive Model for Concurrent Programming
Tuan Tran
May 10th, 2018

Abstract

We introduce ObjectSpace, a new concurrent programming model that aims for flexibility and simplicity. The goal of ObjectSpace is to be intuitive to programmers while offering good expressiveness and scalability. It is designed to fulfill many roles in a concurrent setting, including data passing and communication between threads.

ObjectSpace can be considered a natural evolution of Linda, introduced by Gelernter [10]. It centers around a concurrent data structure that can store objects of arbitrary type. It provides atomic addition and removal of objects of arbitrary type and lets users retrieve objects based on the value of any of their fields. The addition of complex data structures gives ObjectSpace better expressiveness and type safety than Linda.

We also provide a reference implementation of the model in Rust. The API of the model proves to be simple and intuitive, and could be generalized to many programming languages.

1 Introduction

The rise of the age of Big Data and planetary-scale systems has put concurrent computing front and center in modern research. The most successful corporations, such as Google, Microsoft, and Facebook, have spent huge effort providing simple, extensible, and fault-tolerant concurrent programming models and have enjoyed considerable success [5], [7], [14], [17]. On the consumer's side of computing, for the last ten years, chip producers have been putting more processors into their systems to make up for the end of Moore’s Law. Having a simple and efficient concurrent model to take advantage of this new increase in power is more crucial than ever.

However, the most common paradigm for concurrency, multi-threading, requires dealing with lock conditions, which has poor usability and scalability, and is error-prone, as each new thread presents new synchronization issues (coupled with the fact that pthread is not an intuitive interface). Message passing interface (MPI) requires machines to be tightly coupled in both time and space, and therefore is also difficult to use [8]. Even the low-level infrastructure provided by the
Unix environment is limited in its capability. This issue has led to various studies on concurrent programming, starting from the early 1970s, including original and impactful ideas in all aspects: hardware support [4], [15], compiler support [2], performance analysis [6], and programming models.

Recently, the distributed community seems to favor ad hoc systems such as MapReduce [7] or Spark [21], which focus on ‘big data’ processing and manipulation. While they have proven to be successful in this particular field of parallel processing, these models achieve their concurrency by limiting their API, which hampers their ability to integrate with other workflows and makes them unsuitable for a wide array of concurrent programming tasks [16].

A simple and elegant approach to distributed computing is Linda, which was first proposed by Gelernter [10]. Although Linda has not received much attention from the community, we believe it offers a great balance between simplicity and flexibility. Linda centers around a shared memory space storing tuples from client nodes in the system. Through Linda’s atomic read and write operations, agents in a concurrent system can achieve time and space decoupling. Its biggest strength, though, lies in its ability to allow wildcards in its operations, which enables nodes in the system to control the scope of their operations and opens many possibilities for communication.

However, as our need to communicate within a concurrent setting gets more complex, Linda’s design to only pass tuples between threads becomes a big limitation. Not only are tuples too simplistic to represent the structures that occur in real production, they also fail to convey the meaning of the elements in a message individually and as a whole.

ObjectSpace is an extension to Linda aiming to fix these problems. By storing complex objects instead of simple tuples, it enhances the system’s power, improves its flexibility, and enforces type safety, fixing the biggest problems of Linda. We also retain Linda’s unique ability to specify a value of an object’s field as a “filter” condition, producing a powerful but intuitive framework for concurrent programming.

In the next sections of this paper, we describe the API of an ObjectSpace and introduce a proof-of-concept implementation in Rust, a modern language for system programming geared towards safety and concurrency. Then we introduce an example using ObjectSpace to achieve concurrency and analyze the framework’s performance. Finally, we propose a few directions on which future work could focus.

2 The ObjectSpace API

2.1 Basic Operations

The basic operations of an ObjectSpace are write, read, and take. Each of these operations is atomic. The write operation is simple: given an object obj, we could add it to the ObjectSpace by
calling space.write(obj). Notice that obj could be of any type: an int, a string, a boolean, a tuple, or a complex object.

The read operation has three variations: space.try_read<T>() is a non-blocking read of one object of type T from the space, space.read<T>() is a blocking read, and space.read_all<T>() is a non-blocking read of all elements of type T. Notice that since an ObjectSpace could store objects of any type, a generic type is necessary for the operation.

A take operation is similar to read, except that after an object is read, it is removed from the ObjectSpace. It has the same three variations as read.

2.2 Conditional Operations

Besides reading objects of a type, ObjectSpace also allows filtering output objects based on the value of a field of the type. We call these conditional operations. There are two main types of conditional operations:

• Reading by exact value: Given a field name and a value, we return an object of which the specified field has the given value. For example: space.read_by_value<T>("age", 8) reads an object of type T whose field 'age' has value 8.
• Reading by range: Given a field name and a range of values, we return an object of which the value of the specified field is in the specified range. For example: space.read_by_range<T>("age", range(6, 9)) reads an object of type T whose field 'age' has value between 6 and 9.

Each of these types of conditional operations has all of the variations found in normal read and take operations.

In general, the surface API of ObjectSpace is small and very easy to understand. However, it provides a robust base for multiple data passing and communication jobs for concurrent programming.

3 Reference Implementation

We provide a reference implementation of the ObjectSpace model. The implementation is MIT and Apache dual-licensed (as is the standard in the Rust community) and could be found on GitHub [19]. This implementation is written in Rust, a language for system programming focusing on safety and concurrency. Rust forbids many possible concurrency errors through its concepts of lifetime and ownership and a very thorough compiler, which limit how variables can be created and passed between functions; as a result, it has a steep initial learning curve. However, the benefits of native performance with high-level features and straightforward concurrent programming prove worth these limitations.

The main data structure of our implementation is a HashMap between the type’s ID and an Entry storing all objects of the corresponding type. Before being
written into the ObjectSpace, objects are serialized into a flattened JSON-like structure and assigned a unique ID. The Entry for each type consists of two data structures: a HashMap between an object’s ID and the object itself; and a reverse indexer which maps each possible value of a field to a list of IDs of the objects whose field has that value.

This structure enables straightforward implementation of all of ObjectSpace’s features, especially conditional operations. A downside of this implementation, though, is that it can only store objects that are JSON-serializable. Moreover, conditional operations can only operate on “leaf fields” of an object: fields whose values are numbers, booleans, or strings; and conditional operations on more than one field at a time are complicated. However, we find that despite these limitations, our implementation is still very robust and flexible, and proves to be suitable for a wide array of concurrent programming jobs.

4 Example: Calculating Primes

An example of ObjectSpace usage could be found in Appendix A. This example calculates and prints out all prime numbers smaller than ten million. The code requires a few changes compared to a program written for a single-threaded setting, but in general is still clear and simple to follow.

This example introduces two common usages of ObjectSpace. The more obvious usage is for data passing: after a worker thread finds a prime number, it writes the number into the ObjectSpace. Then at the end of the program, the master thread reads in all calculated prime numbers from the ObjectSpace and prints them out.

The second usage of ObjectSpace is communication, achieved through the Task object, which represents a range of numbers. In each round of iteration, the master thread writes new Tasks to the ObjectSpace. Each worker thread takes one Task and calculates the prime numbers within the range of the Task, before writing them to the ObjectSpace and using the Task to communicate back to the master that a job has been finished. Notice that the master thread does not need to know which task is taken by which thread; it just needs to know that such a task has been completed (this information, however, could be easily added to a Task by the programmer if necessary). This helps us achieve space decoupling between threads and reduce the complexity of the program.

We also provide a few other examples of ObjectSpace, including a reminder program in the same vein as one introduced in Gelernter’s original paper [10], and a program for drawing the Mandelbrot fractal. All of them could be found in our GitHub repository [19].

4.1 Performance

Here we measure the performance of the aforementioned prime numbers example to test the scalability of the framework. All experiments are done on a
MacBook Pro 15 inch late 2016 running macOS 10.13, with Intel i7 6820HQ, 16 GB of RAM and 512 GB of SSD.

Setup                     Time (s)
Normal single-threaded    24.193
ObjectSpace 1 thread      32.248
ObjectSpace 2 threads     19.507
ObjectSpace 4 threads     13.931
ObjectSpace 8 threads     14.741
ObjectSpace 16 threads    16.084
ObjectSpace 32 threads    16.731

Notice that since this machine has only 4 cores and 8 hardware threads (several of which are already used to run the OS), we expect slowdown as the number of threads exceeds four.

The experiment shows that our implementation of ObjectSpace introduces a non-trivial overhead to the program (with one thread it is about 33% slower than the normal single-threaded version), most likely due to serialization and the synchronization mechanism using the Task structure, which is not necessary for this case. However, it proves to scale well up to the number of available cores in the system: with four threads the run takes 13.931 s, about 1.7 times faster than the single-threaded baseline. We believe that given more optimization in the example program, we could achieve even better scalability.

5 Insights and Future Research

The design of the ObjectSpace API is based on three principles: simplicity, flexibility, and good language integration. The API surface of ObjectSpace is small and consistent, lowering the learning curve while allowing a high degree of flexibility for programmers to express their intention through conditional operations. In practice, we find that ObjectSpace is suitable for a wide array of concurrent work. The integration of the reference implementation with the Rust language makes it very natural to learn and use. We expect any future implementation of the paradigm to integrate similarly closely with its respective language.

Compared to MapReduce, another popular concurrent programming framework [7], ObjectSpace does not force programmers into any particular paradigm, instead merely serving as a facilitator for parallel computing. Its flexible nature means that it could serve as a data store, a data passer, or a communication intermediary. As a result, programmers have more freedom in choosing the best design for their program; yet it also requires more thought and effort on their part to get their model right. However, a MapReduce-like paradigm could be achieved quite easily through ObjectSpace using an interface similar to the Task in our example.

When using ObjectSpace, programmers need to think carefully about the flow of objects passing into the structure to maximize performance. We, however, do not think this is a fault of the paradigm, since parallel programming is too complex to hide perfectly, and thus it is best to require programmers to consider it explicitly [14].

Due to the experimental nature of the framework,
there is still a lot of room for improvement. Most obvious of all is performance enhancement: the proof-of-concept implementation still relies heavily on locks for the sake of ease of implementation, which brings a lot of overhead to the program. Serialization of objects, which in some cases is unnecessary, also contributes to the overall overhead. We hope to improve both in future iterations.

We would also like to investigate additional APIs that could benefit the framework. An example of such an API is the ability to declare conditional operations on multiple fields at once, mirroring such ability in Linda. Extra work still needs to be done to figure out the easiest way to implement such an API.

A big goal of ObjectSpace is to generalize to the distributed setting, similar to Linda. Since in our implementation objects are already serialized before being added to the ObjectSpace, adding distributed capability should be possible. The distributed setting will bring new challenges to ObjectSpace, for example whether objects in the framework should be stored in a distributed or a centralized manner.

6 Acknowledgement

This project could not have been completed without help and support from Professor Duane Bailey and Professor Jeannie Albretch, who have directly supervised this project.

Special thanks to my friend Daishiro Nishida, who has worked with me on the first prototype implemented in C#, which could be found at https://github.com/tmt96/dotSpace-objectSpace.

7 References

[1] G. Agha and C. J. Callsen, “ActorSpace: An open distributed programming paradigm,” vol. 28, no. 7, 1993.
[2] D. F. Bacon, S. L. Graham, and O. J. Sharp, “Compiler transformations for high-performance computing,” ACM Computing Surveys (CSUR), vol. 26, no. 4, pp. 345–420, 1994.
[3] H. Baker and C. Hewitt, “Laws for communicating parallel processes,” 1977.
[4] G. E. Blelloch, “Scans as primitive parallel operations,” IEEE Transactions on Computers, vol. 38, no. 11, pp. 1526–1538, 1989.
[5] F. Chang et al., “Bigtable: A distributed storage system for structured data,” ACM Transactions on Computer Systems (TOCS), vol. 26, no. 2, p. 4, 2008.
[6] D. Culler et al., “LogP: Towards a realistic model of parallel computation,” in ACM SIGPLAN Notices, 1993, vol. 28, pp. 1–12.
[7] J. Dean and S. Ghemawat, “MapReduce: Simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[8] P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec, “The many faces of publish/subscribe,” ACM Computing Surveys (CSUR), vol. 35, no. 2, pp. 114–131, 2003.
[9] A. S. Foundation, “JavaSpaces service specification.” 2016.
[10] D. Gelernter, “Generative communication in Linda,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 7, no. 1, pp. 80–112, 1985.
[11] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 29–43, 2003.
[12] B. Hindman et al., “Mesos: A platform for fine-grained resource sharing in the data center.” in NSDI, 2011, vol. 11, pp. 22–22.
[13] C. A. Hoare, “Communicating sequential processes,” Communications of the ACM, vol. 26, no. 1, pp. 100–106, 1983.
[14] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: Distributed data-parallel programs from sequential building blocks,” in ACM SIGOPS Operating Systems Review, 2007, vol. 41, pp. 59–72.
[15] R. E. Ladner and M. J. Fischer, “Parallel prefix computation,” Journal of the ACM (JACM), vol. 27, no. 4, pp. 831–838, 1980.
[16] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, “Evaluating MapReduce for multi-core and multiprocessor systems,” in High Performance Computer Architecture, 2007. HPCA 2007. IEEE 13th International Symposium on, 2007, pp. 13–24.
[17] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop distributed file system,” in Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, 2010, pp. 1–10.
[18] A. Thusoo et al., “Hive: A warehousing solution over a map-reduce framework,” Proceedings of the VLDB Endowment, vol. 2, no. 2, pp. 1626–1629, 2009.
[19] T. Tran, “Rust object space,” GitHub repository. GitHub, 2018.
[20] Y. Yu et al., “DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language.” in OSDI, 2008, vol. 8, pp. 1–14.
[21] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: Cluster computing with working sets.” HotCloud, vol. 10, nos. 10-10, p. 95, 2010.
A Calculating Prime Numbers with ObjectSpace

extern crate object_space;
extern crate serde;
#[macro_use]
extern crate serde_derive;

use std::thread;
use std::env;
use std::sync::Arc;

use object_space::{ObjectSpace, ValueLookupObjectSpace, RangeLookupObjectSpace, TreeObjectSpace};

fn main() {
    let mut args = env::args();
    let upper_lim = 1000000;
    let thread_count = 4;

    // setup. add 2 & 3 just because we can
    let mut n = 4;
    let space = Arc::new(TreeObjectSpace::new());
    space.write::<i64>(2);
    space.write::<i64>(3);

    // create 4 worker threads
    for _ in 0..thread_count {
        let space_clone = space.clone();
        thread::spawn(move || {
            check_numbers(space_clone);
        });
    }

    // continue until we hit limit
    while n < upper_lim {
        // primes below n are known, so every number below n * n can be checked
        let max = if n * n < upper_lim { n * n } else { upper_lim };

        for i in 0..thread_count {
            // divide work evenly between threads
            let start = n + (((max - n) as f64) / (thread_count as f64) * (i as f64)).round() as i64;
            let end = n + (((max - n) as f64) / (thread_count as f64) * ((i + 1) as f64)).round() as i64;

            let clone = space.clone();
            clone.write(Task {
                finished: false,
                start: start,
                end: end,
            });
        }

        // "joining" threads: block until every task of this round is marked finished
        for _ in 0..thread_count {
            let clone = space.clone();
            clone.take_by_value::<Task>("finished", &true);
        }
        n = max;
    }
}

fn check_numbers(space: Arc<TreeObjectSpace>) {
    loop {
        // block until an unfinished task is available, then claim it
        let task = space.take_by_value::<Task>("finished", &false);
        let max = task.end;
        let min = task.start;
        // only primes whose square is below max can divide a number in the range
        let primes: Vec<i64> = space.read_all::<i64>().filter(|i| i * i < max).collect();
        for i in min..max {
            if primes.iter().all(|prime| i % prime != 0) {
                space.write(i);
            }
        }
        // report completion back to the master thread
        space.write(Task {
            finished: true,
            start: min,
            end: max,
        });
    }
}

#[derive(Serialize, Deserialize)]
struct Task {
    finished: bool,
    start: i64,
    end: i64,
}
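As a supplement to the basic operations of Section 2.1, the following is a minimal sketch of why write, read, and take need a generic type parameter when one space holds objects of many types. It uses only the Rust standard library. MiniSpace and its methods are hypothetical names chosen for illustration; this is not the object_space crate's implementation, it covers only the non-blocking variations, and it assumes that a single lock around the whole store is an acceptable way to make each operation atomic.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical miniature space: one bucket of boxed objects per concrete type.
#[derive(Default)]
pub struct MiniSpace {
    buckets: Mutex<HashMap<TypeId, Vec<Box<dyn Any + Send>>>>,
}

impl MiniSpace {
    pub fn new() -> Self {
        Self::default()
    }

    /// Adds an object of any `'static` type; holding the lock makes the write atomic.
    pub fn write<T: Any + Send>(&self, obj: T) {
        let mut buckets = self.buckets.lock().unwrap();
        buckets
            .entry(TypeId::of::<T>())
            .or_default()
            .push(Box::new(obj));
    }

    /// Non-blocking read: returns a copy of one stored `T` and leaves it in place.
    pub fn try_read<T: Any + Send + Clone>(&self) -> Option<T> {
        let buckets = self.buckets.lock().unwrap();
        buckets
            .get(&TypeId::of::<T>())?
            .last()?
            .downcast_ref::<T>()
            .cloned()
    }

    /// Non-blocking read of every stored `T`, in insertion order.
    pub fn read_all<T: Any + Send + Clone>(&self) -> Vec<T> {
        let buckets = self.buckets.lock().unwrap();
        buckets
            .get(&TypeId::of::<T>())
            .map(|bucket| {
                bucket
                    .iter()
                    .filter_map(|boxed| boxed.downcast_ref::<T>().cloned())
                    .collect()
            })
            .unwrap_or_default()
    }

    /// Non-blocking take: removes one stored `T` and hands ownership to the caller.
    pub fn try_take<T: Any + Send>(&self) -> Option<T> {
        let mut buckets = self.buckets.lock().unwrap();
        let boxed = buckets.get_mut(&TypeId::of::<T>())?.pop()?;
        boxed.downcast::<T>().ok().map(|b| *b)
    }
}
```

The type parameter selects the bucket, so space.try_take::<i64>() can never return a String even though both live in the same space. A blocking read or take could be layered on top by pairing the Mutex with a Condvar that is notified on every write; that part is left out to keep the sketch short.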
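The per-type Entry described in Section 3 (a map from object ID to object plus a reverse index from field values to object IDs) can be sketched in the same spirit. The names Entry, Flat, and find below are illustrative assumptions, and several simplifications are made that the reference implementation may not share: objects are assumed to be already flattened into leaf fields, every leaf value is an integer, and the reverse index is a BTreeMap so that a range query is a single ordered scan.

```rust
use std::collections::{BTreeMap, HashMap};
use std::ops::RangeBounds;

/// A flattened object: leaf field name -> integer value (simplifying assumption).
pub type Flat = HashMap<String, i64>;

/// Hypothetical per-type entry: objects by ID plus one reverse index per field.
#[derive(Default)]
pub struct Entry {
    next_id: u64,
    objects: HashMap<u64, Flat>,
    // field name -> (field value -> IDs of the objects holding that value)
    index: HashMap<String, BTreeMap<i64, Vec<u64>>>,
}

impl Entry {
    /// Stores the object under a fresh ID and indexes every leaf field.
    pub fn write(&mut self, obj: Flat) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        for (field, value) in &obj {
            self.index
                .entry(field.clone())
                .or_default()
                .entry(*value)
                .or_default()
                .push(id);
        }
        self.objects.insert(id, obj);
        id
    }

    /// Returns one object whose `field` lies in `range`, if any.
    pub fn read_by_range<R: RangeBounds<i64>>(&self, field: &str, range: R) -> Option<&Flat> {
        let id = self.find(field, range)?;
        self.objects.get(&id)
    }

    /// An exact match is the one-element range `value..=value`.
    pub fn read_by_value(&self, field: &str, value: i64) -> Option<&Flat> {
        self.read_by_range(field, value..=value)
    }

    /// Like `read_by_value`, but also removes the object and its index entries.
    pub fn take_by_value(&mut self, field: &str, value: i64) -> Option<Flat> {
        let id = self.find(field, value..=value)?;
        let obj = self.objects.remove(&id)?;
        for (f, v) in &obj {
            if let Some(ids) = self.index.get_mut(f).and_then(|by_value| by_value.get_mut(v)) {
                ids.retain(|other| *other != id);
            }
        }
        Some(obj)
    }

    // Looks up the first ID indexed under `field` with a value inside `range`.
    fn find<R: RangeBounds<i64>>(&self, field: &str, range: R) -> Option<u64> {
        self.index
            .get(field)?
            .range(range)
            .flat_map(|(_, ids)| ids.iter().copied())
            .next()
    }
}
```

With this layout a conditional operation never scans the stored objects: it consults the index for the named field and then fetches a single object by ID. Whether a range such as the paper's range(6, 9) includes its endpoints is left to the caller here, who can pass either 6..9 or 6..=9.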