Publications by John Liagouris
2019
FASTER State Management for Timely Dataflow
Proceedings of BIRTE Workshop 2019, Los Angeles, CA, USA, August 2019
We explore the performance and resource trade-offs of two alternative
approaches to streaming state management. When
the state size exceeds the amount of available memory, systems
can either scale out and partition the state across distributed
computing nodes or rely on secondary storage and
divide the state into ‘hot’ and ‘cold’ sets. Scaling out a streaming
computation might introduce coordination overhead
among parallel workers, while flushing state to disk requires
efficient data structures and careful caching policies to minimise
expensive I/O.
To study the characteristics of these state management approaches,
we present an integration of the Timely Dataflow
stream processing engine with the FASTER embedded key-value
store. We demonstrate a prototype that allows users to
transparently maintain arbitrary larger-than-memory state
with low overhead by making only minimal changes to application
code. Our preliminary experimental results show
that managed state incurs acceptable overhead over built-in
in-memory data structures and, in some cases, performs better
when relying on secondary storage in a
@inproceedings{abc, abstract = {We explore the performance and resource trade-offs of two alternative approaches to streaming state management. When the state size exceeds the amount of available memory, systems can either scale out and partition the state across distributed computing nodes or rely on secondary storage and divide the state into {\textquoteleft}hot{\textquoteright} and {\textquoteleft}cold{\textquoteright} sets. Scaling out a streaming computation might introduce coordination overhead among parallel workers, while flushing state to disk requires efficient data structures and careful caching policies to minimise expensive I/O. To study the characteristics of these state management approaches, we present an integration of the Timely Dataflow stream processing engine with the FASTER embedded key-value store. We demonstrate a prototype that allows users to transparently maintain arbitrary larger-than-memory state with low overhead by making only minimal changes to application code. Our preliminary experimental results show that managed state incurs acceptable overhead over built-in in-memory data structures and, in some cases, performs better when relying on secondary storage in a}, author = {Matthew Brookes and Vasiliki Kalavri and John Liagouris}, booktitle = {Proceedings of BIRTE Workshop 2019}, title = {FASTER State Management for Timely Dataflow}, venue = {Los Angeles, CA, USA}, year = {2019} }
2017
Online Reconstruction of Structural Information from Datacenter Logs
Proceedings of the Twelfth European Conference on Computer Systems, EuroSys 2017, Belgrade, Serbia, April 2017
@inproceedings{abc, author = {Zaheer Chothia and John Liagouris and Desislava Dimitrova and Timothy Roscoe}, booktitle = {Proceedings of the Twelfth European Conference on Computer Systems, EuroSys 2017, Belgrade, Serbia}, title = {Online Reconstruction of Structural Information from Datacenter Logs.}, url = {http://doi.acm.org/10.1145/3064176.3064195}, year = {2017} }
2016
Explaining Outputs in Modern Data Analytics
PVLDB, January 2016
@article{abc, author = {Zaheer Chothia and John Liagouris and Frank McSherry and Timothy Roscoe}, journal = {PVLDB}, title = {Explaining Outputs in Modern Data Analytics.}, url = {http://www.vldb.org/pvldb/vol9/p1137-chothia.pdf}, year = {2016} }