Publications by Christopher Ré

The Publications site is currently under construction; as a result, some publications may be missing.

2017

VLDB J., January 2017
Populating a database with unstructured information is a long-standing problem in industry and research that encompasses problems of extraction, cleaning, and integration. Recent names used for this problem include dealing with dark data and knowledge base construction (KBC). In this work, we describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems, and we present techniques to make the KBC process more efficient. We observe that the KBC process is iterative, and we develop techniques to incrementally produce inference results for KBC systems. We propose two methods for incremental inference, based respectively on sampling and variational techniques. We also study the tradeoff space of these methods and develop a simple rule-based optimizer. DeepDive includes all of these contributions, and we evaluate DeepDive on five KBC systems, showing that it can speed up KBC inference tasks by up to two orders of magnitude with negligible impact on quality.
@article{desa2017incremental,
	abstract = {Populating a database with unstructured information is a long-standing problem in industry and research that encompasses problems of extraction, cleaning, and integration. Recent names used for this problem include dealing with dark data and knowledge base construction (KBC). In this work, we describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems, and we present techniques to make the KBC process more efficient. We observe that the KBC process is iterative, and we develop techniques to incrementally produce inference results for KBC systems. We propose two methods for incremental inference, based respectively on sampling and variational techniques. We also study the tradeoff space of these methods and develop a simple rule-based optimizer. DeepDive includes all of these contributions, and we evaluate DeepDive on five KBC systems, showing that it can speed up KBC inference tasks by up to two orders of magnitude with negligible impact on quality.},
	author = {Christopher De Sa and Alexander Ratner and Christopher R{\'e} and Jaeho Shin and Feiran Wang and Sen Wu and Ce Zhang},
	journal = {VLDB J.},
	title = {Incremental knowledge base construction using DeepDive.},
	url = {http://dx.doi.org/10.1007/s00778-016-0437-2},
	year = {2017}
}
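DeepDive grounds a KBC program into a factor graph and performs marginal inference by Gibbs sampling, which is what the sampling-based incremental method above builds on. As a point of reference, here is a minimal Gibbs sweep over binary variables in Python; this is an illustrative sketch, not DeepDive's implementation, and the data layout and names are hypothetical.

import math
import random

def gibbs_sweeps(assignment, factors_of, num_sweeps=100):
    # assignment: dict mapping each binary variable to its current 0/1 value.
    # factors_of: dict mapping each variable to the factors that touch it;
    # each factor takes the full assignment and returns a log-weight.
    for _ in range(num_sweeps):
        for var in assignment:
            log_w = {}
            for value in (0, 1):
                assignment[var] = value
                log_w[value] = sum(f(assignment) for f in factors_of[var])
            # P(var = 1 | rest) from the two unnormalized log-weights,
            # clamped for numerical safety before exponentiating.
            diff = max(min(log_w[0] - log_w[1], 50.0), -50.0)
            p_one = 1.0 / (1.0 + math.exp(diff))
            assignment[var] = 1 if random.random() < p_one else 0
    return assignment

The incremental techniques in the paper avoid re-running such sweeps from scratch on every KBC iteration, either by adapting stored samples or by updating a variational approximation of the distribution, per the two methods the abstract names.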
Commun. ACM, January 2017
The dark data extraction or knowledge base construction (KBC) problem is to populate a SQL database with information from unstructured data sources including emails, webpages, and pdf reports. KBC is a long-standing problem in industry and research that encompasses problems of data extraction, cleaning, and integration. We describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems. The key idea in DeepDive is that statistical inference and machine learning are key tools to attack classical data problems in extraction, cleaning, and integration in a unified and more effective manner. DeepDive programs are declarative in that one cannot write probabilistic inference algorithms; instead, one interacts by defining features or rules about the domain. A key reason for this design choice is to enable domain experts to build their own KBC systems. We present the applications, abstractions, and techniques of DeepDive employed to accelerate construction of KBC systems.
@article{zhang2017deepdive,
	abstract = {The dark data extraction or knowledge base construction (KBC) problem is to populate a SQL database with information from unstructured data sources including emails, webpages, and pdf reports. KBC is a long-standing problem in industry and research that encompasses problems of data extraction, cleaning, and integration. We describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems. The key idea in DeepDive is that statistical inference and machine learning are key tools to attack classical data problems in extraction, cleaning, and integration in a unified and more effective manner. DeepDive programs are declarative in that one cannot write probabilistic inference algorithms; instead, one interacts by defining features or rules about the domain. A key reason for this design choice is to enable domain experts to build their own KBC systems. We present the applications, abstractions, and techniques of DeepDive employed to accelerate construction of KBC systems.},
	author = {Ce Zhang and Christopher R{\'e} and Michael J. Cafarella and Jaeho Shin and Feiran Wang and Sen Wu},
	journal = {Commun. ACM},
	title = {DeepDive: declarative knowledge base construction.},
	url = {http://doi.acm.org/10.1145/3060586},
	year = {2017}
}
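To make "defining features or rules" concrete: a DeepDive developer typically supplies small feature-extraction UDFs and declarative rules, never inference code. The sketch below mimics the style of DeepDive's stream-oriented extractors, which read candidate tuples on stdin and emit feature tuples on stdout; the column layout and the feature name here are hypothetical.

import sys

# Hypothetical input: one tab-separated candidate pair per line, e.g.
#   sentence_id <TAB> person1 <TAB> person2 <TAB> words_between
for line in sys.stdin:
    sentence_id, p1, p2, words_between = line.rstrip("\n").split("\t")
    # Emit a feature string for the candidate; during training the
    # system learns one weight per distinct feature.
    if "married" in words_between.lower():
        print("\t".join([sentence_id, p1, p2, "WORDS_CONTAIN_married"]))

This division of labor is the design choice the abstract describes: the domain expert contributes signals, and probabilistic inference stays entirely inside the system.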

2016

54th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2016, Monticello, IL, USA, September 2016
@inproceedings{mitliagkas2016asynchrony,
	author = {Ioannis Mitliagkas and Ce Zhang and Stefan Hadjis and Christopher R{\'e}},
	booktitle = {54th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2016, Monticello, IL, USA},
	title = {Asynchrony begets momentum, with an application to deep learning.},
	url = {http://dx.doi.org/10.1109/ALLERTON.2016.7852343},
	year = {2016}
}
Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 2016
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and so on - were instead in a relational database, it would give analysts a massive and valuable new set of "big data." DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference.
@inproceedings{zhang2016extracting,
	abstract = {DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and so on - were instead in a relational database, it would give analysts a massive and valuable new set of "big data." DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference.},
	author = {Ce Zhang and Jaeho Shin and Christopher R{\'e} and Michael J. Cafarella and Feng Niu},
	booktitle = {Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016},
	title = {Extracting Databases from Dark Data with DeepDive.},
	url = {http://doi.acm.org/10.1145/2882903.2904442},
	address = {San Francisco, CA, USA},
	year = {2016}
}
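For reference, the precision and recall that the abstract reports are the standard quantities over extracted tuples, with TP counting correct extractions, FP spurious ones, and FN missed ones:

\[ \mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \frac{TP}{TP + FN} \]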
Commun. ACM, January 2016
@article{abadi2016beckman,
	author = {Daniel J. Abadi and Rakesh Agrawal and Anastasia Ailamaki and Magdalena Balazinska and Philip A. Bernstein and Michael J. Carey and Surajit Chaudhuri and Jeffrey Dean and AnHai Doan and Michael J. Franklin and Johannes Gehrke and Laura M. Haas and Alon Y. Halevy and Joseph M. Hellerstein and Yannis E. Ioannidis and H. V. Jagadish and Donald Kossmann and Samuel Madden and Sharad Mehrotra and Tova Milo and Jeffrey F. Naughton and Raghu Ramakrishnan and Volker Markl and Christopher Olston and Beng Chin Ooi and Christopher R{\'e} and Dan Suciu and Michael Stonebraker and Todd Walter and Jennifer Widom},
	journal = {Commun. ACM},
	title = {The Beckman report on database research.},
	url = {http://doi.acm.org/10.1145/2845915},
	year = {2016}
}
ACM Trans. Database Syst., January 2016
@article{zhang2016materialization,
	author = {Ce Zhang and Arun Kumar and Christopher R{\'e}},
	journal = {ACM Trans. Database Syst.},
	title = {Materialization Optimizations for Feature Selection Workloads.},
	url = {http://doi.acm.org/10.1145/2877204},
	year = {2016}
}
CoRR, January 2016
@article{mitliagkas2016asynchronyarxiv,
	author = {Ioannis Mitliagkas and Ce Zhang and Stefan Hadjis and Christopher R{\'e}},
	journal = {CoRR},
	title = {Asynchrony begets Momentum, with an Application to Deep Learning.},
	url = {http://arxiv.org/abs/1605.09774},
	year = {2016}
}
CoRR, January 2016
@article{pan2016cyclades,
	author = {Xinghao Pan and Maximilian Lam and Stephen Tu and Dimitris S. Papailiopoulos and Ce Zhang and Michael I. Jordan and Kannan Ramchandran and Christopher R{\'e} and Benjamin Recht},
	journal = {CoRR},
	title = {CYCLADES: Conflict-free Asynchronous Machine Learning.},
	url = {http://arxiv.org/abs/1605.09721},
	year = {2016}
}
SIGMOD Record, January 2016
The dark data extraction or knowledge base construction (KBC) problem is to populate a SQL database with information from unstructured data sources including emails, webpages, and pdf reports. KBC is a long-standing problem in industry and research that encompasses problems of data extraction, cleaning, and integration. We describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems. The key idea in DeepDive is that statistical inference and machine learning are key tools to attack classical data problems in extraction, cleaning, and integration in a unified and more effective manner. DeepDive programs are declarative in that one cannot write probabilistic inference algorithms; instead, one interacts by defining features or rules about the domain. A key reason for this design choice is to enable domain experts to build their own KBC systems. We present the applications, abstractions, and techniques of DeepDive employed to accelerate construction of KBC systems.
@article{desa2016deepdive,
	abstract = {The dark data extraction or knowledge base construction (KBC) problem is to populate a SQL database with information from unstructured data sources including emails, webpages, and pdf reports. KBC is a long-standing problem in industry and research that encompasses problems of data extraction, cleaning, and integration. We describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems. The key idea in DeepDive is that statistical inference and machine learning are key tools to attack classical data problems in extraction, cleaning, and integration in a unified and more effective manner. DeepDive programs are declarative in that one cannot write probabilistic inference algorithms; instead, one interacts by defining features or rules about the domain. A key reason for this design choice is to enable domain experts to build their own KBC systems. We present the applications, abstractions, and techniques of DeepDive employed to accelerate construction of KBC systems.},
	author = {Christopher De Sa and Alexander Ratner and Christopher R{\'e} and Jaeho Shin and Feiran Wang and Sen Wu and Ce Zhang},
	journal = {SIGMOD Record},
	title = {DeepDive: Declarative Knowledge Base Construction.},
	url = {http://doi.acm.org/10.1145/2949741.2949756},
	year = {2016}
}
CoRR, January 2016
@article{hadjis2016omnivore,
	author = {Stefan Hadjis and Ce Zhang and Ioannis Mitliagkas and Christopher R{\'e}},
	journal = {CoRR},
	title = {Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs.},
	url = {http://arxiv.org/abs/1606.04487},
	year = {2016}
}
ETH Zürich, January 2016
@techreport{mitliagkas2016asynchronytr,
	author = {Ioannis Mitliagkas and Ce Zhang and Stefan Hadjis and Christopher R{\'e}},
	institution = {ETH Z{\"u}rich},
	title = {Asynchrony begets Momentum, with an Application to Deep Learning},
	year = {2016}
}
Bioinformatics, January 2016
@article{mallory2016largescale,
	author = {Emily K. Mallory and Ce Zhang and Christopher R{\'e} and Russ B. Altman},
	journal = {Bioinformatics},
	title = {Large-scale extraction of gene interactions from full-text literature using DeepDive.},
	url = {http://dx.doi.org/10.1093/bioinformatics/btv476},
	year = {2016}
}

2015

Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, Montreal, Quebec, Canada, December 2015
@inproceedings{desa2015rapidly,
	author = {Christopher De Sa and Ce Zhang and Kunle Olukotun and Christopher R{\'e}},
	booktitle = {Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015},
	title = {Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.},
	url = {http://papers.nips.cc/paper/5757-rapidly-mixing-gibbs-sampling-for-a-class-of-factor-graphs-using-hierarchy-width},
	address = {Montreal, Quebec, Canada},
	year = {2015}
}
Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, Montreal, Quebec, Canada, December 2015
@inproceedings{desa2015taming,
	author = {Christopher De Sa and Ce Zhang and Kunle Olukotun and Christopher R{\'e}},
	booktitle = {Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015},
	title = {Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms.},
	url = {http://papers.nips.cc/paper/5717-taming-the-wild-a-unified-analysis-of-hogwild-style-algorithms},
	address = {Montreal, Quebec, Canada},
	year = {2015}
}
Proceedings of the Fourth Workshop on Data analytics in the Cloud, DanaC 2015, Melbourne, VIC, Australia, May 2015
@inproceedings{hadjis2015caffe,
	author = {Stefan Hadjis and Firas Abuzaid and Ce Zhang and Christopher R{\'e}},
	booktitle = {Proceedings of the Fourth Workshop on Data analytics in the Cloud, DanaC 2015, Melbourne, VIC, Australia},
	title = {Caffe con Troll: Shallow Ideas to Speed Up Deep Learning.},
	url = {http://doi.acm.org/10.1145/2799562.2799641},
	year = {2015}
}
CoRR, January 2015
@article{abuzaid2015caffe,
	author = {Firas Abuzaid and Stefan Hadjis and Ce Zhang and Christopher R{\'e}},
	journal = {CoRR},
	title = {Caffe con Troll: Shallow Ideas to Speed Up Deep Learning.},
	url = {http://arxiv.org/abs/1504.04343},
	year = {2015}
}
CoRR, January 2015
@article{desa2015tamingarxiv,
	author = {Christopher De Sa and Ce Zhang and Kunle Olukotun and Christopher R{\'e}},
	journal = {CoRR},
	title = {Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms.},
	url = {http://arxiv.org/abs/1506.06438},
	year = {2015}
}
PVLDB, January 2015
@article{shin2015incremental,
	author = {Jaeho Shin and Sen Wu and Feiran Wang and Christopher De Sa and Ce Zhang and Christopher R{\'e}},
	journal = {PVLDB},
	title = {Incremental Knowledge Base Construction Using DeepDive.},
	url = {http://www.vldb.org/pvldb/vol8/p1310-shin.pdf},
	year = {2015}
}
CoRR, January 2015
@article{zhu2015building,
	author = {Yuke Zhu and Ce Zhang and Christopher R{\'e} and Li Fei-Fei},
	journal = {CoRR},
	title = {Building a Large-scale Multimodal Knowledge Base for Visual Question Answering.},
	url = {http://arxiv.org/abs/1507.05670},
	year = {2015}
}
CoRR, January 2015
@article{desa2015rapidlyarxiv,
	author = {Christopher De Sa and Ce Zhang and Kunle Olukotun and Christopher R{\'e}},
	journal = {CoRR},
	title = {Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.},
	url = {http://arxiv.org/abs/1510.00756},
	year = {2015}
}
CoRR, January 2015
@article{wu2015incremental,
	author = {Sen Wu and Ce Zhang and Feiran Wang and Christopher R{\'e}},
	journal = {CoRR},
	title = {Incremental Knowledge Base Construction Using DeepDive.},
	url = {http://arxiv.org/abs/1502.00731},
	year = {2015}
}

2014

Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, Montreal, Quebec, Canada, December 2014
@inproceedings{zhou2014parallel,
	author = {Yingbo Zhou and Utkarsh Porwal and Ce Zhang and Hung Q. Ngo and Long Nguyen and Christopher R{\'e} and Venu Govindaraju},
	booktitle = {Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014},
	title = {Parallel Feature Selection Inspired by Group Testing.},
	url = {http://papers.nips.cc/paper/5296-parallel-feature-selection-inspired-by-group-testing},
	address = {Montreal, Quebec, Canada},
	year = {2014}
}
Proceedings of the 2nd International Workshop on In Memory Data Management and Analytics, IMDM 2014, Hangzhou, China, September 2014
@inproceedings{bittorf2014tradeoffs,
	author = {Victor Bittorf and Marcel Kornacker and Christopher R{\'e} and Ce Zhang},
	booktitle = {Proceedings of the 2nd International Workshop on In Memory Data Management and Analytics, IMDM 2014, Hangzhou, China},
	title = {Tradeoffs in Main-Memory Statistical Analytics from Impala to DimmWitted.},
	url = {http://www-db.in.tum.de/hosted/imdm2014/papers/bittorf.pdf},
	year = {2014}
}
International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 2014
@inproceedings{zhang2014materialization,
	author = {Ce Zhang and Arun Kumar and Christopher R{\'e}},
	booktitle = {International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA},
	title = {Materialization optimizations for feature selection workloads.},
	url = {http://doi.acm.org/10.1145/2588555.2593678},
	year = {2014}
}
CoRR, January 2014
@article{zhang2014dimmwittedarxiv,
	author = {Ce Zhang and Christopher R{\'e}},
	journal = {CoRR},
	title = {DimmWitted: A Study of Main-Memory Statistical Analytics.},
	url = {http://arxiv.org/abs/1403.7550},
	year = {2014}
}
CoRR, January 2014
@article{peters2014machine,
	author = {Shanan Peters and Ce Zhang and Miron Livny and Christopher R{\'e}},
	journal = {CoRR},
	title = {A machine-compiled macroevolutionary history of Phanerozoic life.},
	url = {http://arxiv.org/abs/1406.2963},
	year = {2014}
}
CoRR, January 2014
@article{zhang2014feature,
	author = {Ce Zhang and Christopher R{\'e} and Amir Abbas Sadeghian and Zifei Shan and Jaeho Shin and Feiran Wang and Sen Wu},
	journal = {CoRR},
	title = {Feature Engineering for Knowledge Base Construction.},
	url = {http://arxiv.org/abs/1407.6439},
	year = {2014}
}
PVLDB, January 2014
@article{zhang2014dimmwitted,
	author = {Ce Zhang and Christopher R{\'e}},
	journal = {PVLDB},
	title = {DimmWitted: A Study of Main-Memory Statistical Analytics.},
	url = {http://www.vldb.org/pvldb/vol7/p1283-zhang.pdf},
	year = {2014}
}
IEEE Data Eng. Bull., January 2014
@article{re2014feature,
	author = {Christopher R{\'e} and Amir Abbas Sadeghian and Zifei Shan and Jaeho Shin and Feiran Wang and Sen Wu and Ce Zhang},
	journal = {IEEE Data Eng. Bull.},
	title = {Feature Engineering for Knowledge Base Construction.},
	url = {http://sites.computer.org/debull/A14sept/p26.pdf},
	year = {2014}
}

2013

Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, Nevada, United States, December 2013
@inproceedings{sridhar2013approximate,
	author = {Srikrishna Sridhar and Stephen J. Wright and Christopher R{\'e} and Ji Liu and Victor Bittorf and Ce Zhang},
	booktitle = {Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States.},
	title = {An Approximate, Efficient LP Solver for LP Rounding.},
	url = {http://papers.nips.cc/paper/4990-an-approximate-efficient-lp-solver-for-lp-rounding},
	address = {Lake Tahoe, Nevada, United States},
	year = {2013}
}
Proceedings of The Twenty-Second Text REtrieval Conference, TREC 2013, Gaithersburg, Maryland, USA, November 2013
@inproceedings{frank2013evaluating,
	author = {John R. Frank and Steven J. Bauer and Max Kleiman-Weiner and Daniel A. Roberts and Nilesh Tripuraneni and Ce Zhang and Christopher R{\'e} and Ellen M. Voorhees and Ian Soboroff},
	booktitle = {Proceedings of The Twenty-Second Text REtrieval Conference, TREC 2013, Gaithersburg, Maryland, USA},
	title = {Evaluating Stream Filtering for Entity Profile Updates for TREC 2013.},
	url = {http://trec.nist.gov/pubs/trec22/papers/KBA.OVERVIEW.pdf},
	year = {2013}
}
Proceedings of The Twenty-Second Text REtrieval Conference, TREC 2013, Gaithersburg, Maryland, USA, November 2013
@inproceedings{khot2013bootstrapping,
	author = {Tushar Khot and Ce Zhang and Jude W. Shavlik and Sriraam Natarajan and Christopher R{\'e}},
	booktitle = {Proceedings of The Twenty-Second Text REtrieval Conference, TREC 2013, Gaithersburg, Maryland, USA},
	title = {Bootstrapping Knowledge Base Acceleration.},
	url = {http://trec.nist.gov/pubs/trec22/papers/wisc-kba.pdf},
	year = {2013}
}
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013, Sofia, Bulgaria, Volume 2: Short Papers, August 2013
@inproceedings{govindaraju2013understanding,
	author = {Vidhya Govindaraju and Ce Zhang and Christopher R{\'e}},
	booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013, Volume 2: Short Papers},
	title = {Understanding Tables in Context Using Standard NLP Toolkits.},
	url = {http://aclweb.org/anthology/P/P13/P13-2116.pdf},
	address = {Sofia, Bulgaria},
	year = {2013}
}
Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 2013
@inproceedings{abc,
	author = {Ce Zhang and Christopher R{\'e}},
	booktitle = {Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA},
	title = {Towards high-throughput {Gibbs} sampling at scale: a study across storage managers.},
	url = {http://doi.acm.org/10.1145/2463676.2463702},
	year = {2013}
}
Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 2013
@inproceedings{abc,
	author = {Ce Zhang and Vidhya Govindaraju and Jackson Borchardt and Tim Foltz and Christopher R{\'e} and Shanan Peters},
	booktitle = {Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA},
	title = {GeoDeepDive: statistical inference using familiar data-processing languages.},
	url = {http://doi.acm.org/10.1145/2463676.2463680},
	year = {2013}
}
CIDR 2013, Sixth Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 2013
@inproceedings{anderson2013brainwash,
	author = {Michael Anderson and Dolan Antenucci and Victor Bittorf and Matthew Burgess and Michael J. Cafarella and Arun Kumar and Feng Niu and Yongjoo Park and Christopher R{\'e} and Ce Zhang},
	booktitle = {CIDR 2013, Sixth Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA},
	title = {Brainwash: A Data System for Feature Engineering.},
	url = {http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper82.pdf},
	year = {2013}
}
CoRR, January 2013
Many problems in machine learning can be solved by rounding the solution of an appropriate linear program (LP). This paper shows that we can recover solutions of comparable quality by rounding an approximate LP solution instead of the exact one. These approximate LP solutions can be computed efficiently by applying a parallel stochastic-coordinate-descent method to a quadratic-penalty formulation of the LP. We derive worst-case runtime and solution quality guarantees of this scheme using novel perturbation and convergence analysis. Our experiments demonstrate that on such combinatorial problems as vertex cover, independent set and multiway-cut, our approximate rounding scheme is up to an order of magnitude faster than Cplex (a commercial LP solver) while producing solutions of similar quality.
@article{sridhar2013approximatearxiv,
	abstract = {Many problems in machine learning can be solved by rounding the solution of an appropriate linear program (LP). This paper shows that we can recover solutions of comparable quality by rounding an approximate LP solution instead of the exact one. These approximate LP solutions can be computed efficiently by applying a parallel stochastic-coordinate-descent method to a quadratic-penalty formulation of the LP. We derive worst-case runtime and solution quality guarantees of this scheme using novel perturbation and convergence analysis. Our experiments demonstrate that on such combinatorial problems as vertex cover, independent set and multiway-cut, our approximate rounding scheme is up to an order of magnitude faster than Cplex (a commercial LP solver) while producing solutions of similar quality.},
	author = {Srikrishna Sridhar and Victor Bittorf and Ji Liu and Ce Zhang and Christopher R{\'e} and Stephen J. Wright},
	journal = {CoRR},
	title = {An Approximate, Efficient Solver for LP Rounding.},
	url = {http://arxiv.org/abs/1311.2661},
	year = {2013}
}
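For concreteness, a standard quadratic-penalty treatment of an LP with constraints $Ax \le b$ over the box $[0,1]^n$ replaces the hard constraints with a smooth penalty term; this is a sketch of the general technique, and the paper's exact formulation and guarantees may differ:

\[ \min_{x \in [0,1]^n} \; c^\top x + \frac{\beta}{2} \sum_i \max\bigl(a_i^\top x - b_i,\, 0\bigr)^2 \]

Stochastic coordinate descent minimizes this objective one coordinate $x_j$ at a time; since an update touches only the constraints in which $x_j$ appears, updates on sparse combinatorial LPs such as vertex cover or multiway cut rarely conflict and parallelize well, consistent with the speedups the abstract reports.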

2012

12th IEEE International Conference on Data Mining, ICDM 2012, Brussels, Belgium, December 2012
@inproceedings{niu2012scaling,
	author = {Feng Niu and Ce Zhang and Christopher R{\'e} and Jude W. Shavlik},
	booktitle = {12th IEEE International Conference on Data Mining, ICDM 2012, Brussels, Belgium},
	title = {Scaling Inference for Markov Logic via Dual Decomposition.},
	url = {http://dx.doi.org/10.1109/ICDM.2012.96},
	year = {2012}
}
Proceedings of The Twenty-First Text REtrieval Conference, TREC 2012, Gaithersburg, Maryland, USA, November 2012
@inproceedings{frank2012building,
	author = {John R. Frank and Max Kleiman-Weiner and Daniel A. Roberts and Feng Niu and Ce Zhang and Christopher R{\'e} and Ian Soboroff},
	booktitle = {Proceedings of The Twenty-First Text REtrieval Conference, TREC 2012, Gaithersburg, Maryland, USA},
	title = {Building an Entity-Centric Stream Filtering Test Collection for TREC 2012.},
	url = {http://trec.nist.gov/pubs/trec21/papers/KBA.OVERVIEW.pdf},
	year = {2012}
}
Proceedings of the Second International Workshop on Searching and Integrating New Web Data Sources, Istanbul, Turkey, August 2012
@inproceedings{niu2012deepdive,
	author = {Feng Niu and Ce Zhang and Christopher R{\'e} and Jude W. Shavlik},
	booktitle = {Proceedings of the Second International Workshop on Searching and Integrating New Web Data Sources, Istanbul, Turkey},
	title = {DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference.},
	url = {http://ceur-ws.org/Vol-884/VLDS2012_p25_Niu.pdf},
	year = {2012}
}
The 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, Jeju Island, Korea - Volume 1: Long Papers, July 2012
@inproceedings{zhang2012bigdata,
	author = {Ce Zhang and Feng Niu and Christopher R{\'e} and Jude W. Shavlik},
	booktitle = {The 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, Volume 1: Long Papers},
	title = {Big Data versus the Crowd: Looking for Relationships in All the Right Places.},
	url = {http://www.aclweb.org/anthology/P12-1087},
	address = {Jeju Island, Korea},
	year = {2012}
}
Int. J. Semantic Web Inf. Syst., January 2012
@article{niu2012elementary,
	author = {Feng Niu and Ce Zhang and Christopher R{\'e} and Jude W. Shavlik},
	journal = {Int. J. Semantic Web Inf. Syst.},
	title = {Elementary: Large-Scale Knowledge-Base Construction via Machine Learning and Statistical Inference.},
	url = {http://dx.doi.org/10.4018/jswis.2012070103},
	year = {2012}
}

2011

CoRR, January 2011
@article{niu2011felix,
	author = {Feng Niu and Ce Zhang and Christopher R{\'e} and Jude W. Shavlik},
	journal = {CoRR},
	title = {Felix: Scaling Inference for Markov Logic with an Operator-based Approach},
	url = {http://arxiv.org/abs/1108.0294},
	year = {2011}
}