EMS+QMS 2010
The Workshop on
Evaluation Methods for Solvers and
Quality Metrics for Solutions
Affiliated with
IJCAR 2010, the 5th International Joint Conference on Automated Reasoning
and
CAV 2010, the 22nd International Conference on Computer Aided Verification
July 20, 2010
Edinburgh, UK
Background
Logic solvers, such as first-order ATP systems, SMT solvers, SAT solvers, and higher-order proof assistants, are valuable tools in a growing range of applications, including mathematics, software and hardware verification, and knowledge-based reasoning. A key to progress in the development of high-performance logic solvers is their evaluation. Meaningful evaluation depends on several components: appropriate benchmark problems, execution in the context of comparable resources, careful analysis of the output data, and so on. Most solver communities support each of these in various ways.
A survey of the evaluation approaches in the various solver communities shows a large emphasis on the ability to solve problems, a moderate emphasis on the resources required to find those solutions, but only a small emphasis on the quality of the solutions. Users, however, have a strong interest in solution quality, starting most formally with the correctness of the solutions output, and extending to notions such as compactness and readability.
Aims and Scope
The EMSQMS workshop is concerned with the aspects of logic solver evaluation described above: methods by which logic solvers can be meaningfully evaluated, with a separate focus on metrics by which solvers' solutions can be evaluated. The workshop will collect ideas and input on these topics, in order to provide specific advice and resources that help logic solver developers and users obtain meaningful evaluations for research and applications.
Topics include, but are not limited to:
- Evaluation methods for logic solvers
  - benchmarking frameworks
  - solver execution services
  - solver competitions
  - techniques for checking solver soundness, including solver debugging techniques
- Quality and comparison metrics for solvers' solutions (proofs, models, or other artifacts)
  - correctness
  - conciseness
  - readability
  - portability
  - novel metrics
Submissions
Full paper submissions should be at most 12 pages in EasyChair LaTeX format (available here), although shorter papers are welcome. Position statements about "solution quality", of up to 5 pages, are also invited. Submission is via the EasyChair system.
Important Dates
Submission deadline: April 12, 2010
Notifications: April 28, 2010
Final version: May 17, 2010
Workshop date: July 20, 2010
Invited Speakers and Panelists
- Joe Hurd, Galois, Inc. (invited speaker and panelist)
- Daniel Le Berre, Université d'Artois (panelist)
- Ian Horrocks, University of Oxford (panelist)
- Viktor Kuncak, EPFL (panelist)
- Leonardo de Moura, Microsoft Research (panelist)
Organizers
- Deborah McGuinness, Rensselaer Polytechnic Institute
- Aaron Stump, The University of Iowa
- Geoff Sutcliffe, University of Miami
- Cesare Tinelli, The University of Iowa
Program Committee
- Christoph Benzmueller, Articulate Software
- Armin Biere, Johannes Kepler University
- Li Ding, Rensselaer Polytechnic Institute
- Daniel Le Berre, Université d'Artois
- Deborah McGuinness, Rensselaer Polytechnic Institute
- Adam Pease, Articulate Software
- Paulo Pinheiro da Silva, University of Texas at El Paso
- Albert Rubio, Universitat Politècnica de Catalunya
- Ulrike Sattler, The University of Manchester
- Torsten Schaub, University of Potsdam
- Aaron Stump, The University of Iowa
- Geoff Sutcliffe, University of Miami
- Armando Tacchella, Università di Genova
- Cesare Tinelli, The University of Iowa
Program
In addition to talks on original and previously presented research, the program will include a panel on the main topics of the workshop, with an emphasis on solver competitions and solver community infrastructure, followed by a facilitated discussion on quality metrics for solver solutions.
09:00-10:00 | Joe Hurd. Evaluation Opportunities in Mechanized Theories (invited talk)
10:00-10:30 | Coffee break
10:30-11:00 | Robert Brummayer, Aaron Stump and Duckki Oe. Exploring Predictability of SAT/SMT Solvers
11:00-11:30 | Claudia Peschiera, Luca Pulina and Armando Tacchella. Designing a solver competition: the QBFEVAL'10 case study
11:30-12:00 | Mladen Nikolic. Statistical Methodology for Comparison of SAT Solvers (presentation only)
12:00-12:30 | Christoph Benzmueller and Marvin Schiller. Adaptive Assertion-Level Proofs
12:30-14:00 | Lunch break
14:00-14:30 | Morgan Deters. The SMT-Exec Execution Service: Features, Fairness, and the Future (presentation only)
14:30-15:00 | Geoff Sutcliffe, Cynthia Chang, Li Ding, Deborah McGuinness and Paulo Pinheiro da Silva. Different Proofs are Good Proofs (position statement)
15:00-15:30 | Coffee break
15:30-16:15 | EMS panel on solver competitions and community infrastructure (shared with SVARM)
16:15-17:00 | QMS facilitated discussion on quality metrics
Support
The workshop is partially supported by a planning grant from the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed here are those of the organizers and do not necessarily reflect the views of the funding agency.