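The DaCapo benchmark suite is widely used in software analysis research, especially dynamic analysis, but I had never really understood it, so starting today I want to study it in depth. Elsewhere I quoted a blog post that mainly covers using TamiFlex and Soot to run static analysis on the DaCapo suite, but I still did not know much about DaCapo itself.
The following papers all use the DaCapo suite: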
E. Bodden, "Efficient hybrid typestate analysis by determining continuation-equivalent states," in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, Cape Town, South Africa, 2010, pp. 5-14.
M. Gabel and Z. Su, "Online inference and enforcement of temporal properties," in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, Cape Town, South Africa, 2010, pp. 15-24.
M. Pradel and T. R. Gross, "Detecting anomalies in the order of equally-typed method arguments," in Proceedings of the 2011 International Symposium on Software Testing and Analysis, Toronto, Ontario, Canada, 2011, pp. 232-242.
The DaCapo suite was first published and presented at OOPSLA '06:
S. M. Blackburn, R. Garner, C. Hoffmann, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann, "The DaCapo benchmarks: Java benchmarking development and analysis," in Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications, Portland, Oregon, USA, 2006, pp. 169-190.
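This afternoon I took some time to skim the paper above. In the Introduction, the authors explain that their main contribution is a general-purpose, freely available Java benchmark suite drawn from real-world applications. The paper also recommends methods for selecting and comparing benchmarks, for example using time series and using PCA (principal component analysis) to evaluate how the benchmarks differ from one another.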
DaCapo home page:
DaCapo download page: At the time of writing, the latest version is 9.12, released at the end of 2009. This release contains 14 benchmarks:
avrora: simulates a number of programs run on a grid of AVR microcontrollers
batik: produces a number of Scalable Vector Graphics (SVG) images based on the unit tests in Apache Batik
eclipse: executes some of the (non-gui) jdt performance tests for the Eclipse IDE
fop: takes an XSL-FO file, parses it and formats it, generating a PDF file
h2: executes a JDBCbench-like in-memory benchmark, executing a number of transactions against a model of a banking application, replacing the hsqldb benchmark
jython: interprets the pybench Python benchmark
luindex: uses Lucene to index a set of documents: the works of Shakespeare and the King James Bible
lusearch: uses Lucene to do a text search of keywords over a corpus of data comprising the works of Shakespeare and the King James Bible
pmd: analyzes a set of Java classes for a range of source code problems
sunflow: renders a set of images using ray tracing
tomcat: runs a set of queries against a Tomcat server, retrieving and verifying the resulting webpages
tradebeans: runs the DayTrader benchmark via Java Beans to a Geronimo backend with an in-memory h2 as the underlying database
tradesoap: runs the DayTrader benchmark via SOAP to a Geronimo backend with an in-memory h2 as the underlying database
xalan: transforms XML documents into HTML

However, the 14 benchmarks actually correspond to only 12 software projects:
DayTrader is split into two benchmarks, tradebeans and tradesoap, and Lucene is split into luindex and lusearch.
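To double-check the 14-versus-12 count, here is a minimal shell sketch that lists each benchmark next to its underlying project and counts the distinct projects. The project names in the second column are my own labels based on the descriptions above, and the function names are made up for illustration; nothing here comes from the DaCapo distribution.

```shell
#!/bin/sh
# Benchmark name and underlying project, one pair per line.
# The project labels are illustrative, derived from the descriptions in this post.
benchmark_projects() {
  cat <<'EOF'
avrora Avrora
batik Batik
eclipse Eclipse
fop FOP
h2 H2
jython Jython
luindex Lucene
lusearch Lucene
pmd PMD
sunflow Sunflow
tomcat Tomcat
tradebeans DayTrader
tradesoap DayTrader
xalan Xalan
EOF
}

# Number of benchmarks: one per line of the table.
count_benchmarks() {
  benchmark_projects | wc -l | tr -d ' '
}

# Number of distinct projects: unique values in the second column.
count_projects() {
  benchmark_projects | awk '{print $2}' | sort -u | wc -l | tr -d ' '
}
```

Calling count_benchmarks and count_projects after sourcing the file prints the two totals, which should agree with the 14 benchmarks and 12 projects mentioned above.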
However, I was not sure how to actually use the DaCapo suite, and the official site is quite terse about it. It only says that the command:
java -jar dacapo-9.12-bach.jar
prints the help text. To run a single benchmark, for example Avrora, enter the following at the command prompt:
java -jar dacapo-9.12-bach.jar avrora
The authors note that every benchmark comes with three input sizes: “We have provided 3 inputs (small, default, large). We highly recommend using small for testing, and either reporting default or large in any performance analysis. For some benchmarks default and large workloads are identical.”
For example, to run Avrora with the small input, enter:
java -jar dacapo-9.12-bach.jar avrora -s small
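I ran this command on a fairly powerful server (8 GB RAM), where it took 1625 ms, and on an Ubuntu virtual machine, where it took 9231 ms. On both the Windows Server machine and the Ubuntu VM, the run created a directory named “scratch” in the working directory. Its contents are fairly complex; I expect to figure them out later.
That is all I studied for today.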
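To compare the three input sizes without copying the times by hand, something like the following sketch could help. It rests on assumptions: that the harness ends a successful run with a summary line of the form "===== DaCapo 9.12 avrora PASSED in 1625 msec =====" (as I remember it from my runs, not from documentation), and that the jar sits in the current directory. DACAPO_JAR and the function names are hypothetical helpers of mine.

```shell
#!/bin/sh
# Run one benchmark with one input size.
# $1 = benchmark name, $2 = input size (small, default or large).
# DACAPO_JAR is a made-up variable; it defaults to the jar name used in this post.
# stderr is merged into stdout in case the harness prints its summary there.
run_dacapo() {
  java -jar "${DACAPO_JAR:-dacapo-9.12-bach.jar}" "$1" -s "$2" 2>&1
}

# Read harness output on stdin and print only the elapsed milliseconds.
# Assumes the summary line format described above; other lines are ignored.
dacapo_msec() {
  sed -n 's/^===== DaCapo .* PASSED in \([0-9][0-9]*\) msec =====$/\1/p'
}

# Print "size msec" for each of the three input sizes of benchmark $1.
time_sizes() {
  for size in small default large; do
    printf '%s %s\n' "$size" "$(run_dacapo "$1" "$size" | dacapo_msec)"
  done
}
```

After sourcing the file, time_sizes avrora would run Avrora three times and print one line per input size. If the summary line differs on another DaCapo release, only the pattern in dacapo_msec needs adjusting.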