This page explains how to submit the planners to us, and how the planners will be invoked during the competition. Please pay close attention to this: due to the large number of participants, we will require that the rules on this page are strictly followed.
We remind you that the planner submission deadline is January 15th, 2014. We follow the ICAPS rules regarding time zones: your submission is on time as long as there is some time zone in which it is still January 15th. In other words, the deadline is 11:59 PM UTC-12 (Howland Island local time).
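As a quick sanity check, the deadline can be converted to UTC with a few lines of Python. This is only an illustrative sketch; the variable names are ours and are not part of any competition tooling.

```python
from datetime import datetime, timedelta, timezone

# The last time zone in which it is still January 15th is UTC-12.
UTC_MINUS_12 = timezone(timedelta(hours=-12))

# Deadline as stated above: January 15th 2014, 11:59 PM UTC-12.
deadline = datetime(2014, 1, 15, 23, 59, tzinfo=UTC_MINUS_12)

# The same instant expressed in UTC.
deadline_utc = deadline.astimezone(timezone.utc)
print(deadline_utc.isoformat())  # 2014-01-16T11:59:00+00:00
```

In other words, the deadline corresponds to 11:59 AM UTC on January 16th.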
How to submit your planner
In the first week of December, participants received an account on the Development Environment System (DES). It must be used to test that planners compile correctly and to provide the source code. Submission is done through the DES. Your home directory should contain the following:
- Your home directory must include a filled PDDL support questionnaire, which describes the level of PDDL support of your planner. If there are additional limitations not captured by the questionnaire, please indicate them, too.
- Your home directory must include the full source code of your planner, to be published on this web site after the competition.
- Your home directory must contain a number of directories, one for each track in which you participate. So if you only participate in a single track, there must be a single directory in your home directory. The directories must be named as follows (where planner-name is the name of your planner):
- seq-sat-planner-name for the Sequential satisficing track.
- seq-opt-planner-name for the Sequential optimal track.
- seq-mco-planner-name for the Sequential multi-core track.
- seq-agl-planner-name for the Sequential agile track.
- tempo-sat-planner-name for the Temporal satisficing track.
- tempo-opt-planner-name for the Temporal optimal track.
- pref-sat-planner-name for the Preferences satisficing track.
- pref-opt-planner-name for the Preferences optimal track.
- Some teams submit two planners, or two versions of the same planner. In that case, create one set of directories for each planner or planner version. Please create separate directories even if the planners or planner versions differ only in very minor ways (e.g. command-line switches that must be used to run the planner).
- Planner directories should not contain any unnecessary files (editor backup files, .CVS or .svn directories, .DS_Store files, object files, bytecode, ...), but README files that may help with trouble-shooting the planner are appreciated.
Example: You submit the planner "Find Plans Quickly", which participates only in the temporal satisficing track, and the planner "Find Optimal Plans Slowly", which participates in both temporal tracks. Then your home directory on the DES should contain the following directories: tempo-sat-fpq, tempo-sat-fops, and tempo-opt-fops.
Please note that access to the DES will no longer be permitted after the deadline.
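The naming convention can be checked mechanically before submitting. The following is a minimal sketch, assuming only the track prefixes listed above; the helper name split_planner_dir is hypothetical and is not a tool provided by the organizers.

```python
# Track prefixes copied from the list of directory names above.
TRACK_PREFIXES = (
    "seq-sat", "seq-opt", "seq-mco", "seq-agl",
    "tempo-sat", "tempo-opt", "pref-sat", "pref-opt",
)

def split_planner_dir(name):
    """Return (track, planner_name) if name follows the convention, else None."""
    for prefix in TRACK_PREFIXES:
        # A valid name is "<track prefix>-<non-empty planner name>".
        if name.startswith(prefix + "-") and len(name) > len(prefix) + 1:
            return prefix, name[len(prefix) + 1:]
    return None

# Directories from the example above, plus one that breaks the convention.
for directory in ("tempo-sat-fpq", "tempo-sat-fops", "tempo-opt-fops", "fops"):
    print(directory, "->", split_planner_dir(directory))
```

A result of None indicates a directory name that does not start with one of the known track prefixes or that lacks a planner name.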
Compiling your planner
- Each planner directory should contain a shell script, named build (note the letter case), which completely builds your planner. Be sure that your build script is executable. You may assume that build is run from the directory in which it resides. In the common case that you want to use the make tool to build your planner, your build script only needs to run the single command make.
- Your build script will be run with limited user rights, but still please make sure that it doesn't contain any operations that can wreak havoc on the computer. In particular, it must not write to any directories outside the directory it is run in (creating and using subdirectories is fine), and it must not use the network.
Running your planner
- Each planner directory should contain a shell script named plan which accepts three arguments: a domain file, a problem file, and the filename in which the result plan should be stored.
- Invoking the script with these three arguments should run your planner. You may assume that the planner is run from the directory in which it resides. You may also assume that the input files and output filename are contained in this directory (and not in a subdirectory). We will run the script in an environment that limits memory usage to 4 GB and overall runtime to 30 minutes (except for the agile track, in which overall runtime is 5 minutes), so you don't need to take such measures manually.
- The solution should be written to the result file in a format understood by the VAL 4.2.09 validator.
- Your planner will be run with limited user rights, but still please make sure that it doesn't contain any operations that can wreak havoc on the computer. In particular, it must not write to any directories outside the directory it is run in (creating and using subdirectories is fine), and it must not use the network.
- If your planner generates any temporary files, we will automatically clean these up after each planner run, restoring the planner directory to its previous state. Don't create temporary files with names ending with .pddl, .soln or .log, as we will use such names for inputs, outputs, and redirected stdout/stderr streams, respectively.
Special planner aspects
Some technical aspects of a planner may make the evaluation more complicated. Please check the following list to see if any of the following points applies to your planner. In this case, please clearly indicate this in the questionnaire, and give us the detailed information we need in order to run the planner.
- Anytime algorithms: In the satisficing tracks, if your planner produces multiple plans, which is required, please don't reuse the same filename for the generated plans, because this will lead to problems if your planner times out in the middle of writing a new plan. Instead, append .1 to the plan filename for the first plan that is generated, .2 for the second, and so on. We will evaluate the last complete plan that was generated, so please only output plans in increasing order of quality. (Of course, there is no point in producing lower-quality plans than ones that were previously generated anyway.) For example, if your any-time planner is called as ./plan domain.pddl problem.pddl plan.soln, the first plan it generates should be named plan.soln.1, then plan.soln.2, and so on.
- Randomized algorithms: If your planner uses randomized algorithms, please initialize the random seed to a fixed constant. If there are any reasons to expect that your planner won't generate reproducible results, please tell us clearly by email.
- Concurrency: There is a dedicated track for multi-core satisficing planning. Thus, planners in the other tracks are not expected to spawn concurrent threads or processes. Nevertheless, if yours does so, e.g., by sequentially running one process to preprocess the domain and/or problem and then another for the planning process itself, please let us know so that we can make sure that your submission does not violate the rules imposed on the single-core tracks.
- Various executable files: If your planner consists of a number of executable files (most likely because it uses an ensemble of planners which are invoked according to some criteria), make sure that only one of them runs at any given time in the single-core tracks.
- Disk usage: If your planner uses external search algorithms, we will need to run it in an isolated environment, so please tell us. In that case, we will of course take time spent doing I/O into account for the time limit. Please also tell us how much hard disk space the planner should be expected to require, at maximum, during execution. The planner is only allowed to write to the directory in which it is invoked, or subdirectories thereof. (For example, don't write to /tmp.)
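The numbering scheme for anytime planners described above can be sketched as follows. This is an illustration only: the class name AnytimePlanWriter is hypothetical, and the plan is assumed to be available as a list of text lines already in a format that the validator accepts.

```python
class AnytimePlanWriter:
    """Writes each new plan to <base_filename>.1, <base_filename>.2, and so on."""

    def __init__(self, base_filename):
        # base_filename is the third argument passed to the plan script,
        # e.g. "plan.soln".
        self.base_filename = base_filename
        self.count = 0

    def write(self, plan_lines):
        """Store one plan in a fresh numbered file and return its filename."""
        self.count += 1
        filename = f"{self.base_filename}.{self.count}"
        with open(filename, "w") as plan_file:
            for line in plan_lines:
                plan_file.write(line + "\n")
        return filename
```

A planner invoked as ./plan domain.pddl problem.pddl plan.soln would create one AnytimePlanWriter("plan.soln") and call write once for every improved plan, producing plan.soln.1, plan.soln.2, and so on. Because no file is ever overwritten, a timeout in the middle of a write can only affect the newest file.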
Alternative domain formulations
Each evaluation domain for the competition will come in several alternative formulations that use different subsets of the optional features for that track (including, of course, formulations that use none of the optional features). For example, we will provide formulations with and without derived predicates, and formulations with and without object fluents. In some cases, it is clear which formulation is most appropriate for a planner (e.g., a planner that does not support derived predicates will only use formulations that don't make use of them). In other cases, the decision is not as clear. For example, a planner might support conditional effects, but still prefer domain formulations that don't use them in some cases.
In previous competitions, participants could manually choose which of the alternative formulations their planners would use in each domain. Due to the blind evaluation mode, this won't be possible this year. Instead, we will choose the "best" formulation for each planner in each domain by first using probing runs, where each planner attempts each formulation of each task in each domain with a reduced timeout of 1 minute. For each domain, we check for which of the formulations the planner achieved the best score in the probing runs, according to the evaluation criteria of the competition. This formulation is then considered the "best" formulation of that domain for that planner, and the results of the probing runs are thrown away. In the final evaluation (which then uses a timeout of 30 minutes), the planner will only be evaluated against the formulation selected earlier. (In case of ties, we will prefer formulations that use a larger subset of PDDL features, because most planners perform better if fewer features need to be compiled away.)
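The selection rule can be illustrated with a small sketch. It is not the organizers' actual script: the function name, the data layout, the formulation names, and the scores are all made up for illustration, and it assumes that a higher probing score is better.

```python
def select_formulation(probing_scores, feature_counts):
    """Pick the formulation of one domain to use in the final evaluation.

    probing_scores: formulation name -> score achieved in the probing runs
    feature_counts: formulation name -> number of optional PDDL features used
    """
    # Highest probing score wins; ties go to the formulation that uses
    # the larger subset of PDDL features.
    return max(
        probing_scores,
        key=lambda name: (probing_scores[name], feature_counts[name]),
    )

# Hypothetical probing results for one planner in one domain.
scores = {"plain": 12.0, "derived-predicates": 12.0, "object-fluents": 9.5}
features = {"plain": 0, "derived-predicates": 1, "object-fluents": 1}
print(select_formulation(scores, features))  # derived-predicates
```

Here the two best formulations tie on score, so the one that uses more optional features is selected.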
We reserve the right to run some planners against multiple formulations in the final evaluation and only use the best result for scoring, for example in cases where two planners achieve very similar scores or in cases where the probing runs produce very erratic results. In any case, formulations are chosen per-domain, not per-instance.
If you feel that this automated method won't lead to optimal selection of domain formulations for your planner, please bring up the issue so that we can work out an alternative policy.
Bug fix policy
In some cases, we will offer the opportunity to fix bugs that arise during the evaluation period, but any changes after the submission deadline will be strictly limited to bugfixes only. We will use a diff tool to check that patches don't contain new features or parameter tuning, and will reject patches that don't look like minimal changes to fix bugs. It is your responsibility to provide patches that are easy to verify with a diff tool. We reserve the right to reject changes for which the only-bugfixes rule is unnecessarily hard to check (e.g. because you reformatted the whole code).