The sample inputs are in hw3/sample
There will be a limit of 60 seconds for the CPU time used by the actual agent. This does not include the CPU cycles used by the environment program, or the time delays of the communication between the agent and the environment. There will also be a limit of 10,000 moves for each environment.
For each environment, it will be possible to gather enough information to plan a series of actions to get to the gold, before blowing up any walls.
s0.in doesn't seem to satisfy this condition?
Whoops! I have now replaced it with s0a.in, which (I believe) does satisfy the above condition.
Since the agent is only presented with a 5-by-5 window at each time step, it will need to store its own global map, which it updates whenever new information becomes available. Since the environment is in the form of a 2-dimensional grid, it is probably easier and more efficient to store the map as some kind of 2-D array, rather than using general-purpose graph data structures. You might start by implementing some version of Dijkstra's Algorithm, using a 2-D array of size 160x160, and then think about how to enhance your search to take account of tools, walls and dynamite.
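
As a rough illustration of the suggestion above, here is a minimal Python sketch of a 160x160 global map that is updated from each 5-by-5 window, plus a plain Dijkstra search over it. Everything beyond the 160x160 size and the 5-by-5 window is an assumption made for the example: the cell symbols ('?' for unknown, '*' for wall), the function names, and the idea that the window has already been rotated to match the map's orientation. It is a starting point, not a reference solution.

```python
import heapq

SIZE = 160                 # map dimensions suggested in the text
UNKNOWN = '?'              # hypothetical marker for cells not yet seen
BLOCKED = {'*', UNKNOWN}   # hypothetical symbols the agent cannot walk through


def new_map():
    """Create a SIZE x SIZE map with every cell marked unknown."""
    return [[UNKNOWN] * SIZE for _ in range(SIZE)]


def merge_view(world, view, row, col):
    """Copy a 5x5 view centred on (row, col) into the global map.

    Assumes the view is already rotated to the map's orientation.
    """
    for dr in range(-2, 3):
        for dc in range(-2, 3):
            r, c = row + dr, col + dc
            if 0 <= r < SIZE and 0 <= c < SIZE:
                world[r][c] = view[dr + 2][dc + 2]


def dijkstra(world, start, goal):
    """Return the list of (row, col) cells from start to goal, or None."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return path[::-1]
        if d > dist[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < SIZE and 0 <= nc < SIZE):
                continue
            if world[nr][nc] in BLOCKED:
                continue
            nd = d + 1  # every move costs 1 in this sketch
            if nd < dist.get((nr, nc), float('inf')):
                dist[(nr, nc)] = nd
                prev[(nr, nc)] = cell
                heapq.heappush(heap, (nd, (nr, nc)))
    return None
```

With every move costing 1, this behaves like breadth-first search. One possible way to account for tools, walls and dynamite would be to vary the step cost or to extend the search state with the agent's inventory, but that design is left open here.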