Posted by: kaising | April 22, 2010

rethinking ‘battle mode’

One problem with battling the creatures in an environment completely different from the training environment is that you lose the ability to train based on environment variables. In retrospect, a more puzzle-oriented game might have been a better application of this training mechanic. However, having the battleground be an open area would accomplish this to an extent. The only remaining question, then, is how to handle the environment column of training data entries. It seems that if there is an entry whose environment matches the current one, that entry should be selected; but if the only match is an entry where everything except environment matches (as will happen during battle), that entry should be chosen as a fallback.
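
As a concrete illustration of that lookup, here is a rough Python sketch (Python as a stand-in for the ActionScript; the field names and the goodness tie-break are my assumptions, not final code):

from dataclasses import dataclass

@dataclass
class TrainingEntry:
    stimulus: str      # e.g. 'CLICK', 'BUTTON_A'
    environment: str   # e.g. 'forest', 'battleground'
    action: str        # e.g. 'splash', 'jump'
    goodness: float    # accumulated reinforcement

def find_entry(entries, stimulus, environment):
    # Prefer an entry matching both stimulus and environment; fall back to one
    # matching everything but environment (the battle case described above).
    exact = [e for e in entries if e.stimulus == stimulus and e.environment == environment]
    if exact:
        return max(exact, key=lambda e: e.goodness)
    partial = [e for e in entries if e.stimulus == stimulus]
    return max(partial, key=lambda e: e.goodness) if partial else None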

Posted by: kaising | April 21, 2010

goals for the next two weeks

For this independent study, I’m aiming to return to the original goal of having a playable prototype, rather than a large-scale Facebook game. However, it will still include the added AI component to meet the technological requirement, and it will actually be accessible to play via Facebook, which had initially only been an if-time-permits goal.

Schedule for the next two weeks:

1. Finish posting cleaned-up notes/research/designs/conclusions to blog.
     04.21.2010 – 04.22.2010

2. Code for play-through demo II.
     04.22.2010 – 05.02.2010 (end of reading days)

3. Prototype art for play-through demo II.
     05.02.2010 – 05.03.2010

4. Have flash game display in Facebook (not necessarily any additional features).
     05.04.2010 – 05.06.2010

5. Address any unforeseen issues that arise.
     05.06.2010 – 05.11.2010 (end of finals)

This project will probably need to be continued into the summer to become a worthwhile prototype, but at least this should give something to build on by the end of the semester.

Posted by: kaising | April 19, 2010

really massive code changes

Things need to be rethought based on overhaul II…

TrainingEntry: a set of attributes (conditions, responses, and a ‘goodness’ evaluation)
DecisionTreeID3: one instance per creature; stores that creature’s TrainingEntry instances and evaluates a tree to make a decision
DecisionNode: used for building trees; each node decides based on TrainingEntry.data[index]
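
For my own reference, rough skeletons of these (written as Python stand-ins for the Flash classes; any member not named above is an assumption):

class TrainingEntry:
    def __init__(self, data, goodness):
        self.data = data          # attribute values: conditions plus response
        self.goodness = goodness  # reinforcement-derived evaluation

class DecisionNode:
    def __init__(self, index):
        self.index = index        # which TrainingEntry.data column this node tests
        self.children = {}        # attribute value -> child DecisionNode or leaf label

class DecisionTreeID3:
    # One instance per creature: holds its TrainingEntry list and the tree root.
    def __init__(self):
        self.entries = []   # this creature's TrainingEntry instances
        self.root = None    # DecisionNode once the tree is (re)built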

Posted by: kaising | April 19, 2010

Demo play-through II

Based on the first play-through outline, but taking second overhaul changes into account. Key gameplay points are in bold.

1. Player gets a starting creature, with docile nature. Natures are explained; docile learns very quickly, but can only learn low-powered moves. It may have two move slots instead of four. It is pretty much just useful for trying out all the basic features of the game, and is not very competitive in player vs. player battle. It should be really cute to make up for this. The idea here is to teach the player; in order to be competitive, they need to level up and gain access to other natures (and spend more time playing…). For the demo, there is only this one creature available.

2. Player goes to the forest level with their new creature. There’s a quick description of the fact that they will train their critter.

3. The creature wanders around via wander, target, and collision avoidance.

4. The player’s training actions are: click, press command buttons A/B/C, press positive/negative reinforcement buttons. When the player clicks or presses command button A/B/C, the creature tries to perform an action where it is, biased by previous training.

Instruct the player to train it to come to the mouse… click (stimulus=CLICK) and watch what it does. Reward it (within 3 seconds) for coming to the cursor. Repeat until it comes every time; tell them that continued training helps it understand more clearly where to go.

Notes to self: No encoding of what actions are opposites/mutually exclusive? The game keeps a list of what the creature last did, what stimulus was last used within a certain amount of time, etc. – all to be checked against when the user rewards or punishes.
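
A sketch of that bookkeeping (Python stand-in; the 3-second window comes from the example in item 4, and the data layout is an assumption):

import time

REWARD_WINDOW = 3.0  # seconds; see the 'reward it within 3 seconds' example above

class RecentHistory:
    def __init__(self):
        self.events = []  # (timestamp, kind, value); kind is 'action' or 'stimulus'

    def record(self, kind, value):
        self.events.append((time.time(), kind, value))

    def reinforce(self, entries, goodness_delta):
        # Apply reward/punishment to every (stimulus, action) pair seen within
        # the window; entries maps (stimulus, action) -> accumulated goodness.
        now = time.time()
        recent = [(k, v) for t, k, v in self.events if now - t <= REWARD_WINDOW]
        stimuli = [v for k, v in recent if k == 'stimulus']
        actions = [v for k, v in recent if k == 'action']
        for s in stimuli:
            for a in actions:
                entries[(s, a)] = entries.get((s, a), 0.0) + goodness_delta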

5. Unfreeze its desires. Give an example of it getting hungry, and teaching it what it should eat. [Actually, skip this feature and desires other than boredom for now.]

6. Instruct the player to call it over to investigate things, or to let it wander. When it runs into an item of interest, it waits for a bit (or its ‘boredom/idle’ counter can go off and it’ll seek something). When its ‘boredom/idle’ counter goes off (which started when it sat), it may try something, which you can reward or punish (or ignore). Have a blurb explaining this. Equivalently, you may try pressing a button, as mentioned in item 4.

7. Make sure it does one of these things; then instruct the player to press one of the command buttons and give positive reinforcement.
a. Splashing in the water –> learn small or medium splash attack (water)
b. Sitting in the sun, becoming warmer –> learn heat attack (fire)
c. Hopping around –> learn jump attack (air)

8. Explain that repeating this will teach it to respond to the command button with that action, even in battle. Also mention that for some behaviors, the creatures aren’t naturally inspired by the environment. The player may use rare items to trigger these behaviors.

9. Also explain that creatures remember what the target of their action was. [Record sequence of user actions in past N seconds?]

10. Tell the player that once they are done training, they may try a battle. (Or wait until they have a decent response rate of attack for at least one command button.) Show them the button to go into battle.

11. Once in battle mode, tell them to use their commands to try to direct their creature. Have a really dumb opponent, that attacks on a very slow timer, and takes two hits to defeat. The two animals can move freely around the terrain, and the three mentioned moves (splash, etc.) do damage when the player’s animal successfully executes them close to the AI enemy. It should be made apparent by now that the point of training is to 1.) create a good moveset tailored to the battles you plan to have, and 2.) have your creature respond reliably enough to the commands, so that your creature may win.

12. Explain that it is the player who levels up from the accomplishments of their summon. Show the improvement on the player’s stat page. Player leveling up unlocks abilities, such as using certain items, being able to train creatures with certain natures, etc. Creatures’ stats like attack power could level up, and types of creatures could have level ‘caps’ so players must train multiple creatures to continue advancing, but this is not necessary for the demo.

To add if time permits:
Element alignment via a fire level
A creature with a different nature
Facebook functionality

Posted by: kaising | April 19, 2010

Second Overhaul + notes

The main problem with the new design is that it makes the decision tree AI pretty much unnecessary. It is, in fact, still too close to the original design to properly make use of the AI that is supposed to be included as the main technological effort for this project. The original design was meant to be more of a demo for a particular company I wanted to apply to, but now that I am no longer interested in applying, it seems like accomplishing the technological requirements at the cost of design may be best. However,

Elemental alignment is still based on environment, although this need not be in the initial demo. Creatures should have the following desires:

get food (low food/health bar)
violence (low violence bar)
explore (low explore bar)
Rates of emptying are determined by nature.
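
A small sketch of these bars (Python stand-in; the rate numbers and the low-bar threshold are placeholder assumptions):

DESIRES = ('food', 'violence', 'explore')

class Nature:
    def __init__(self, empty_rates):
        self.empty_rates = empty_rates  # desire -> units drained per second

class Creature:
    def __init__(self, nature):
        self.nature = nature
        self.bars = {d: 1.0 for d in DESIRES}  # full = 1.0, empty = 0.0

    def tick(self, dt):
        # Drain each bar at the nature-determined rate.
        for d in DESIRES:
            self.bars[d] = max(0.0, self.bars[d] - self.nature.empty_rates[d] * dt)

    def pressing_desire(self, threshold=0.25):
        # A low bar is what drives the creature to act on that desire.
        low = [d for d in DESIRES if self.bars[d] < threshold]
        return min(low, key=lambda d: self.bars[d]) if low else None

# e.g. a docile nature might drain its violence bar very slowly:
docile = Nature({'food': 0.01, 'violence': 0.002, 'explore': 0.02})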

training data = need to meet (allow ‘or none’ somehow)
stimulus = user coaxing; less rewarding things are less likely

how do we get it to try things when presented with stimulus?
you click one of two commands to it, it tries something nearby
train to go to or away from the cursor

basic interactions:

..click, button commands –> it tries things; user action is recorded as ‘stimulus’ in association with the goodness reinforcement (if any)
..desire needs filling –> it tries things, picking the right action; need is recorded as ‘desire’ in association with the goodness reinforcement (if any)
..whenever data is updated, check whether the entry already exists or not
have ‘find closest’ decision?

..reward/punish

OR just have the training include giving treats.
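
A sketch of the try-things loop these notes describe (Python stand-in; the weighting scheme and the floor value are assumptions):

import random

def choose_action(entries, trigger, candidate_actions):
    # entries maps (trigger, action) -> accumulated goodness; trigger is either a
    # stimulus ('CLICK', a command button) or a desire ('food', 'explore', ...).
    weights = []
    for a in candidate_actions:
        g = entries.get((trigger, a), 0.0)   # untrained pairs simply aren't there yet
        weights.append(max(0.1, 1.0 + g))    # floor: less rewarding -> less likely, not impossible
    return random.choices(candidate_actions, weights=weights, k=1)[0]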

Posted by: kaising | April 17, 2010

algorithm criticisms

1. There are no ‘desires’ like those in Black and White, where the creature is more self-directed by its biological needs.

–> The more I think about it, the less comfortable I feel about implementing biological desires for this game, as I feel my instinctive approach would be too close to code I have seen in industry (under an NDA). For now I would like to rely on what I’ve learned in CIS564 – Game Design, and what I’ve learned via internet research. In the future though, it might be fun to totally rethink the concept and code my own interpretation of biologically motivated organisms.

2. Actions that are mutually exclusive are not explicitly marked as such.

–> I’m not entirely sure if this will become a problem for special cases, or make the AI seem less intelligent than it could be if the issue were addressed. I’m going to wait until after testing to address this if needed.

3. Decision trees are probably overkill.

–> Fixed! See overhaul II.

4. General doubts that this can actually generate interesting gameplay.

–> Fixed! See overhaul II.

Posted by: kaising | April 16, 2010

Demo playthrough

Although it’s very late in the semester, I think it is worthwhile to completely overhaul the project, and aim more for a quality demo than a finished project that has so many known design flaws. Given all the extreme changes being made, I thought it would be helpful to outline a sample playthrough to serve as the end goal of a demo.

0. [Player starts at level 5] Player starts at level 0?
level 0 –> begin
level 1 –> battles
level 2 –> use items
level 3 –> natures other than docile

1. [Player gets a starting creature, with mischievous nature. This means it is more likely to learn damaging moves, but requires a higher player level to control and (possibly) may not like following the cursor as much.] Player gets a starting creature, with docile nature. Natures are explained; docile learns very quickly, but can only learn low-powered moves. It may have two move slots instead of four. It is pretty much just useful for trying out all the basic features of the game, and is not useful in player vs. player battle. It should be really cute to make up for this. The idea here is to teach the player; in order to be competitive, they need to level up and gain access to other natures (and spend more time playing…).

2. Player goes to the forest level with their new creature.  There’s a quick description of the fact that they will train their critter.

3. Watch until the creature does one of the three behaviors prompted by the environment, namely:

    a.     Splashing in the water –> learn splash attack (water)
    b.     Sitting in the sun, gazing up –> learn heat attack (fire)
    c.     Hopping around –> learn jump attack (air)

4. The user may encourage this behavior, and possibly have it learn an attack from it, or discourage the behavior, making the creature more likely to try the other behaviors. Reassure the player that the creature will do something interesting if it is sufficiently encouraged.
     All TrainingData entries should already be in there; they just start off with equal weights.

5. Once a particular move has been encouraged sufficiently (absolute value > X, where X is particular to the move), the creature adds the attack to their repertoire. Popup! Show the UI where they can see their critter’s specs. (what to show vs. hide?)
     Show them the absolute training values of things they’ve punished or rewarded.
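
A sketch of that unlock check (Python stand-in; the per-move thresholds are placeholder assumptions):

MOVE_THRESHOLDS = {'splash': 3.0, 'heat': 5.0, 'jump': 2.0}  # X, particular to each move

def check_unlocks(training_values, known_moves):
    # training_values maps move -> accumulated reinforcement (can be negative).
    for move, x in MOVE_THRESHOLDS.items():
        if move not in known_moves and abs(training_values.get(move, 0.0)) > x:
            known_moves.add(move)
            print('Popup! Learned: ' + move)  # then show the critter's specs UI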

Posted by: kaising | April 16, 2010

design criticisms and changes

Issues with current goals:

1. Decision Tree learning is probably better suited to slow training.

Games that do not focus on just one creature (e.g. Black and White, where you have many other features as well) may be a better environment than a one-on-one activity with a learning creature. Otherwise, if the user can only interact with the creature, there may not be enough immediate feedback. Tuning it to allow for immediate feedback makes it seem unnatural, or perhaps even makes the nuance possible with decision trees unnecessary.

–> How do we fix this without adding features?
have some features only take one training instance to work 50% of the time,
and then a second to work 100% of the time. Maybe even have one example
work 100% after the first instance, or a solution such as this:
one of the four move slots is for abilities that have 100% response
rate (the creature performs them every time you ask) but aren’t
very strong moves.
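
A tiny sketch of that schedule (the 50%/100% numbers are from above; the curve for ordinary moves is an assumption):

def response_rate(training_count, quick):
    # Probability that the creature performs the move when commanded.
    if quick:                                  # the reliable-but-weak move slot
        return min(1.0, 0.5 * training_count)  # one instance -> 50%, two -> 100%
    return min(1.0, 0.15 * training_count)     # stronger moves ramp up slowly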

Additionally, perhaps there could be a future mode where more than one creature
can be trained at once; or they could even be trained together for two-on-two
battles. (Two-on-two battles are an interesting new feature in later Pokemon
games as well.)

2. Limiting moves to four slots makes for awkward ‘choose a move to forget all training for’ moments. It seems unnatural to forget the training for just one move.

–> How can we fix this?
I think limiting the number of moves they learn is important for strategy. However,
maybe it would be more interesting to have training for multiple moves make them
each less powerful, or less ‘accurate’ (where accuracy is like the parameter in
Pokemon, i.e. the likelihood that a command will result in the action being
performed).

3. Having a training mode in addition to a fighting mode is probably necessary to provide motivation, but it means that there need to be a lot of moves that are useful in both situations – e.g. ‘flash’ in Pokemon, which lights caves outside of battle, and lowers opponent accuracy in battle.

–> This isn’t a true design flaw so much as a huge sink for design time.

4. Creatures should probably do motions on their own that can be trained on, e.g. they randomly do all kinds of possible moves on the environment. However, this means that a lot of authoring needs to go into the environment, and move variety is limited to how many moves are expressed as prompts in the environment. This also may be frustrating to the player, as their control is rather indirect. Clearly this is very different from the Pokemon mechanic – so we can no longer rely on that precedent as giving a broad, interested audience.

–> Focusing the design more on explorable environments and active creatures that do
things on their own should address these concerns. This is preferable to the passive
creature that you condition, I think, as it feels more natural, and the game / AI pushes
back a bit more. Coding and algorithm updates to come…
Also, it is possible that special moves could still be taught via conditioning with
rare items, for the sake of variety.

5. Leveling up might not have a place here, as much of an incentive as that may be in most games. The player should still level up, but creatures should perhaps only implicitly level up, and become more useful through training like real animals. Giving up the leveling mechanic that is such an incentive in precedents like Pokemon is rather risky, but I think it is worth trying for the sake of experimentation.

6. Having decision tree learning through positive/negative reinforcement means the creature should be doing disadvantageous things sometimes.  This can be really annoying if a lone creature is the sole focus of the player.

–> One solution is to have their stupidity be comical.  Then the player will punish them for their stupid acts.
   Another solution is to have some combinations of moves be far more desirable than others. The player uses training to encourage moves that work well together, and discourage ones that do not.

7. Do we have them only train to create a move set, or do we make them train the controls for each move as well?

–> Training the control for each move is kind of interesting technologically, but not really warranted here for every move. Perhaps some special moves could be trained, however; e.g. train them to perform a pre-designated ‘special move’ when they see a particular item. Then you can send them into battle with the item and have them see it at some point. This also limits creatures to one special each per battle.

8. How much control does the user have over their creature in battle, and how does training relate to this?

–> There are two reasonable options right now, it seems. One is that the user has little control; they can throw items out when they want their dude to use their special attack, or otherwise prompt. The other is that the N most likely behaviors become moves you can just select Pokemon-style in turn-based battle with opponents. In the latter case, some moves’ accuracy may depend on how the moves are rated relative to each other; e.g. fireball may have trainingweight*2 % accuracy.
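
A sketch of that second option (the trainingweight*2 % accuracy example is from above; N and the clamp are assumptions):

def battle_moves(training_values, n=4):
    # The N most likely behaviors become selectable moves, each with an accuracy
    # percentage derived from its training weight.
    top = sorted(training_values.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return {move: min(100.0, weight * 2) for move, weight in top}

# e.g. battle_moves({'fireball': 30, 'splash': 45, 'jump': 10, 'heat': 5, 'idle': 0})
#      -> {'splash': 90.0, 'fireball': 60.0, 'jump': 20.0, 'heat': 10.0}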

ID3 as described in a recent post requires a single, discrete evaluation, e.g. do this or don’t do this, good or bad, etc., but the evaluations in the training data will be a wide range of values.

Therefore, we could just normalize the ranging values (from reinforcement training of the creature), and then flip a coin that is biased based on the resulting values to determine if a given action is good or bad. Then we could construct the decision tree. This requires rebuilding the decision tree every time we make a choice, i.e. every time a biological need / user input arises… but the trees should be fairly small so this is okay.
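
A sketch of that discretization step (Python stand-in; min-max normalization is my assumption for what ‘normalize’ means here):

import random

def discretize(entries):
    # entries: list of (data, value) pairs with real-valued reinforcement scores.
    # Returns (data, 'good'|'bad') pairs ready for the ID3 tree builder.
    values = [v for _, v in entries]
    lo, hi = min(values), max(values)
    labeled = []
    for data, v in entries:
        p_good = 0.5 if hi == lo else (v - lo) / (hi - lo)     # normalize to [0, 1]
        label = 'good' if random.random() < p_good else 'bad'  # biased coin flip
        labeled.append((data, label))
    return labeled  # rebuild the (small) tree from these on each decision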

Posted by: kaising | March 15, 2010

notes on decision tree learning algorithms

Goal of decision tree learning:

Take a set of data entries & the results of a judgement/classification on them
–> be able to classify new data entries

Concept Learning System (CLS)

     Let C be the set of all data entries/example “instances”.
     Termination
         If they all have evaluation “Yes”, make a Yes node and it is done.
         If they all have evaluation “No”, make a No node and it is done.
         (etc.)
     Otherwise
         Create a decision node.
         Pick a property X (column of the data entries).
         Partition C into C1, C2, … Cn based on this discretized value/property X.
         Recurse on each subset of C.
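
A minimal sketch of that recursion (Python; the attribute is picked arbitrarily here, since ID3’s information-gain selection comes next):

from collections import Counter

def cls(examples, attributes):
    # examples: list of (data_dict, label); returns a nested decision tree.
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:          # termination: all Yes (or all No, etc.)
        return labels[0]
    if not attributes:                 # mixed labels but nothing left to split on
        return Counter(labels).most_common(1)[0][0]
    attr = attributes[0]               # pick a property X (arbitrarily, for CLS)
    node = {'attr': attr, 'children': {}}
    partitions = {}
    for data, label in examples:       # partition C into C1, C2, ... Cn on X
        partitions.setdefault(data[attr], []).append((data, label))
    for value, subset in partitions.items():
        node['children'][value] = cls(subset, attributes[1:])  # recurse on each subset
    return node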

Quinlan’s ID3 Algorithm

     CLS plus a particular way of (greedily) selecting which feature to branch on.

     It is run on a fixed set of training data, and is not incremental.
     It is expected to be inductive, i.e. forming rules based on a small sample is hoped
     to lead to correctly classifying the entire space of possibilities.
         It may misclassify data.

     Pick the feature that allows for a classification with the highest information gain.
     Specifically, pick the feature whose split leaves the lowest (weighted) entropy,
     i.e. the least disorganization of the feedback data.

     A measure of entropy is obtained as follows, assuming there are two evaluation
     judgement options [C.R. Dyer]:
         log_2 |S| = expected work to guess an element in a set S, with size |S|.
         If the evaluation options are Y and N, with p = |Y|/|S| and n = |N|/|S|, then:
             I(Y, N) = -p log_2(p) - n log_2(n)
         i.e. the expected number of bits needed to classify one element of S.
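
A sketch of those computations (this is the standard ID3 math; the variable names are mine, not from the course notes):

from collections import Counter
from math import log2

def entropy(labels):
    # I(Y, N) generalized: -sum of p_i * log_2(p_i) over the label frequencies.
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(examples, attr):
    # Whole-set entropy minus the weighted entropy after splitting on attr;
    # ID3 greedily branches on the attribute that maximizes this.
    labels = [label for _, label in examples]
    before = entropy(labels)
    partitions = {}
    for data, label in examples:
        partitions.setdefault(data[attr], []).append(label)
    after = sum(len(ls) / len(labels) * entropy(ls) for ls in partitions.values())
    return before - after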

Quinlan’s C4.5 Algorithm

     ID3 plus several refinements: attribute selection by gain ratio rather than raw
     information gain, handling of continuous attributes and missing values, and
     pruning of the finished tree. (See the C4.5 sources below.)

Sources:

http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm
CLS and ID3

http://www.cis.temple.edu/~ingargio/cis587/readings/id3-c45.html
ID3 and C4.5

http://www.dcs.napier.ac.uk/~peter/vldb/dm/node11.html
ID3 plus example

http://en.wikipedia.org/wiki/Ross_Quinlan#ID3
ID3

http://en.wikipedia.org/wiki/C4.5_algorithm
C4.5

http://pages.cs.wisc.edu/~dyer/cs540/notes/learning.html
Course notes on machine learning, and decision trees. Also has an algorithm for decision trees.
