Project Terminology
| Term(s) | Meaning |
| Virtual Environment | The playground in which a reality is simulated. |
| Situational awareness | Having contextual knowledge of the current situation |
| Perceptual understanding | Understanding obtained through the senses/perception. |
| Simulated artificial entity | Agent, Game Character, NPC, Robot, an abstraction of a human intelligence. |
| Agent reasoning repertoire | Cognitive capabilities that the agent can use. |
| Agent | Simulated abstraction of intelligence. |
| Knowledge Acquisition | The ability to use information to make logical choices/decisions and inform reactions. |
| Reasoning | Interpreting a situation and being able to choose/use/select knowledge that applies to it. |
| Situational Rules | Contextual and situational patterns; repeatable cause and effect. |
| Situational Circumstance | Information that defines circumstance(s) which make up a situation. |
| Experiential | An active process or experience of looking for, or obtaining information about, varying contextual situations. |
| Perceptual, experiential knowledge | Knowledge that is obtained through sources of perception. |
| Adaptable, perceptive, reactive | Adaptable: behaviour and understanding might change based on different knowledge available. Perceptive: Utilising continual perceptual monitoring. Reactive: Showing a response to a stimulus. |
| Real-time | Using a continual feedback processing model. |
| Actionable situational knowledge | Situational knowledge that the agent is able to use in a detected situation. |
| Instinctive | Based on rules for behaviour in circumstances. |
| Emergent Rules | Rules can be derived, tuned, and evolve according to an ongoing process of rule detection/creation. |
Online Learning
| Term | Meaning |
| Markov Chain (MC) | A model of state transitions represented as a Transition Matrix |
| Markov Decision Process (MDP) | A model of a game as described in Markov Decision Processes |
| Q-Learning | An approach to reinforcement learning that aims to solve MDPs |
| Q-Value | A value metric for an action/move |
| Policy | A decision-making rule or action selection strategy |
| Decision | A choice that selects an Action |
| Deterministic Policy | Selects a single action for a given state |
| Stochastic Policy | Provides possible actions as probabilities, e.g. Left 0.7, Right 0.3 |
| Online-learning | Dynamic dataset |
| Offline-learning (aka batch learning) | Static/Fixed dataset |
| Model-free | Does not learn or use the rules of the environment, i.e. the environment's transition dynamics |
| Model-based | Uses or learns the rules of the environment (MDP's Transition Probabilities or MC's Transition Matrix) |
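
To show how several of the terms above fit together, here is a minimal sketch of tabular Q-Learning. Everything specific in it is an assumption made for illustration and is not taken from this project: the five-state corridor, the `step` function, the reward of 1.0 at the goal, and the values of `alpha`, `gamma`, and `epsilon`. The learner is model-free because it only sees sampled results from `step` and never reads the transition rules themselves.

```python
import random

# Hypothetical toy MDP: a corridor of states 0..4 with the goal at state 4.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = ["Left", "Right"]


def step(state, action):
    """Environment dynamics. The learner only samples this (model-free)."""
    if action == "Left":
        nxt = max(0, state - 1)
    else:
        nxt = min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def deterministic_policy(q, state):
    """Deterministic policy: the single action with the highest Q-Value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def stochastic_policy(q, state, epsilon):
    """Stochastic policy (epsilon-greedy): a probability for each action."""
    greedy = deterministic_policy(q, state)
    share = epsilon / len(ACTIONS)
    return {a: share + (1.0 - epsilon if a == greedy else 0.0) for a in ACTIONS}


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Learn Q-Values online, one observed transition at a time."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            probs = stochastic_policy(q, state, epsilon)
            action = rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]
            nxt, reward, done = step(state, action)
            # Q-Learning update: move Q(s, a) toward reward + discounted best next value.
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


if __name__ == "__main__":
    q_table = train()
    for s in range(GOAL):
        print(s, deterministic_policy(q_table, s), stochastic_policy(q_table, s, 0.3))
```

In this sketch the stochastic policy is used to explore during training, while the deterministic policy reads off the learned behaviour afterwards. A model-based variant would instead learn or be given the transition rules inside `step`.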
General Ideas
| Term | Meaning |
| Experience | Actions and results (cause and effect) |
| Use of historical data | Knowledge |
| Control policies | Decision-making strategies |
| Minimizing failed expectations | Need for learning |
| Goals/Rewards | Often tied to Policies that aim to reach goals |
| Psychology | Understanding behaviour and what causes it |