A Possible Approach

The premise of this research is that, given an autonomous entity or entities with the facilities to experience and perceive situational information, and an environment that provides sensory stimulation indicative of those situations, arbitrary situational experiences that are encountered can, in the first instance, be identified, studied, detected, codified and measured, thereby providing situational data for analysis such as learning, behaviour creation and situation detection.

This pursuit is specifically in the context of forming knowledge that enables an agent to formulate an understanding, or rationalisation, of the new and unknown situations it experiences while exploring.

Observations might include the perception of unknown entities within the virtual reality, such as game objects (characters, walls, etc.), or equally the classification of multiple aspects of an observation to define the makeup of a situation as it occurs.

To support this, the objects and the environment that the agent interacts with will need to possess characteristics and properties that can be sensed.

Modelling circumstantial perception in agents based on the assimilation of experience, specifically through empirical agent observation, is a key aspect of the simulations proposed in this research.

Figure 4 provides a basic illustrative example of the scope of the research. 

The overall research approach considers a linear progression around phases of modelling and understanding experience within virtual worlds:

  1. Simulating experiences (phase 1)
  2. Observing experiences (phase 2)
  3. Defining experiences (phase 3)
  4. Using experiences (phase 4)
  5. Distributing experiences (phase 5)

The methods and research design for each of these phases of research are discussed in more detail next.

Phases 1 and 2

Simulating and Sensing Virtual Environments

The first research goal is to systematically sense and collect observational data about arbitrary in-situ experiences as the agent encounters them.

This will likely involve creating a primary autonomous learning agent that can perceive its environment through the acquisition of environmental stimuli, in order to acquire knowledge about the situations it encounters and experiences.

Our research proposes to position an artificial learning agent within a simulated virtual reality whose environment exerts sensory forces that the agent can observe through simulated sensation based on stimulus-response. Establishing relationships between stimuli and their resultant responses, together with an ability to estimate ongoing situations by observing the environmental contexts in which they occur, provides a model for pursuing the establishment of an experience-based autonomous learning environment.
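As a purely illustrative sketch of what a recorded stimulus-response relationship might look like in software (all names and fields here are hypothetical, not drawn from any existing engine), an S-R link could pair an observed stimulus with the response it evoked, together with the environmental context at the time:

```python
from dataclasses import dataclass, field
import time

@dataclass(frozen=True)
class Stimulus:
    kind: str          # e.g. "sight", "sound"
    source: str        # identifier of the emitting entity
    intensity: float   # strength of the stimulus at the sensor

@dataclass(frozen=True)
class Response:
    action: str        # action the agent took, e.g. "flee"

@dataclass
class SRLink:
    """One observed stimulus-response pairing and the context in which it occurred."""
    stimulus: Stimulus
    response: Response
    context: dict = field(default_factory=dict)   # environmental state at the time
    timestamp: float = field(default_factory=time.time)

# A single observed link: the agent fled after sighting an unknown entity in a dark corridor.
link = SRLink(Stimulus("sight", "unknown_entity_7", 0.9), Response("flee"),
              context={"location": "corridor", "light": "dark"})
```

Collections of such links, accumulated over time, would constitute the raw experiential data the later phases analyse.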

Perceiving Virtual Reality

An important aspect of such a learning simulation is engineering an environment that can be designed to emit circumstantial information as part of the virtual reality, including environmental variables, constantly evolving artificial entities or game objects (such as characters), behaviours, and other environmental variations that the learning agents can perceive and learn from. Examples of possible 2D and 3D virtual environments that might be created are shown in Figure 6.

Our research proposes to use sense-detection architectures to simulate the production of situational stimuli within the simulation environment as a means to acquire sensory experience, while the simulation of S-R links is to be derived from an approach like Epic’s Unreal AI Perception system (‘AI Perception’, no date), which is based on the generation and distribution of stimulus events throughout the environment.
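The stimuli-distribution idea can be sketched as a minimal publish-subscribe loop. This is written purely for illustration and does not reproduce the actual Unreal AI Perception API; the class and method names are hypothetical:

```python
from collections import defaultdict

class StimuliBus:
    """Distributes stimulus events to agents that have subscribed to that sense."""
    def __init__(self):
        self.subscribers = defaultdict(list)  # sense kind -> list of callbacks

    def subscribe(self, kind, callback):
        # An agent registers interest in a particular kind of stimulus.
        self.subscribers[kind].append(callback)

    def emit(self, kind, payload):
        # The environment generates a stimulus event; all listeners receive it.
        for callback in self.subscribers[kind]:
            callback(payload)

bus = StimuliBus()
heard = []
bus.subscribe("sound", heard.append)   # an agent registers a hearing sense
bus.emit("sound", {"source": "door", "loudness": 0.4})
bus.emit("sight", {"source": "wall"})  # no sight subscribers; the event is ignored
# heard now contains only the single sound event
```

The design choice mirrors the source's description of stimuli events being generated and distributed throughout the environment: senders need no knowledge of which agents are listening.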

Objectives (phases 1 and 2)

Key research goals in phases 1 and 2 include:

  1. Creating a reality (world) to serve as the simulation environment, which can be controlled to produce varying situations perceivable by the agents within it, likely using stimulus-response theory.
  2. Incorporating a means for the environment to be sensed such that it has an impact on the experience of its inhabitants, i.e. the agents. This is likely to include producing sensation.
  3. Creating controllable and autonomous entities that can function within this sensory environment and can perceive it and the objects within it.
  4. Establishing an observational mechanism for collecting and recording experiential/situational data perceived by the agent (modelling the unknown).
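The observational mechanism of objective 4 above might be sketched as a simple recorder that timestamps and stores each perceived observation for later analysis. This is a hypothetical minimal design, not a committed architecture:

```python
import json

class ObservationLog:
    """Collects and records experiential/situational data perceived by an agent."""
    def __init__(self):
        self.records = []

    def record(self, tick, stimulus, response, context):
        # Store one observation: what was sensed, what the agent did, and where/when.
        self.records.append({"tick": tick, "stimulus": stimulus,
                             "response": response, "context": context})

    def export(self):
        # Externalise the collected data for offline analysis in later phases.
        return json.dumps(self.records)

log = ObservationLog()
log.record(1, "sight:wall", "stop", {"room": "A"})
log.record(2, "sound:footsteps", "turn", {"room": "A"})
data = json.loads(log.export())  # two observations, in recorded order
```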

The combination of these two phases focuses on the technical aspects of constructing the experimental platform that this research will use to sense and simulate sensory knowledge acquisition.

The simulation system itself, its supporting subsystems (such as event and resource management, networking, scripting, artificial intelligence and computer graphics), and the architecture necessary to unify them will also require design and implementation. A technical aim is therefore to produce a virtual environment that can visualise and demonstrate the experiments and models simulated in this research.

In this way, the research is concerned not only with how experiences influence an agent’s behaviour and understanding, but also with how perception and experience can be implemented as sensing software models and data structures, particularly within a context of real-time perceptual learning.

In many respects, this phase forms the foundation on which further exploration and research will be based. It can largely be seen as a building phase, focusing on the software design and engineering required to model and simulate agent observations. This phase constitutes an application of various computer games technologies, including graphics, simulation and engineering.

Phase 3

Formation of Abstractions

Modelling the unknown

The primary goal of this phase is the creation of higher-level situational experiences from the previously collected sensory data in order to act on them. This phase provides a means to formulate a model that defines experiences and distinguishes between situations (modelling the unknown). The resulting data can then be used in subsequent phases as the basis for a variety of possible outcomes, including learning about the nature of situations, behaviours, and decision-making in unexpected or new and unknown situations.

While raw stimulus-response sequences might underlie basic sensory experiences, a further process could use those basic sensory experiences to compose a more holistic, higher-level description of the experience, or to interpret it in order to find meaning in it.
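One simple way such composition might work, sketched purely for illustration (the summary fields are hypothetical), is to count recurring stimulus-response pairs in a raw sequence and characterise the experience by its dominant pattern:

```python
from collections import Counter

def summarise_experience(sr_sequence):
    """Compose a higher-level description from a raw S-R sequence."""
    counts = Counter(sr_sequence)
    dominant, freq = counts.most_common(1)[0]
    return {"dominant_pattern": dominant,            # the most frequent S-R pair
            "coverage": freq / len(sr_sequence),     # how much of the experience it explains
            "distinct_patterns": len(counts)}        # variety within the experience

raw = [("sight:enemy", "flee"), ("sound:roar", "flee"),
       ("sight:enemy", "flee"), ("sight:wall", "turn")]
summary = summarise_experience(raw)
# The experience is dominated by fleeing from sighted enemies.
```

Real interpretation would of course need richer structure than frequency counts, but the sketch shows the direction: from a flat sequence to an aggregate description.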

Allowing a simulation of reality and a means to observe it opens the possibilities to study the effects of situations and, therefore, the resultant observational data that represents them. This, when synthesised into experience, can then provide learning material for subsequent reasoning and understanding of the causes and effects of situations.

A targeted understanding of how and why situations occur, and of what happens when they do, can then allow for the formulation of unique patterns or rules that define them, ultimately making it possible to detect when rules are (and remain) applicable to a situation.
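Detecting whether a rule is, and remains, applicable could be sketched as matching a rule's required conditions against the currently observed context. The rule representation below is hypothetical and deliberately minimal:

```python
def rule_applies(rule_conditions, observed_context):
    """A rule applies while all of its conditions hold in the observed context."""
    return all(observed_context.get(k) == v for k, v in rule_conditions.items())

# A rule synthesised from experience: "dark corridors are dangerous".
avoid_dark_corridors = {"location": "corridor", "light": "dark"}

applies_now = rule_applies(avoid_dark_corridors,
                           {"location": "corridor", "light": "dark", "noise": 0.2})
# The rule stops applying once the lighting changes.
applies_later = rule_applies(avoid_dark_corridors,
                             {"location": "corridor", "light": "bright"})
```

Re-evaluating the conditions each tick is what gives the "(and remain) applicable" property: applicability is a continuous check, not a one-off classification.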

Objectives

Key research goals in this phase include gaining a discernible definition of abstractions (circumstance, situation and experience) from raw experiential data such as S-R links and CSIs:

  1. Defining, detecting and generalising unique and discernible situations and circumstances from experiential data, likely utilising sensory data, scene graphs and temporal networks.
  2. Describing observations, e.g. processing through eidetic and phenomenological reduction processes.
  3. Producing aggregate, higher-order structures from observations, i.e. defining basic situation-specific knowledge such as experience.
  4. Constructing or using cognitive models for processing sensory experiences.
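Objective 1 above mentions temporal networks as a likely tool. As one purely illustrative sketch of temporal separation (the function and its gap threshold are hypothetical, not a committed method), a stream of observation times might be segmented into discernible situations wherever a large temporal gap occurs:

```python
def segment_situations(timestamps, max_gap=5.0):
    """Split a stream of observation times into situations, starting a new
    situation whenever the gap between consecutive observations exceeds max_gap."""
    if not timestamps:
        return []
    segments, current = [], [timestamps[0]]
    for t in timestamps[1:]:
        if t - current[-1] > max_gap:
            segments.append(current)   # close the current situation
            current = []
        current.append(t)
    segments.append(current)
    return segments

# Observations at t = 0..2 form one situation; t = 20..21 form another.
parts = segment_situations([0.0, 1.0, 2.0, 20.0, 21.0])
```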

These represent the formation of abstractions from experiences.

Phase 4

Using and Sharing Experiences

Once notions of circumstance, situation and experience can be identified, an agent may use them to become more contextually aware, informing its behaviour and understanding in response to those situations, i.e. using abstractions to influence its behaviour.

Defining Behaviour

Typically, behaviours in games are not generated from experience; instead, they are defined by pre-written behavioural logic or scripts that, when interpreted, determine how the agent should behave in various circumstances. These are predefined actions for predefined circumstances, meaning behaviour is not created; rather, pre-existing behaviour is triggered.

Our research proposes that it may be possible, through an experiential learning model/approach, to dynamically generate these behavioural scripts as part of a continual “online” experience-gathering exercise, to refine them through continual variations in experience within the virtual reality via a real-time feedback-response loop, and ultimately to use them to produce behaviour in the future in response to contexts similar to those that generated them.

An approach developed by Robertson and Watson (2015) generated behaviour trees (BTs) from supplied action sequences, where the focus was on predefined actions represented textually (such as “train”, “evolve”, etc.) in order to remove redundant common behaviours. Our research could instead represent actions as more detailed compositions of S-R sequences derived from actual agent observations. BTs are appropriate because they produce reactive behaviour, which aligns well with the stimulus-response model. This would allow agents to define and create behaviour as they experience situations, as new experience-based rules, which the agent can use as part of its behavioural knowledge either to carry out those behaviours in the future or to recognise specific behaviours from their S-R sequences encountered in real time, or “online”, while experiencing its virtual reality. Equally, these learnt behaviours could be transferred to other agents for them to use when situations (context) call for it.
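To make the idea concrete, here is a minimal sketch of turning an observed S-R sequence into a reusable behaviour-tree fragment. The node classes are simplified illustrations (real BT implementations also have selectors, decorators and running states), and this is not Robertson and Watson's algorithm:

```python
class Action:
    """Leaf node: performs one named action when ticked."""
    def __init__(self, name):
        self.name = name
    def tick(self, ctx):
        ctx.setdefault("executed", []).append(self.name)
        return True  # assume the action succeeds in this sketch

class Sequence:
    """Composite node: runs children in order, failing as soon as one fails."""
    def __init__(self, children):
        self.children = children
    def tick(self, ctx):
        return all(child.tick(ctx) for child in self.children)

def tree_from_observation(sr_sequence):
    """Turn the response half of an observed S-R sequence into a sequence node."""
    return Sequence([Action(response) for _, response in sr_sequence])

observed = [("sight:door", "approach"), ("touch:handle", "open"),
            ("sight:room", "enter")]
tree = tree_from_observation(observed)
ctx = {}
tree.tick(ctx)  # replays approach, open, enter in order
```

Because the tree is plain data, it could equally be serialised and handed to another agent, which is the transfer idea raised at the end of the paragraph above.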

The results of externalising “online” experience and other dynamic abstractions could also be used to inform more traditional, static and “offline” training models, which might help make static artificial neural networks (ANNs) more contextually aware by incorporating the variable experiences that only “live”, online training data can provide. This combination of experience and offline deliberation might mitigate failures in traditional ANNs, which often occur because “…internal rules become fixed and do not adapt to new situations of use”, and could help augment traditional ANNs that otherwise could not “…adapt fast enough because the training algorithms are too slow to meet real-time requirements” in the real world (Denning and Arquilla, 2022). It is not clear whether ANNs would need to be designed from the outset to train from sensory input, or whether existing networks that have never witnessed causality data would need to be adapted or re-engineered.

This externalisation of learning and experience, such as saving S-R link sequences or higher-level synthesised situational abstractions, might also be useful for a static analysis of learning performance, providing empirical evidence about what learning actually took place and what has been learnt from any particular experience of a situation. This might include the provenance and formulation details of the underlying expectations/rules/knowledge that were synthesised as part of that learning. This can be used as a basis for evaluating the performance of the underlying model to measure progress, intelligence or learning capability of our model, but more importantly, in showing what specific data/situation/experience contributed to a learnt behaviour.

Social Context and Shared Experience

An extension of the experiential system proposed by this research is the possibility of gaining experience by interacting with other, similar learning agents. This might be realised through the networking of autonomous agents, placing it within the field of multi-agent research.

If primitive experience might be simulated through the evaluation and establishment of S-R links as the basis of experiential knowledge formation, this may be externalised (saved), and then reused and transferred between agents. Such a mechanism may form the basis for distributed advice or learning among collaborating, social agents.
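The externalise-and-transfer mechanism might be sketched as serialising one agent's S-R links and merging them into another agent's knowledge base. The function names and flat dictionary representation are hypothetical simplifications:

```python
import json

def externalise(links):
    """Save an agent's S-R links in a portable form for transfer."""
    return json.dumps([{"stimulus": s, "response": r} for s, r in links])

def internalise(payload, knowledge):
    """Merge transferred links into a receiving agent's knowledge base."""
    for link in json.loads(payload):
        knowledge[link["stimulus"]] = link["response"]
    return knowledge

# A "teacher" agent externalises what it has learnt; a "learner" absorbs it.
teacher_links = [("sight:pit", "jump"), ("sound:alarm", "hide")]
learner_knowledge = internalise(externalise(teacher_links), {})
# The learner can now respond to pits and alarms without first-hand experience.
```

This is the sense in which learning need not require initial practical experience: the formalised experience travels instead of the situation being relived.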

The ramification of this is that learning need not require initial practical experience once that situational experience has already been formalised, or when CSIs match (akin, perhaps, to transfer learning in ML). An agent might be able to pick up foundational, sensory-experience-derived advice from other agents as it progresses through levels, situational tests or other virtual realities, for example.

It would be interesting to find out if, as research suggests (Nonaka and Krogh, 2009), tacit knowledge, which is not initially captured as explicit knowledge, can be acquired through additional transferred experiences and therefore enhance and improve prior learnt knowledge. This might, however, be subject to the same data loss that occurs when any simplification/abstraction is formed.

The variation of experiences and perspectives as described would be achievable in the controlled simulation environment through loading of different situational trials, levels and alternative environmental or world circumstances.

In addition, research suggests that distributed, communicative agents producing swarm-like behaviour can allow multiple individuals to contribute to the monitoring of environmental change and provide feedback about observed variances (Greengard, 2022). This is particularly relevant to the creation of swarms of virtual perception sensors, as previously discussed, to contribute to environmental or situational monitoring within virtual worlds. Virtual worlds in this respect are themselves an abstraction of the execution environment, in which reasoning can be established from the data present within them. This would likely alleviate a single point of sensing failure (or bias) and form part of a decentralised, distributed observational capability that senses change in general.

Objectives

Analysing abstractions

Key research goals in this phase are based on using established situational knowledge to create and inform a situational perspective or basis from which an agent can:

  1. Learn about the emergent nature of captured situations through analysis (possibly ML) of newly constructed situational knowledge.
  2. Identify relationships and expectations between experiences and underlying goals.
  3. Produce appropriate reactions in response to situations (behaviour selection), e.g. through the generation of situational behaviour trees.
  4. Externalise experience about situations and distribute the learning and sharing of situational knowledge between collaborating agents.

Research into simulating agent autonomy based on effective reasoning about circumstances is likely to require investigation of contemporary AI approaches and theory. Implementing a strategy for agent autonomy will involve examining the types of intelligence models that may help discover emergent rules from complex circumstantial data, such as machine learning (ML), and how cognitive maps of circumstance might be formed.

This phase might produce some interesting opportunities to explore new synthesised strategies (based on analysis of sensory experiences) and then feed this back into the research simulation based on analysis of patterns and results.

This phase is also concerned with the creation of distributed and shared experiences between agents.

Ultimately, the goal in this area of research is to determine the requirements and theory needed to model novel uses of the abstractions formed by the agent.

Phase 5

Reporting and Outcomes

The final phase of the research is to write up the findings and observations from the research objectives performed in the prior phases.

The outcomes that might be reported include, but are not limited to:

  • The learnings discovered while creating perception in agents and environments, including the construction of perceptual software models
  • The analysis of experiences and situations
  • The empirical outcomes of the conjectures posed throughout this proposal, such as those on expectations and modelling the unknown
  • The usages of the externalised or formalised representation of experience and situations, including relevance to AI and learning
  • Approaches to solving the problem of context and fixed rules for responses.

Our research has the opportunity to explore and contribute to multiple disciplines, for example epistemology, psychology, artificial intelligence, simulation, and computer games technologies such as graphics and networking.

A list of topics relevant to our research is given below.

Appendix B - Illustrations

  1. Crytek’s Target Tracks Perception System (TTPS) (Rabin, 2014a)
  2. Subscribing to stimuli in the environment (Epic’s Unreal AI Perception System)
  3. How an agent might actively observe its environment.

References

 

‘AI Perception’ (no date). Available at: https://docs.unrealengine.com/5.1/en-US/ai-perception-in-unreal-engine/ (Accessed: 5 January 2023).

Burnes, B. and Cooke, B. (2013) ‘Kurt Lewin’s Field Theory: A Review and Re-evaluation’, International Journal of Management Reviews, 15(4), pp. 408–425. doi: 10.1111/j.1468-2370.2012.00348.x.

Ceria, A. and Wang, H. (2023) ‘Temporal-topological properties of higher-order evolving networks’, Scientific Reports, 13(1), p. 5885. doi: 10.1038/s41598-023-32253-9.

Denning, P. J. and Arquilla, J. (2022) ‘The context problem in artificial intelligence’, Communications of the ACM, 65(12), pp. 18–21. doi: 10.1145/3567605.

Greengard, S. (2022) ‘Swarm robotics moves forward’, Communications of the ACM, 65(12), pp. 12–14. doi: 10.1145/3565979.

Kim, S. D. (2012) ‘Characterizing Unknown Unknowns’. Available at: https://www.pmi.org/learning/library/characterizing-unknown-unknowns-6077 (Accessed: 6 December 2023).

Masuda, N. and Holme, P. (2019) ‘Detecting sequences of system states in temporal networks’, Scientific Reports, 9(1), p. 795. doi: 10.1038/s41598-018-37534-2.

Mathews, S. (2022) ‘Cppgamelib’. Available at: https://github.com/stumathews/cppgamelib (Accessed: 21 January 2023). 

Nonaka, I. and von Krogh, G. (2009) ‘Tacit Knowledge and Knowledge Conversion: Controversy and Advancement in Organizational Knowledge Creation Theory’, Organization Science, 20(3), pp. 635–652. doi: 10.1287/orsc.1080.0412.

Oettershagen, L. and Mutzel, P. (2022) ‘TGLib: An Open-Source Library for Temporal Graph Analysis’. arXiv. doi: 10.48550/arXiv.2209.12587.

Rabin, S. (2014a) ‘Crytek’s Target Tracks Perception System’, in Game AI Pro. A K Peters/CRC Press, pp. 432–441. doi: 10.1201/b16725-37.

Rabin, S. (2014b) ‘How to Catch a Ninja: NPC Awareness in a 2D Stealth Platformer’, in Game AI Pro. A K Peters/CRC Press, pp. 442–451. doi: 10.1201/b16725-38.

Robertson, G. and Watson, I. (2015) ‘Building behavior trees from observations in real-time strategy games’, in. IEEE, pp. 1–7. doi: 10.1109/INISTA.2015.7276774.

Wu, H. et al. (2014) ‘Path problems in temporal graphs’, Proceedings of the VLDB Endowment, 7(9), pp. 721–732. doi: 10.14778/2732939.2732945.

Zhan, X.-X. et al. (2021) ‘Measuring and utilizing temporal network dissimilarity’, arXiv.org. Available at: https://arxiv.org/abs/2111.01334v1 (Accessed: 1 January 2024).