Blog - Stuart Mathews

Details: Category: Blog; By Stuart Mathews; 06.Sep; 06 September 2024; Last Updated: 06 September 2025; Hits: 417

Research

Avenue of Research	Topics
Developing sensory environments	Modelling the formation of environments that emit data that can be sensed by inhabitants.
Simulating virtual perception	Modelling the ability to experience sensation from virtual reality.
Developing sensory agents	Modelling agents that feel, perceive and experience sensations received from their environment and themselves.
Defining situational circumstance	Processes and methodologies for characterising and deriving context from acquired situational information. Modelling, identifying and discerning arbitrary situations.
Agent knowledge formation	Methods and mechanisms for information acquisition and discovery. Developing situational awareness through virtual agent experience. Codification of situational information into contextual situational knowledge.
Producing perceptual behaviour	Use of contextual knowledge to create/trigger situational responses. Defining what influences and drives agent behaviour and reactions, including motivation, purpose and goals.
Situational agent learning	Deriving patterns and rules from experienced situations. Incorporating situational learning aids to influence learning. Modelling behaviour selection based on situational knowledge.
Modelling and quantifying agent experience	Conceptual modelling and simulation of experience in agents. Simulating experiential discovery and classification of observations.
Distributed and social agent learning	Communication and sharing of knowledge and experience between agents.
Situational analysis	Definition and simulation of types of learning situations.

Figure 3 shows potential avenues and directions for this research, followed by a table defining the avenues of research.

Details: Category: Blog; By Stuart Mathews; 06.Sep; 06 September 2024; Last Updated: 29 September 2025; Hits: 371

Research

Project Terminology

Term(s)	Meaning
Virtual Environment	The playground in which a reality is simulated.
Situational awareness	Having contextual knowledge of the current situation
Perceptual understanding	Understanding obtained through sense/perception.
Simulated artificial entity	Agent, Game Character, NPC, Robot, an abstraction of a human intelligence.
Agent reasoning repertoire	Cognitive capabilities that the agent can use.
Agent	Simulated abstraction of intelligence.
Knowledge Acquisition	The ability to use information to make logical choices/decisions and inform reactions.
Reasoning	Interpreting a situation and being able to choose/use/select knowledge that applies to it.
Situational Rules	Contextual and situational Patterns, Repeatable cause & effect
Situational Circumstance	Information that defines circumstance(s) which make up a situation.
Experiential	An active process or experience of looking for, or obtaining information about, varying contextual situations.
Perceptual, experiential knowledge	Knowledge that is obtained through sources of perception.
Adaptable, perceptive, reactive	Adaptable: behaviour and understanding might change based on different knowledge available. Perceptive: Utilising continual perceptual monitoring. Reactive: Showing a response to a stimulus.
Real-time	Using a continual feedback processing model.
Actionable situational knowledge	The ability to use that knowledge in a detected situation
Instinctive	Based on rules for behaviour in circumstances.
Emergent Rules	Rules can be derived, tuned, and evolve according to an ongoing process of rule detection/creation.

Online Learning

Term	Meaning
Markov Chain (MC)	A model of state transitions represented as a Transition Matrix
Markov Decision Process (MDP)	A model of a game, as described in Markov Decision Processes
Q-Learning	An approach to reinforcement learning that aims to solve MDPs
Q-Value	A value metric for an action/move
Policy	A decision-making rule or action selection strategy
Decision	A choice that selects an Action
Deterministic Policy	Selects an action
Stochastic Policy	Provides possible actions as probabilities, e.g. Left 0.7, Right 0.3
Online-learning	Dynamic dataset
Offline-learning (aka batch learning)	Static/Fixed dataset
Model free	Does not learn or use the rules of the environment, i.e. the environment's transition dynamics
Model-based	Uses or learns the rules of the environment (MDP's Transition Probabilities or MC's Transition Matrix)
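
The terms in the table above can be made concrete with a small sketch. It assumes a hypothetical two-state, two-action toy setting; the state names, probabilities, and helper names (`next_state`, `sample_action`, `greedy_action`, `q_update`) are illustrative and do not come from the project itself. The update shown is the standard tabular Q-learning rule, which is model-free because it only uses the observed transition.

```python
import random

# Markov Chain: row i holds the probabilities of moving from state i to each state.
# The states and numbers are illustrative assumptions.
STATES = ["calm", "alert"]
TRANSITION_MATRIX = [
    [0.9, 0.1],  # from "calm"
    [0.5, 0.5],  # from "alert"
]

def next_state(state_index, rng):
    """Sample the next state index using the chain's transition matrix."""
    return rng.choices(range(len(STATES)), weights=TRANSITION_MATRIX[state_index])[0]

ACTIONS = ["Left", "Right"]

# Stochastic policy: one probability per action (the table's Left 0.7, Right 0.3).
STOCHASTIC_POLICY = {"Left": 0.7, "Right": 0.3}

def sample_action(policy, rng):
    """Draw an action according to the policy's probabilities."""
    actions = list(policy)
    return rng.choices(actions, weights=[policy[a] for a in actions])[0]

def greedy_action(q_values, state):
    """Deterministic policy: always select the action with the highest Q-value."""
    return max(ACTIONS, key=lambda a: q_values[(state, a)])

def q_update(q_values, state, action, reward, new_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step; alpha is the learning rate, gamma the discount."""
    best_next = max(q_values[(new_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q_values[(state, action)] += alpha * (target - q_values[(state, action)])
    return q_values[(state, action)]

# All Q-values start at zero; one rewarded experience raises Q("calm", "Right").
q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
q_update(q, "calm", "Right", reward=1.0, new_state="alert")
print(greedy_action(q, "calm"))  # prints "Right"
```

A model-based variant would additionally learn or consult `TRANSITION_MATRIX`-style dynamics when choosing actions; the sketch deliberately does not.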

General Ideas

Term	Meaning
Experience	Actions and results (cause and effect)
Use of historical data	Knowledge
Control policies	Decision-making strategies
Minimizing failed expectations	Need for learning
Goals/Rewards	Often tied to Policies that aim to reach goals
Psychology	Understanding behaviour and what causes it

Details: Category: Blog; By Stuart Mathews; 06.Sep; 06 September 2024; Last Updated: 06 September 2025; Hits: 500

When considering a single learning agent in a virtualised world, it is worth noting research showing that, in social contexts, humans who see or experience other people’s reactions or emotions, i.e. observe stimuli and the resultant responses in a social environment, can come to show the same reactions themselves, for example, shared disgust (Sowden, Khemka and Catmur, 2022).

This phenomenon is called Mirroring, and appears to be a mechanism that humans and animals, and possibly agents, could use to begin learning, specifically in unexpected, new and unknown situations.

While individual exploration may help an individual generalise from variances in experience, a step up would be incorporating the variances introduced by others. This is essentially social exploration to bring about social learning.

Social contexts appear to provide more variety than individual experiences, which would otherwise be singular and stable (biased) for each individual. One might therefore reasonably expect that a group’s response (drawing on more collective experience) would be better than a personal, single and therefore limited one, particularly when the individual lacks personal experience of that particular circumstance.

This might be a particularly useful social adaptation to reduce the cognitive overhead of determining appropriate behaviour for an unfamiliar situation for ourselves. It may also be an optimisation strategy for learning about unknown or unfamiliar situations generally, for example, exposing oneself to new (and therefore unfamiliar) topics taught in a classroom. In the same way, unmet expectations and emotions might be used as an optimisation for detecting a lack of knowledge, and therefore the need for more learning. Research has also shown that we tend to abide by the group, and this might be a reason why this is an attractive strategy to avoid the rigorous process of learning individually.
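
As a rough illustration of using unmet expectations to detect a lack of knowledge, the sketch below flags a need for learning whenever a situation is unfamiliar or its outcome differs from what was expected. The `ExpectationMonitor` class, its fields, and the example situation names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ExpectationMonitor:
    """Tracks the outcome an agent expects to follow each situation (hypothetical helper)."""
    expectations: dict = field(default_factory=dict)  # situation -> expected outcome
    failures: dict = field(default_factory=dict)      # situation -> failed-expectation count

    def observe(self, situation, outcome):
        """Return True when the observation signals a need for learning."""
        expected = self.expectations.get(situation)
        if expected is None:
            # No knowledge of this situation yet: record it and flag learning.
            self.expectations[situation] = outcome
            return True
        if expected != outcome:
            # The expectation failed: count it and revise the expectation.
            self.failures[situation] = self.failures.get(situation, 0) + 1
            self.expectations[situation] = outcome
            return True
        # The expectation was met, so nothing new needs to be learned.
        return False

monitor = ExpectationMonitor()
print(monitor.observe("door_opens", "light_on"))  # True: unfamiliar situation
print(monitor.observe("door_opens", "light_on"))  # False: expectation met
print(monitor.observe("door_opens", "alarm"))     # True: expectation failed
```

Overwriting the expectation on every failure is the simplest possible revision rule; an agent could equally keep counts per outcome and only revise after repeated failures.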

Simulating forms of social knowledge acquisition (social exploration) could help inform how a naive agent might respond in new social situations, such as a group of agents experiencing a specific situation purely through observation and collection of S-R links and CSIs. In this way, social collaboration likely yields more opportunities to establish knowledge to inform behaviour in a more timely manner, particularly if the situation is unknown.

Indeed, from a behavioural standpoint, if others’ behaviours are remotely similar to how we might begin to react (perhaps determined by a measure of experience codified as confidence), then we, or an agent, are unlikely to need much convincing (or to perform more contextual processing) to accept/learn others’ behavioural reactions as our own.
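
One simplified way to sketch this idea is to let an agent defer to the group’s most common reaction whenever its own confidence is low. The function name `choose_response`, the `threshold` parameter and the example responses are assumptions made for illustration, and majority voting is only one of many possible ways to combine observed reactions.

```python
from collections import Counter

def choose_response(own_response, own_confidence, observed_responses, threshold=0.6):
    """Pick a response to a situation, deferring to the group when unsure.

    own_confidence is a value in [0, 1] standing in for accumulated experience.
    observed_responses is a list of reactions seen from other agents.
    """
    if not observed_responses or own_confidence >= threshold:
        # Confident enough, or nobody to observe: keep the agent's own reaction.
        return own_response
    # Low confidence: adopt the most common reaction observed in the group.
    group_response, _count = Counter(observed_responses).most_common(1)[0]
    return group_response

others = ["flee", "flee", "freeze"]
print(choose_response("approach", 0.2, others))  # prints "flee"
print(choose_response("approach", 0.9, others))  # prints "approach"
```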

Teaching and mentoring, i.e. the transfer of knowledge (and identification of a lack of situational knowledge), might thus be possible by allowing naive agents to experience or observe other agents’ responses in simulated social situations, and teaching might be simulated through the transmission of the aforementioned externalised agent knowledge.

In the context of simulation, one might be able to introduce other agents’ reactions through a shared social context (a meeting of agents within the same situation) to simulate the sharing of experience and resulting behaviour. This might then be used as the basis of new agent learning. There is also an opportunity to incorporate multi-player agents.

Generally, it might be possible to simulate social interactions between distributed agents to try to improve their own experience. Being social, and being in a social context, may provide important variance in the circumstantial experience available in the situational environment and increase the amount of information available for learning, in much the same way that variance in training data is an important way to improve learning in ML.

One reason why contemporary AI approaches such as Artificial Neural Networks (ANNs) often fail is that “…new context powerfully reveals human cognitive biases in the selection of the training data.” However, integrating training data from disparate sources might alleviate this bias and provide the contextual variance that the individual did not select (or had not been exposed to), and that their independent learning therefore did not cater for (Denning and Arquilla, 2022).

Early learning

In collaborating environments in nature, initial intellectual growth (beginning to learn) about unknown circumstances appears to be modelled through specific kinds of learning behaviour, which appears to assist early learning. These behaviours include imitation, role-modelling and imprinting. The first two are obvious, while imprinting refers to a form of rapid learning that occurs just after being born (McCabe, 2013). For example, newly hatched chicks physically copy/replicate everything their mothers do, which might be among the first types of learning that occurs in response to having no experience with a given situation.

Some of these social models, such as imprinting and imitation, appear to represent a way to form an initial level of knowledge from a clean slate (tabula rasa), so to speak, as proposed by Locke. This appears to be safer than the alternative, which would be to randomly stimulate all of one’s physical senses or abilities to produce arbitrary behaviour, i.e. to learn from them. This is especially true in nature, as such arbitrary behaviour would likely draw unwanted attention from predators.

That this learning mechanism appears to exist instinctively after birth suggests that some knowledge is, in fact, innate. So while Locke’s idea of initial learning starting from a blank slate (tabula rasa) is unlikely to be entirely accurate (though it might be), it is likely that learning does accumulate. This type of initial learning might be triggered by the identification that a high-priority goal exists when no base knowledge for it exists. This might suggest that a form of initial behaviour replication could be encoded into agents when they start…or are ‘born’ (Bigelow et al., 2018). In this way, one might begin simulating the teaching of an agent to grow its initial intellect.
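
A minimal sketch of such initial behaviour replication might look like the following, assuming a hypothetical `Agent` class whose knowledge is a simple mapping from situations to responses. A newly ‘born’ agent with no base knowledge copies a mentor’s response and retains it; the class, method and situation names are illustrative only.

```python
class Agent:
    """A toy agent whose knowledge maps situations to responses (hypothetical)."""

    def __init__(self, knowledge=None):
        self.knowledge = dict(knowledge or {})  # situation -> response

    def respond(self, situation, mentor=None):
        """Respond from own knowledge, or imprint on a mentor when none exists."""
        if situation in self.knowledge:
            return self.knowledge[situation]
        if mentor is not None and situation in mentor.knowledge:
            # No base knowledge: replicate the mentor's behaviour and keep it.
            self.knowledge[situation] = mentor.knowledge[situation]
            return self.knowledge[situation]
        # Neither the agent nor a mentor knows what to do here.
        return None

mother = Agent({"predator_nearby": "hide"})
chick = Agent()  # 'born' with no knowledge
print(chick.respond("predator_nearby", mentor=mother))  # prints "hide"
print(chick.respond("predator_nearby"))                 # prints "hide" (now retained)
```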

Social learning would represent a much later, higher-order task in this research that depends on having first established prior dependencies, including a consistent observation protocol, a simulation environment extended to support collaborating agents and shared experiences and an ability to reason based on observation.

References

Sowden, S., Khemka, D. and Catmur, C. (2022) ‘Regulating mirroring of emotions: A social-specific mechanism?’, Quarterly journal of experimental psychology (2006), 75(7), pp. 1302–1313. doi: 10.1177/17470218211049780.

Denning, P. J. and Arquilla, J. (2022) ‘The context problem in artificial intelligence’, Communications of the ACM, 65(12), pp. 18–21. doi: 10.1145/3567605.

McCabe, B. J. (2013) ‘Imprinting’, Wiley interdisciplinary reviews. Cognitive science, 4(4), pp. 375–390. doi: 10.1002/wcs.1231.

Bigelow, A. E. et al. (2018) ‘The Effect of Maternal Mirroring Behavior on Infants’ Early Social Bidding During the Still‐Face Task’, Infancy, 23(3), pp. 367–385. doi: 10.1111/infa.12221.
