Action Policy Arbitrage

1. Preamble

A. The state of a person’s life is the result of an accumulation of decisions.

B. The decisions that a person makes form a pattern.

C. From this pattern it is possible to derive an emergent policy of action (Action Policy).

D. An Action Policy is the ruleset by which actions are generated from beliefs about the person’s environment: “Given X facts, make Y action”.

E. Notionally, an Action Policy is an algorithm for decision-making given incomplete knowledge about an environment (i.e. an Action Policy is an approximate algorithmic solution to a Partially Observable Markov Decision Process).
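As an illustration of "Given X facts, make Y action", the following is a minimal sketch of a policy that maps beliefs about a hidden state to an action. It is a greedy, one-step rule and not a full solver for a Partially Observable Markov Decision Process. The states, actions, and payoff values are hypothetical and chosen only for the example.

```python
from typing import Dict

# Hypothetical example: every state, action, and payoff below is illustrative.
Belief = Dict[str, float]  # probability assigned to each possible hidden state

PAYOFF = {
    ("rain", "take_umbrella"): 1.0,
    ("rain", "leave_umbrella"): -2.0,
    ("dry", "take_umbrella"): -0.5,
    ("dry", "leave_umbrella"): 1.0,
}
ACTIONS = ["take_umbrella", "leave_umbrella"]


def expected_payoff(belief: Belief, action: str) -> float:
    """Average the payoff of an action over the states the person believes possible."""
    return sum(p * PAYOFF[(state, action)] for state, p in belief.items())


def action_policy(belief: Belief) -> str:
    """Given X beliefs, make Y action: choose the action with the highest expected payoff."""
    return max(ACTIONS, key=lambda a: expected_payoff(belief, a))
```

Under these assumed payoffs, a person who believes rain is likely takes the umbrella, and one who believes it unlikely leaves it. The same ruleset produces different actions from different beliefs.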

2. Passive Amendment to the Action Policy

A. An Action Policy is subconsciously and continuously revised and updated through:

Learning
Divine Experience
Trauma

B. In general, revisions to the Action Policy stimulated by Learning and Divine Experience are edifying, and lead to greater Agency and Victory, while revisions resulting from Trauma are regressive, leading to risk-aversion and Failure.

The shrewd pupil is wary of Trauma disguised as Learning.
The pupil also acknowledges that Divine Experience may be traumatic, but still be edifying.
The pupil knows that Unbeautiful Experience is traumatic.

C. All non-divine experience is metabolised as either Learning or Trauma, or it is Forgotten.

3. Effortful Amendment to the Action Policy

A. An Action Policy can also be updated effortfully, through Discipline.

B. Discipline competes with the Inertia of an Action Policy, which makes the policy resistant to change. The prospect of Disciplinary Victory over Action Policy Inertia rises and falls as a function of Discipline’s cumulative past victories over Inertia, weighted for recency.

Alternatively phrased, like begets like exponentially, and the effect size of Discipline is therefore causally circular, either entropically or negentropically.
Prolonged negative emotion, malaise, and powerlessness are the result of continued Indulgence, concessions to Inertia, and an autocatalytic downturn in Agency stimulated by Failure and Defeat.
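The dependence of Discipline on its own recency-weighted record can be sketched numerically. The text gives no formula, so this is one possible reading and not the author's method: past outcomes are summed with exponential decay, and the total is weighed against Inertia through a logistic curve. The function name, the `decay` and `inertia` parameters, and the +1/-1 encoding are all assumptions.

```python
import math
from typing import Sequence


def discipline_prospect(history: Sequence[int], decay: float = 0.8,
                        inertia: float = 1.0) -> float:
    """Estimate the prospect of the next Disciplinary Victory, as a value in (0, 1).

    history: past outcomes, oldest first; +1 for a victory over Inertia,
             -1 for a concession to it (hypothetical encoding).
    decay:   how much weight an outcome keeps per step of age (assumed).
    inertia: the resistance that the accumulated record must overcome (assumed).
    """
    score = 0.0
    for outcome in history:
        # Older outcomes fade; the most recent outcome counts in full.
        score = decay * score + outcome
    # Logistic squashing: a record well above Inertia approaches 1, well below approaches 0.
    return 1.0 / (1.0 + math.exp(-(score - inertia)))
```

In this sketch a run of victories raises the prospect of the next one and a run of concessions lowers it, which is the circularity described above. Given the same number of victories and concessions, the more recent outcome dominates.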

4. The Belief State

A. The Belief State is a predictive model of a person’s environment, and is coterminous with a person’s beliefs and presumptions about the state of their environment.

B. Implicit in, and derivable from, the Belief State, are accurate or inaccurate Predictions, which are anticipatory fantasies about the experiences expected to follow from Actions.

The shrewd pupil will understand that the only available input for measuring the accuracy of a prediction is the experience that results from acting on it. Predictions anticipate the Perceived Result, which the pupil must not mistake for the Real Result.
The accuracy of a prediction is measured by the extent of Victory achieved as a result of acting upon the prediction. Total Victory is achieved by acting on totally accurate predictions. Properly conceived, Victory is the sole measure of the accuracy of a prediction, and of the appropriateness of the Belief State.
Improperly conceived, predictions are accurate to the extent of their correlation with the phantasmic ‘reality’, and unrelated to Victory. This conception is improper because it is circular: the phantasmic ‘reality’ is actually the Belief State in disguise, an internal and mutable phenomenon mistaken for an external and static one.

C. Notionally, the Real Result consists of the Perceived Result (PR), reduced in proportion to Misattributed Causality (MR), and increased in proportion to the Unperceived Results (UR).

Real Result = (PR - MR) + UR
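A worked instance of the expression may help. The sketch below treats all three terms as quantities on a single common scale, which is an assumption, and reads MR as the portion of the Perceived Result owed to Misattributed Causality; the text does not expand the abbreviation. The numbers are invented for illustration.

```python
def real_result(perceived: float, misattributed: float, unperceived: float) -> float:
    """Compute (PR - MR) + UR, with all terms on one hypothetical common scale."""
    return (perceived - misattributed) + unperceived


# Illustrative figures only: the pupil perceives a gain of 10, of which 4 was
# caused by something other than his action, while a further gain of 3 went unnoticed.
example = real_result(perceived=10.0, misattributed=4.0, unperceived=3.0)
```

Here the Real Result is 9: smaller than the pupil perceived, although part of what he failed to perceive was in his favour.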

D. The Belief State is parsimoniously updated as a function of the Prediction Delta, a measurement of the difference between the Prediction and the Perceived Outcome based on discrete inferences of causal relation.

E. The State Transition Function is the function by which the Belief State is amended, and by which it resists amendment. The evolution of the Belief State is determined by the State Transition Function.
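One way to picture a parsimonious, resistant update is sketched below. It is a simplification under stated assumptions, not a definition drawn from the text: the Belief State is collapsed to a single expected value that doubles as the Prediction, the Prediction Delta is a plain difference, and `resistance` and `tolerance` are hypothetical parameters.

```python
def state_transition(belief: float, perceived_outcome: float,
                     resistance: float = 0.8, tolerance: float = 0.1) -> float:
    """Return the amended belief after one Perceived Outcome.

    belief:            the current expectation, which also serves as the Prediction.
    perceived_outcome: what the person perceived after acting.
    resistance:        share of the Prediction Delta that is ignored (assumed, 0 to 1).
    tolerance:         deltas at or below this size change nothing (assumed).
    """
    delta = perceived_outcome - belief  # the Prediction Delta
    if abs(delta) <= tolerance:
        # Parsimony: a small surprise is not worth revising the belief for.
        return belief
    # Resistance: only part of the surprise is absorbed into the belief.
    return belief + (1.0 - resistance) * delta
```

With `resistance` at 1.0 the belief never moves, whatever is perceived. With `resistance` at 0.0 the belief is replaced outright by each surprising outcome.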

5. Action Policy Arbitrage

A. Notionally, the Belief State can be amended directly and effortfully through Discipline, if the pupil is in a state of Ultimate Agency. Most pupils will never achieve true Discipline, let alone Ultimate Agency.

B. Through Shrewd Interrogation, it is possible to reverse-engineer the priors of another person’s Action Policy.

C. Through Comparative Evaluation, it is possible to identify the inferiorities of one’s own Action Policy, and the corresponding priors which underlie the inferiorities.

D. The process of amending one’s own Action Policy on the basis of these comparisons is Action Policy Arbitrage.

The Pupil has Learned:

All non-divine experience is metabolised as either Learning or Trauma, or it is Forgotten.
The pupil’s Perceived Result is easily mistaken for the Real Result.
The phantasmic ‘reality’ is actually the pupil’s Belief State.
Total Victory is the final measure of the accuracy of Predictions and Beliefs.
Ultimate Agency is required to force amendments to the Belief State.
Thusly, Ultimate Agency materialises TOTAL VICTORY.

Thusly the Pupil May Make Himself Holy:

Observing the Belief State and Action Policy of others
Amending his own Action Policy through Discipline and Action Policy Arbitrage
Controlling, limiting, or metabolising Unbeautiful Experience
Increasing openness to Divine Experience and devoting himself to Learning
Directly Amending the Belief State to achieve Total Victory