Observations

An observation is an object that records simulation results; it is responsible for initializing, gathering, updating, and formatting those results.

The provided Observation class is an abstract base class that should be subclassed by concrete observations. Apart from the abstract is_stratified class method, there are no required methods to define when subclassing; the class provides common attributes as well as an observe method that determines whether to observe results for a given event.

At the highest level, an observation can be categorized as either an UnstratifiedObservation or a StratifiedObservation. More specialized implementations of these classes define the methods that the parent class otherwise takes as attributes.
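The initialize, gather, update, and format responsibilities can be pictured with a small pandas-only sketch. The four functions below are hypothetical stand-ins for the callables an observation carries (results_initializer, results_gatherer, results_updater, results_formatter). They are not the library's implementations, and the "alive" column and the measure name are invented for illustration.

```python
import pandas as pd

# Single-row index used by every function below (label "all" is an assumption).
INDEX = pd.Index(["all"], name="stratification")


def initialize() -> pd.DataFrame:
    # Raw results before the simulation starts: one row of zeros.
    return pd.DataFrame({"value": [0.0]}, index=INDEX)


def gather(population: pd.DataFrame) -> pd.DataFrame:
    # New results for one time step: count simulants flagged as alive.
    return pd.DataFrame({"value": [float(population["alive"].sum())]}, index=INDEX)


def update(existing: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    # Fold the newly gathered results into the running totals.
    return existing + new


def format_results(measure: str, results: pd.DataFrame) -> pd.DataFrame:
    # Turn the raw results into a flat table labelled with the measure name.
    formatted = results.reset_index()
    formatted.insert(0, "measure", measure)
    return formatted


raw = initialize()
for step_population in (
    pd.DataFrame({"alive": [True, True, False]}),
    pd.DataFrame({"alive": [True, False, False]}),
):
    raw = update(raw, gather(step_population))

final = format_results("living_simulant_steps", raw)
```

Keeping the four steps as separate callables is what lets the specialized classes further down fix some of them while leaving the rest to the caller.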

class vivarium.engine.framework.results.observation.Observation(name, population_filter, when, requires_attributes, results_initializer, results_gatherer, results_updater, results_formatter, to_observe, stratifications=None)

An abstract base dataclass to be inherited by concrete observations.

This class includes an observe method that determines whether to observe results for a given event.

Parameters:
  • name (str)

  • population_filter (PopulationFilter)

  • when (str)

  • requires_attributes (list[str])

  • results_initializer (Callable[[], pd.DataFrame])

  • results_gatherer (Callable[[pd.DataFrame | DataFrameGroupBy[tuple[str, ...] | str, bool], tuple[str, ...] | None], pd.DataFrame])

  • results_updater (Callable[[pd.DataFrame, pd.DataFrame], pd.DataFrame])

  • results_formatter (Callable[[str, pd.DataFrame], pd.DataFrame])

  • to_observe (Callable[[Event], bool])

  • stratifications (tuple[Stratification, ...] | None)

name: str

Name of the observation. It will also be the name of the output results file for this particular observation.

population_filter: PopulationFilter

A named tuple of population filtering details. The first item is a Pandas query string to filter the population down to the simulants who should be considered for the observation. The second item is a boolean indicating whether to include untracked simulants in the observation.
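As a rough sketch of how such a two-item filter might be applied with pandas: the class below is a hypothetical stand-in, its field names (query, include_untracked) are assumptions, and so is the idea that tracking status is stored in a boolean "tracked" column.

```python
from typing import NamedTuple

import pandas as pd


class PopulationFilterSketch(NamedTuple):
    """Hypothetical stand-in; the field names are assumptions."""

    query: str
    include_untracked: bool


def apply_filter(
    population: pd.DataFrame, pop_filter: PopulationFilterSketch
) -> pd.DataFrame:
    # Assumption: tracking status lives in a boolean "tracked" column.
    if not pop_filter.include_untracked:
        population = population[population["tracked"]]
    # An empty query string is treated here as "no further filtering".
    return population.query(pop_filter.query) if pop_filter.query else population


population = pd.DataFrame({"age": [10, 40, 70], "tracked": [True, False, True]})
tracked_adults = apply_filter(population, PopulationFilterSketch("age >= 18", False))
all_adults = apply_filter(population, PopulationFilterSketch("age >= 18", True))
```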

when: str

Name of the lifecycle phase in which the observation should happen. Valid values are: “time_step__prepare”, “time_step”, “time_step__cleanup”, or “collect_metrics”.

requires_attributes: list[str]

The population attributes required for this observation.

results_initializer: Callable[[], pd.DataFrame]

Method or function that initializes the raw observation results prior to starting the simulation. This could return, for example, an empty DataFrame or one with a complete set of stratifications as the index and all values set to 0.0.
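Because the initializer takes no arguments, any configuration it needs has to be bound ahead of time. The sketch below shows the two shapes mentioned above, an empty DataFrame and a zero-filled one, using functools.partial to produce a zero-argument callable. The helper names, the "sex" index, and the "value" column are illustrative assumptions.

```python
from functools import partial
from typing import Callable

import pandas as pd


def empty_results() -> pd.DataFrame:
    # Simplest initializer: nothing recorded yet.
    return pd.DataFrame()


def zero_results(index: pd.Index, columns: list[str]) -> pd.DataFrame:
    # Every row of the given index starts at 0.0.
    return pd.DataFrame(0.0, index=index, columns=columns)


sex_index = pd.Index(["Female", "Male"], name="sex")

# Bind the arguments so the result matches Callable[[], pd.DataFrame].
results_initializer: Callable[[], pd.DataFrame] = partial(
    zero_results, sex_index, ["value"]
)
initial = results_initializer()
```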

results_gatherer: Callable[[pd.DataFrame | DataFrameGroupBy[tuple[str, ...] | str, bool], tuple[str, ...] | None], pd.DataFrame]

Method or function that gathers the new observation results.

results_updater: Callable[[pd.DataFrame, pd.DataFrame], pd.DataFrame]

Method or function that updates existing raw observation results with newly gathered results.

results_formatter: Callable[[str, pd.DataFrame], pd.DataFrame]

Method or function that formats the raw observation results.

to_observe: Callable[[Event], bool]

Method or function that determines whether to perform an observation on this Event.
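A to_observe callable is a predicate on the event. The sketch below uses a hypothetical, minimal Event stand-in with only a time field (an assumption; the real Event type is not described here) and shows one permissive predicate and one selective predicate.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class EventSketch:
    """Hypothetical minimal event; only a timestamp is assumed."""

    time: pd.Timestamp


def observe_always(event: EventSketch) -> bool:
    # Observe on every event.
    return True


def observe_on_new_year(event: EventSketch) -> bool:
    # Observe only when the event falls on January 1.
    return event.time.month == 1 and event.time.day == 1


new_year = EventSketch(pd.Timestamp("2025-01-01"))
mid_year = EventSketch(pd.Timestamp("2025-06-15"))
```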

stratifications: tuple[Stratification, ...] | None = None

Optional tuple of the Stratifications this observation should use.

observe(df, stratifications)

Gathers the results of the observation.

Parameters:
  • df (pd.DataFrame | DataFrameGroupBy[tuple[str, ...] | str, bool]) – The population or population grouped by the stratifications.

  • stratifications (tuple[str, ...] | None) – The stratifications to use for the observation.

Returns:

The results of the observation.

Return type:

pd.DataFrame

abstractmethod classmethod is_stratified()
Return type:

bool

class vivarium.engine.framework.results.observation.UnstratifiedObservation(name, population_filter, when, requires_attributes, results_gatherer, results_updater, results_formatter, to_observe=<function UnstratifiedObservation.<lambda>>)

Concrete class for observing results that are not stratified.

The parent class stratifications are set to None and the results_initializer method is explicitly defined.

Parameters:
  • name (str)

  • population_filter (PopulationFilter)

  • when (str)

  • requires_attributes (list[str])

  • results_gatherer (Callable[[pd.DataFrame], pd.DataFrame])

  • results_updater (Callable[[pd.DataFrame, pd.DataFrame], pd.DataFrame])

  • results_formatter (Callable[[str, pd.DataFrame], pd.DataFrame])

  • to_observe (Callable[[Event], bool])

name

Name of the observation. It will also be the name of the output results file for this particular observation.

population_filter

A named tuple of population filtering details. The first item is a Pandas query string to filter the population down to the simulants who should be considered for the observation. The second item is a boolean indicating whether to include untracked simulants in the observation.

when

Name of the lifecycle phase in which the observation should happen. Valid values are: “time_step__prepare”, “time_step”, “time_step__cleanup”, or “collect_metrics”.

requires_attributes

The population attributes required for this observation.

results_gatherer

Method or function that gathers the new observation results.

results_updater

Method or function that updates existing raw observation results with newly gathered results.

results_formatter

Method or function that formats the raw observation results.

to_observe

Method or function that determines whether to perform an observation on this Event.

classmethod is_stratified()
Return type:

bool

static create_empty_df()

Initializes an empty dataframe.

Return type:

DataFrame

Returns:

An empty DataFrame.

class vivarium.engine.framework.results.observation.StratifiedObservation(name, population_filter, when, requires_attributes, results_updater, results_formatter, aggregator_sources, aggregator, to_observe=<function StratifiedObservation.<lambda>>)

Concrete class for observing stratified results.

The parent class results_initializer and results_gatherer methods are explicitly defined and stratification-specific attributes aggregator_sources and aggregator are added.

Parameters:
  • name (str)

  • population_filter (PopulationFilter)

  • when (str)

  • requires_attributes (list[str])

  • results_updater (Callable[[pd.DataFrame, pd.DataFrame], pd.DataFrame])

  • results_formatter (Callable[[str, pd.DataFrame], pd.DataFrame])

  • aggregator_sources (list[str] | None)

  • aggregator (Callable[[pd.DataFrame], float | pd.Series[float]])

  • to_observe (Callable[[Event], bool])

name

Name of the observation. It will also be the name of the output results file for this particular observation.

population_filter

A named tuple of population filtering details. The first item is a Pandas query string to filter the population down to the simulants who should be considered for the observation. The second item is a boolean indicating whether to include untracked simulants in the observation.

when

Name of the lifecycle phase in which the observation should happen. Valid values are: “time_step__prepare”, “time_step”, “time_step__cleanup”, or “collect_metrics”.

requires_attributes

The population attributes required for this observation.

results_updater

Method or function that updates existing raw observation results with newly gathered results.

results_formatter

Method or function that formats the raw observation results.

aggregator_sources

List of population view columns to be used in the aggregator.

aggregator

Method or function that computes the quantity for this observation.
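The aggregator's documented return type is either a float or a Series of floats. A minimal sketch of both shapes follows, assuming the aggregator is handed only the columns named in aggregator_sources. The column names and data are invented for illustration.

```python
import pandas as pd

population = pd.DataFrame(
    {
        "sex": ["F", "M", "F", "M"],
        "years_lived": [1.0, 2.0, 3.0, 4.0],
        "bmi": [20.0, 22.0, 24.0, 26.0],
    }
)


def total_years_lived(df: pd.DataFrame) -> float:
    # Scalar aggregator: a single number for the whole frame.
    return float(df["years_lived"].sum())


def column_means(df: pd.DataFrame) -> pd.Series:
    # Series aggregator: one value per source column.
    return df.mean()


aggregator_sources = ["years_lived", "bmi"]
restricted = population[aggregator_sources]

total = total_years_lived(restricted)
means = column_means(restricted)
```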

to_observe

Method or function that determines whether to perform an observation on this Event.

classmethod is_stratified()
Return type:

bool

observe(df, stratifications)

Gathers the results of the observation.

Parameters:
  • df (pd.DataFrame | DataFrameGroupBy[tuple[str, ...] | str, bool]) – The population or population grouped by the stratifications.

  • stratifications (tuple[str, ...] | None) – The stratifications to use for the observation.

Returns:

The results of the observation.

Return type:

pd.DataFrame

create_expanded_df()

Initializes a DataFrame of zeros with the complete set of stratifications as the index.

Return type:

DataFrame

Returns:

A DataFrame of zeros with the complete set of stratifications as the index.

Notes

If no stratifications are requested, then we are aggregating over the entire population and a single-row index named ‘stratification’ is created.
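A hedged sketch of the behavior described in this note, written with plain pandas rather than the library's own code. The function name, the "value" column, and the "all" row label are assumptions; only the index name 'stratification' comes from the note above.

```python
import pandas as pd


def expanded_zeros(categories: dict[str, list[str]]) -> pd.DataFrame:
    """Build a zero-filled frame indexed by every category combination."""
    if categories:
        index = pd.MultiIndex.from_product(
            list(categories.values()), names=list(categories.keys())
        )
    else:
        # No stratifications: a single row aggregating the whole population.
        # The row label "all" is an assumption.
        index = pd.Index(["all"], name="stratification")
    return pd.DataFrame(0.0, index=index, columns=["value"])


stratified = expanded_zeros(
    {"sex": ["Female", "Male"], "age_group": ["0_to_4", "5_plus"]}
)
unstratified = expanded_zeros({})
```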

get_complete_stratified_results(pop_groups, stratifications)

Gathers results for this observation.

Parameters:
  • pop_groups (DataFrameGroupBy[str, bool]) – The population grouped by the stratifications.

  • stratifications (tuple[str, ...]) – The stratifications to use for the observation.

Returns:

The results of the observation.

Return type:

pd.DataFrame
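One plausible way to obtain "complete" stratified results is to aggregate each observed group and then reindex against the full set of category combinations, so that groups with no simulants still get a row. This is a sketch under that assumption, not the library's implementation. The data, the counting aggregator, and the "value" column name are illustrative.

```python
import pandas as pd

population = pd.DataFrame(
    {
        "sex": ["F", "F", "M"],
        "age_group": ["young", "young", "old"],
        "alive": [True, True, True],
    }
)
stratifications = ("sex", "age_group")

# Every combination that should appear in the results, observed or not.
full_index = pd.MultiIndex.from_product(
    [["F", "M"], ["young", "old"]], names=list(stratifications)
)


def count_simulants(group: pd.DataFrame) -> float:
    # Illustrative aggregator: number of simulants in the group.
    return float(len(group))


pop_groups = population.groupby(list(stratifications))
# Select the source column explicitly so the aggregator never sees the keys.
gathered = pop_groups[["alive"]].apply(count_simulants)
# Groups with no simulants are missing from `gathered`; fill them with 0.0.
results = gathered.reindex(full_index, fill_value=0.0).to_frame("value")
```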

class vivarium.engine.framework.results.observation.AddingObservation(name, population_filter, when, requires_attributes, results_formatter, aggregator_sources, aggregator, to_observe=<function AddingObservation.<lambda>>)

Concrete class for observing additive and stratified results.

The parent class results_updater method is explicitly defined and stratification-specific attributes aggregator_sources and aggregator are added.

Parameters:
  • name (str)

  • population_filter (PopulationFilter)

  • when (str)

  • requires_attributes (list[str])

  • results_formatter (Callable[[str, pd.DataFrame], pd.DataFrame])

  • aggregator_sources (list[str] | None)

  • aggregator (Callable[[pd.DataFrame], float | pd.Series[float]])

  • to_observe (Callable[[Event], bool])

name

Name of the observation. It will also be the name of the output results file for this particular observation.

population_filter

A named tuple of population filtering details. The first item is a Pandas query string to filter the population down to the simulants who should be considered for the observation. The second item is a boolean indicating whether to include untracked simulants in the observation.

when

Name of the lifecycle phase in which the observation should happen. Valid values are: “time_step__prepare”, “time_step”, “time_step__cleanup”, or “collect_metrics”.

requires_attributes

The population attributes required for this observation.

results_formatter

Method or function that formats the raw observation results.

stratifications

Tuple of Stratifications to be used by the observation. If empty, the observation is aggregated over the entire population.

aggregator_sources

List of population view columns to be used in the aggregator.

aggregator

Method or function that computes the quantity for this observation.

to_observe

Method or function that determines whether to perform an observation on this Event.

static add_results(existing_results, new_observations)

Adds newly-observed results to the existing results.

Return type:

DataFrame

Parameters:
  • existing_results (DataFrame) – The existing results DataFrame.

  • new_observations (DataFrame) – The new observations DataFrame.

Returns:

The new results added to the existing results.

Notes

If the new observations contain columns not present in the existing results, the columns are added to the DataFrame and initialized with 0.0s.
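The additive update, including the zero-initialization of unseen columns, could look roughly like the following. This is a sketch that assumes both frames share the same index; the function name and sample data are hypothetical.

```python
import pandas as pd


def add_results_sketch(
    existing_results: pd.DataFrame, new_observations: pd.DataFrame
) -> pd.DataFrame:
    # Work on a copy so the caller's frame is left untouched.
    updated = existing_results.copy()
    for column in new_observations.columns:
        if column not in updated.columns:
            # Column seen for the first time: start it at 0.0.
            updated[column] = 0.0
        updated[column] = updated[column] + new_observations[column]
    return updated


index = pd.Index(["F", "M"], name="sex")
existing = pd.DataFrame({"value": [1.0, 2.0]}, index=index)
new = pd.DataFrame({"value": [0.5, 0.5], "extra": [3.0, 4.0]}, index=index)

combined = add_results_sketch(existing, new)
```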

class vivarium.engine.framework.results.observation.ConcatenatingObservation(name, population_filter, when, requires_attributes, results_formatter, to_observe=<function ConcatenatingObservation.<lambda>>)

Concrete class for observing concatenating (and by extension, unstratified) results.

The parent class results_gatherer and results_updater methods are explicitly defined.

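Parameters:
  • name (str)

  • population_filter (PopulationFilter)

  • when (str)

  • requires_attributes (list[str])

  • results_formatter (Callable[[str, pd.DataFrame], pd.DataFrame])

  • to_observe (Callable[[Event], bool])
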
name

Name of the observation. It will also be the name of the output results file for this particular observation.

population_filter

A named tuple of population filtering details. The first item is a Pandas query string to filter the population down to the simulants who should be considered for the observation. The second item is a boolean indicating whether to include untracked simulants in the observation.

when

Name of the lifecycle phase in which the observation should happen. Valid values are: “time_step__prepare”, “time_step”, “time_step__cleanup”, or “collect_metrics”.

requires_attributes

The population attributes required for this observation.

results_formatter

Method or function that formats the raw observation results.

to_observe

Method or function that determines whether to perform an observation on this Event.

get_results_of_interest(pop)

Return the population with only the included_columns.

Return type:

DataFrame

Parameters:

pop (DataFrame)

static concatenate_results(existing_results, new_observations)

Concatenates the existing results with the new observations.

Return type:

DataFrame

Parameters:
  • existing_results (DataFrame) – The existing results.

  • new_observations (DataFrame) – The new observations.

Returns:

The new results concatenated to the existing results.
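Taken together, the gather and update steps of a concatenating observation amount to selecting columns and appending rows at each time step. The sketch below assumes included_columns is a list of column names and that the row index is reset on concatenation. Both are assumptions, as are the function names and sample data.

```python
import pandas as pd

# Assumption: the columns of interest are given as a list of names.
included_columns = ["age", "sex"]


def results_of_interest(pop: pd.DataFrame) -> pd.DataFrame:
    # Keep only the columns the observation cares about.
    return pop[included_columns]


def concatenate_sketch(
    existing_results: pd.DataFrame, new_observations: pd.DataFrame
) -> pd.DataFrame:
    # Nothing recorded yet: the new observations become the results.
    if existing_results.empty:
        return new_observations.reset_index(drop=True)
    return pd.concat([existing_results, new_observations], ignore_index=True)


results = pd.DataFrame()
for step_population in (
    pd.DataFrame({"age": [1, 2], "sex": ["F", "M"], "bmi": [20.0, 21.0]}),
    pd.DataFrame({"age": [3], "sex": ["F"], "bmi": [22.0]}),
):
    results = concatenate_sketch(results, results_of_interest(step_population))
```

Unlike the additive case, the result grows by one block of rows per observed time step, which is why this kind of observation is unstratified.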