{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Collecting Data\n",
    "\n",
    "### The Boltzmann Wealth Model "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to get straight to the tutorial checkout these environment providers:<br>\n",
    "(with Google Account) [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/projectmesa/mesa/blob/main/docs/tutorials/2_collecting_data.ipynb)<br>\n",
    "(No Google Account) [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/projectmesa/mesa/main?labpath=docs%2Ftutorials%2F2_collecting_data.ipynb) (This can take 30 seconds to 5 minutes to load)\n",
    "\n",
    "*If you are running locally, please ensure you have the latest Mesa version installed.*\n",
    "\n",
    "## Tutorial Description\n",
    "\n",
    "This tutorial extends the Boltzmann wealth model from the [Adding Space tutorial](https://mesa.readthedocs.io/latest/tutorials/1_adding_space.html), by adding Mesa's data collection module. \n",
    "\n",
    "In this portion, we will collect both model level data and agent level data to better understand the dynamics of our model. \n",
    "\n",
    "*If you are starting here please see the [Running Your First Model tutorial](https://mesa.readthedocs.io/latest/tutorials/0_first_model.html) for dependency and start-up instructions*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### IN COLAB? - Run the next cell "
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "%pip install --quiet mesa[rec]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import Dependencies\n",
    "This includes importing of dependencies needed for the tutorial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "# Has multi-dimensional arrays and matrices.\n",
    "# Has a large collection of mathematical functions to operate on these arrays.\n",
    "import numpy as np\n",
    "\n",
    "# Data manipulation and analysis.\n",
    "import pandas as pd\n",
    "\n",
    "# Data visualization tools.\n",
    "import seaborn as sns\n",
    "\n",
    "import mesa\n",
    "\n",
    "# Import Cell Agent and OrthogonalMooreGrid\n",
    "from mesa.discrete_space import CellAgent, OrthogonalMooreGrid"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Base Model\n",
    "\n",
    "The below provides the base model from which we will add our space functionality. \n",
    "\n",
    "This is from the [Adding Space tutorial](https://mesa.readthedocs.io/latest/tutorials/1_adding_space.html) tutorial. If you have any questions about it functionality please review that tutorial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MoneyAgent(CellAgent):\n",
    "    \"\"\"An agent with fixed initial wealth.\"\"\"\n",
    "\n",
    "    def __init__(self, model, cell):\n",
    "        super().__init__(model)\n",
    "        self.cell = cell  # Instantiate agent with location (x,y)\n",
    "        self.wealth = 1\n",
    "\n",
    "    # Move Function\n",
    "    def move(self):\n",
    "        self.cell = self.cell.neighborhood.select_random_cell()\n",
    "\n",
    "    def give_money(self):\n",
    "        cellmates = [\n",
    "            a for a in self.cell.agents if a is not self\n",
    "        ]  # Get all agents in cell\n",
    "\n",
    "        if self.wealth > 0 and cellmates:\n",
    "            other_agent = self.random.choice(cellmates)\n",
    "            other_agent.wealth += 1\n",
    "            self.wealth -= 1\n",
    "\n",
    "\n",
    "class MoneyModel(mesa.Model):\n",
    "    \"\"\"A model with some number of agents.\"\"\"\n",
    "\n",
    "    def __init__(self, n, width, height, seed=None):\n",
    "        super().__init__(seed=seed)\n",
    "        self.num_agents = n\n",
    "        # Instantiate an instance of Moore neighborhood space\n",
    "        self.grid = OrthogonalMooreGrid(\n",
    "            (width, height), torus=True, capacity=10, random=self.random\n",
    "        )\n",
    "\n",
    "        # Create agents\n",
    "        agents = MoneyAgent.create_agents(\n",
    "            self,\n",
    "            self.num_agents,\n",
    "            # Randomly select agents cell\n",
    "            self.random.choices(self.grid.all_cells.cells, k=self.num_agents),\n",
    "        )\n",
    "\n",
    "    def step(self):\n",
    "        self.agents.shuffle_do(\"move\")\n",
    "        self.agents.do(\"give_money\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create a model with 100 agents on a 10x10 grid, and run it for 20 steps to make sure our base model works."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = MoneyModel(100, 10, 10)\n",
    "for _ in range(20):\n",
    "    model.step()\n",
    "# Let's make sure it worked\n",
    "print(len(model.agents))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Collecting Data\n",
    "\n",
    "**Background:** So far, at the end of every model run, we've had to go and write our own code to get the data out of the model. This has two problems: it isn't very efficient, and it only gives us end results. If we wanted to know the wealth of each agent at each step, we'd have to add that to the loop of executing steps, and figure out some way to store the data.\n",
    "\n",
    "Since one of the main goals of agent-based modeling is generating data for analysis, Mesa provides a class which can handle data collection and storage for us and make it easier to analyze.\n",
    "\n",
    "The data collector stores three categories of data: \n",
    " - Model-level variables : Model-level collection functions take a model object as an input. Such as a function that computes a dynamic of the whole model (in this case we will compute a measure of wealth inequality based on all agent's wealth)\n",
    " - Agent-level variables: Agent-level collection functions take an agent object as an input and is typically the state of an agent attributes, in this case wealth.\n",
    " - Tables (which are a catch-all for everything else). \n",
    "\n",
    "**Model-specific information:** We will collect two variables to show Mesa capabilities. \n",
    "- At the model level, let's measure the model's [Gini Coefficient](https://en.wikipedia.org/wiki/Gini_coefficient), a measure of wealth inequality.\n",
    "- At the agent level, we want to collect every agent's wealth at every step. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Code implementation:**\n",
    "\n",
    "Let's add a DataCollector to the model with [`mesa.DataCollector`](https://github.com/projectmesa/mesa/blob/main/mesa/datacollection.py), and collect the agent's wealth and the gini coefficient at each time step. In the below code each new line of code is described with a comment. These additions are described below. \n",
    "\n",
    "**Helper Function**<br>\n",
    "\\# Add function for model level collection\n",
    "-*Description:* Helper function used by the model class to compute the gini coefficient as described previously. \n",
    "-*API:* N/A\n",
    "\n",
    "**MoneyModel Class**<br>\n",
    "\\# Instantiate DataCollector\n",
    "- *Description:* Create a mesa data collector instance and use keyword arguments (kwargs) `model_reporters` and `agent_reporters` to pass in a dictionary, where the key is the name of the data collected and the value is either function (i.e. computer gini) or an attribute (i.e. \"wealth\"). If it is an attribute it is passed in as a string. \n",
    "- *API:* [Data Collection](https://mesa.readthedocs.io/latest/apis/datacollection.html)\n",
    "\n",
    "\\# Collect data each step\n",
    "- *Description:* Call the `collect` method from `DataCollector`. This causes the reporters to collect the data at each step. If this is not put in the step function then the data collector will collect the described information at the end of the model run. If you want to collect the data only on lets say the 5th step, then you can just add an  `if` statement to only collect on the fifth step.\n",
    "- *API:* [DataCollector.collect](https://mesa.readthedocs.io/latest/apis/datacollection.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Add function for model level collection\n",
    "def compute_gini(model):\n",
    "    agent_wealths = [agent.wealth for agent in model.agents]\n",
    "    x = sorted(agent_wealths)\n",
    "    n = model.num_agents\n",
    "    B = sum(xi * (n - i) for i, xi in enumerate(x)) / (n * sum(x))\n",
    "    return 1 + (1 / n) - 2 * B\n",
    "\n",
    "\n",
    "class MoneyAgent(CellAgent):\n",
    "    \"\"\"An agent with fixed initial wealth.\"\"\"\n",
    "\n",
    "    def __init__(self, model, cell):\n",
    "        super().__init__(model)\n",
    "        self.cell = cell\n",
    "        self.wealth = 1\n",
    "\n",
    "    def move(self):\n",
    "        self.cell = self.cell.neighborhood.select_random_cell()\n",
    "\n",
    "    def give_money(self):\n",
    "        cellmates = [a for a in self.cell.agents if a is not self]\n",
    "\n",
    "        if self.wealth > 0 and cellmates:\n",
    "            other_agent = self.random.choice(cellmates)\n",
    "            other_agent.wealth += 1\n",
    "            self.wealth -= 1\n",
    "\n",
    "\n",
    "class MoneyModel(mesa.Model):\n",
    "    \"\"\"A model with some number of agents.\"\"\"\n",
    "\n",
    "    def __init__(self, n, width, height, seed=None):\n",
    "        super().__init__(seed=seed)\n",
    "        self.num_agents = n\n",
    "        self.grid = OrthogonalMooreGrid(\n",
    "            (width, height), torus=True, capacity=10, random=self.random\n",
    "        )\n",
    "        # Instantiate DataCollector\n",
    "        self.datacollector = mesa.DataCollector(\n",
    "            model_reporters={\"Gini\": compute_gini}, agent_reporters={\"Wealth\": \"wealth\"}\n",
    "        )\n",
    "\n",
    "        # Create agents\n",
    "        agents = MoneyAgent.create_agents(\n",
    "            self,\n",
    "            self.num_agents,\n",
    "            self.random.choices(self.grid.all_cells.cells, k=self.num_agents),\n",
    "        )\n",
    "\n",
    "    def step(self):\n",
    "        # Collect data each step\n",
    "        self.datacollector.collect(self)\n",
    "        self.agents.shuffle_do(\"move\")\n",
    "        self.agents.do(\"give_money\")"
   ]
  },
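  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The helper `compute_gini` can also be read as a formula. What follows is a reading aid derived from the code above, not a statement taken from the Mesa documentation. Write $x_0 \\le x_1 \\le \\dots \\le x_{n-1}$ for the sorted agent wealths (0-based, as in `enumerate`) and $n$ for the number of agents:\n",
    "\n",
    "$$B = \\frac{\\sum_{i=0}^{n-1} x_i \\, (n - i)}{n \\sum_{i=0}^{n-1} x_i}, \\qquad G = 1 + \\frac{1}{n} - 2B$$\n",
    "\n",
    "Two quick sanity checks, assuming total wealth is positive (which holds here because wealth is only transferred between agents):\n",
    "\n",
    "- If every agent holds the same wealth, then $\\sum_{i=0}^{n-1} (n - i) = n(n+1)/2$, so $B = (n+1)/(2n)$ and $G = 0$ (perfect equality).\n",
    "- If a single agent holds all the wealth, only the last term of the sum survives, so $B = 1/n$ and $G = 1 - 1/n$, which is $0.99$ for the 100 agents used in this tutorial."
   ]
  },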
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At every step of the model, the datacollector will collect and store the model-level current Gini coefficient, as well as each agent's wealth, associating each with the current step.\n",
    "\n",
    "We run the model just as we did above. Now is when an interactive session, especially via a notebook, comes in handy: the DataCollector can export the data it has collected as a pandas* DataFrame, for easy and interactive analysis. \n",
    "\n",
    "*If you are new to Python, please be aware that pandas is already installed as a dependency of Mesa and that [pandas](https://pandas.pydata.org/docs/) is a \"fast, powerful, flexible and easy to use open source data analysis and manipulation tool\". Pandas is a great resource to help analyze the data collected in your models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = MoneyModel(100, 10, 10)\n",
    "for _ in range(100):\n",
    "    model.step()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Analyzing MoneyModel Data\n",
    "\n",
    "**Code implementation:**\n",
    "\n",
    "\\# Extract MoneyModel data in a Pandas dataframe\n",
    "- *Description:* Call `DataCollector.get_model_vars_dataframe()` method to get the model reporters (in this case gini coefficient) from the model object. We the use seaborn (sns) to do a line plot of the data of the model run. \n",
    "- *API:* [get_model_vars_dataframe](https://mesa.readthedocs.io/latest/apis/datacollection.html#datacollection.DataCollector.get_model_vars_dataframe)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extract MoneyModel data in a Pandas dataframe\n",
    "gini = model.datacollector.get_model_vars_dataframe()\n",
    "g = sns.lineplot(data=gini)\n",
    "g.set(title=\"Gini Coefficient over Time\", ylabel=\"Gini Coefficient\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercises\n",
    "- Display just the data to see the format\n",
    "- Comment on the collect method on the step function and see the impact\n",
    "- Increase agents and time to see how the plot changes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Analyzing an MoneyAgent Data\n",
    "\n",
    "**Code implementation:**\n",
    "\n",
    "\\# Extract MoneyAgent data in a Pandas dataframe\n",
    "- *Description:* Call `DataCollector.get_model_agent_dataframe()` method to get the agent reporters (in this case agent wealth attribute) from the model object. \n",
    "- *API:* [get_model_agent_dataframe](https://mesa.readthedocs.io/latest/apis/datacollection.html#datacollection.DataCollector.get_agent_vars_dataframe)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extract MoneyAgent data in a Pandas dataframe\n",
    "agent_wealth = model.datacollector.get_agent_vars_dataframe()\n",
    "agent_wealth.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You'll see that the DataFrame's index is pairings of model step and agent ID. This is because the data collector stores the data in a dictionary, with the step number as the key, and a dictionary of agent ID and variable value pairs as the value. The data collector then converts this dictionary into a DataFrame, which is why the index is a pair of (model step, agent ID). You can analyze it the way you would any other DataFrame. For example, to get a histogram of agent wealth at the model's end.\n",
    "\n",
    "*Note: As the following code is pandas and seaborn we do not provide explanatory text*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "last_step = agent_wealth.index.get_level_values(\"Step\").max()  # Get the last step\n",
    "end_wealth = agent_wealth.xs(last_step, level=\"Step\")[\n",
    "    \"Wealth\"\n",
    "]  # Get the welath of each agentat the last step\n",
    "# Create a histogram of wealth at the last step\n",
    "g = sns.histplot(end_wealth, discrete=True)\n",
    "g.set(\n",
    "    title=\"Distribution of wealth at the end of simulation\",\n",
    "    xlabel=\"Wealth\",\n",
    "    ylabel=\"number of agents\",\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or to plot the wealth of a given agent (in this example, agent 7):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the wealth of agent 7 over time\n",
    "one_agent_wealth = agent_wealth.xs(7, level=\"AgentID\")\n",
    "\n",
    "# Plot the wealth of agent 7 over time\n",
    "g = sns.lineplot(data=one_agent_wealth, x=\"Step\", y=\"Wealth\")\n",
    "g.set(title=\"Wealth of agent 7 over time\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also plot a reporter of multiple agents over time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent_list = [3, 14, 25]\n",
    "\n",
    "# Get the wealth of multiple agents over time\n",
    "multiple_agents_wealth = agent_wealth[\n",
    "    agent_wealth.index.get_level_values(\"AgentID\").isin(agent_list)\n",
    "]\n",
    "# Plot the wealth of multiple agents over time\n",
    "g = sns.lineplot(data=multiple_agents_wealth, x=\"Step\", y=\"Wealth\", hue=\"AgentID\")\n",
    "g.set(title=\"Wealth of agents 3, 14 and 25 over time\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also plot the average of all agents, with a 95% confidence interval for that average."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Transform the data to a long format\n",
    "agent_wealth_long = agent_wealth.T.unstack().reset_index()\n",
    "agent_wealth_long.columns = [\"Step\", \"AgentID\", \"Variable\", \"Value\"]\n",
    "agent_wealth_long.head(3)\n",
    "\n",
    "# Plot the average wealth over time\n",
    "g = sns.lineplot(data=agent_wealth_long, x=\"Step\", y=\"Value\", errorbar=(\"ci\", 95))\n",
    "g.set(title=\"Average wealth over time\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Which is exactly 1, as expected in this model, since each agent starts with one wealth unit, and each agent gives one wealth unit to another agent at each step."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also use pandas to export the data to a CSV (comma separated value) file, which can be opened by any common spreadsheet application or opened by pandas.\n",
    "\n",
    "If you do not specify a file path, the file will be saved in the local directory. After you run the code below you will see two files appear (*model_data.csv* and *agent_data.csv*)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# save the model data (stored in the pandas gini object) to CSV\n",
    "gini.to_csv(\"model_data.csv\")\n",
    "\n",
    "# save the agent data (stored in the pandas agent_wealth object) to CSV\n",
    "agent_wealth.to_csv(\"agent_data.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Challenge update the model, conduct a batch run with a parameter sweep,\n",
    "# and visualize your results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Next Steps\n",
    "\n",
    "Check out the [Agent Management Through AgentSet tutorial](https://mesa.readthedocs.io/latest/tutorials/3_agentset.html) on effective ways to manage agents."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  },
  "widgets": {
   "state": {},
   "version": "1.1.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}