{ "cells": [ { "cell_type": "markdown", "id": "2a113e8a", "metadata": {}, "source": [ "# Grama: Model Building\n", "\n", "*Purpose*: Grama provides tools to work with models, but to use these tools we need to be able to build models in grama! This exercise will introduce fundamental concepts for building and sanity-checking models. We'll build on these skills in future exercises.\n" ] }, { "cell_type": "markdown", "id": "067569a6", "metadata": {}, "source": [ "## Setup\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "5b289c69", "metadata": { "collapsed": false }, "outputs": [], "source": [ "import grama as gr\n", "import pandas as pd\n", "DF = gr.Intention()\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "id": "012c4621", "metadata": {}, "source": [ "# Composition\n", "\n", "Recall that there are four classes of verb in grama; *composition* verbs take a model as an input and produce a new model as an output. Using compositions we can add information to a model, such as input metadata and functions.\n" ] }, { "cell_type": "markdown", "id": "8f4ada0b", "metadata": {}, "source": [ "\n", "![Grama verb class diagram](./images/verb-classes.png)\n" ] }, { "cell_type": "markdown", "id": "14f502c6", "metadata": {}, "source": [ "# Useful Programming Tools\n", "\n", "To build grama models, we'll need to use a few simple programming tools.\n" ] }, { "cell_type": "markdown", "id": "64e04f54", "metadata": {}, "source": [ "## Lambda functions\n", "\n", "In an earlier exercise, we learned how to define Python functions using the `def` keyword:\n", "\n", "```python\n", "def fcn(x):\n", "    return x ** 2\n", "```\n", "\n", "However, we can define the same function with `lambda` syntax:\n", "\n", "```python\n", "fcn = lambda x: x ** 2\n", "```\n", "\n", "A `lambda` function starts with the keyword `lambda`, and is followed by its arguments. The example above has just one argument `x`. 
After the arguments comes a colon `:`, which signals that what follows is the output of the function.\n", "\n", "The advantage of this `lambda` syntax is that it is more compact and can be written inline within a grama model-building pipeline. Let's get some practice defining lambda functions.\n" ] }, { "cell_type": "markdown", "id": "c5a56de8", "metadata": {}, "source": [ "### __q1__ Implement a `lambda` function\n", "\n", "Use the `lambda` syntax to implement the following function:\n", "\n", "$$f(x) = x + 1$$\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "f6435619", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Success!\n" ] } ], "source": [ "# TASK: Create a lambda function to implement the function above\n", "\n", "fcn = lambda x: x + 1\n", "# Use the following to check your work\n", "assert \\\n", " fcn(1) == 2, \\\n", " \"Incorrect value\"\n", "\n", "print(\"Success!\")" ] }, { "cell_type": "markdown", "id": "0176dd11", "metadata": {}, "source": [ "## Working with DataFrames\n", "\n", "Grama uses DataFrames to represent data and to interface with models. The constructor `gr.df_make()` is a convenient way to make a simple DataFrame:\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "d339625c", "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<table>\n", " <thead>\n", " <tr><th></th><th>x</th><th>y</th><th>z</th></tr>\n", " </thead>\n", " <tbody>\n", " <tr><th>0</th><td>1</td><td>a</td><td>recycled value</td></tr>\n", " <tr><th>1</th><td>2</td><td>b</td><td>recycled value</td></tr>\n", " <tr><th>2</th><td>3</td><td>c</td><td>recycled value</td></tr>\n", " </tbody>\n", "</table>
" ], "text/plain": [ " x y z\n", "0 1 a recycled value\n", "1 2 b recycled value\n", "2 3 c recycled value" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NOTE: No need to edit; recall that gr.df_make(...)\n", "# helps us construct DataFrames\n", "gr.df_make(\n", " x=[1, 2, 3],\n", " y=[\"a\", \"b\", \"c\"],\n", " z=\"recycled value\",\n", ")" ] }, { "cell_type": "markdown", "id": "eba03a86", "metadata": {}, "source": [ "We can *combine* this DataFrame constructor with a `lambda` function to take a DataFrame as an input, and return a DataFrame as an output. For instance, the following is a DataFrame version of the previous `lambda` function:\n", "\n", "```python\n", "fcn_df = lambda df: gr.df_make(y=df.x ** 2)\n", "```\n", "\n", "Note that this `lambda` function takes in a DataFrame, uses specific columns from that input `df`, and returns a DataFrame.\n" ] }, { "cell_type": "markdown", "id": "c9586f21", "metadata": {}, "source": [ "### __q2__ Functions on DataFrames\n", "\n", "Use the `lambda` syntax to implement the following function:\n", "\n", "$$y(x) = x + 1$$\n", "\n", "Make sure your lambda function takes a DataFrame as an argument, and returns a DataFrame as an output." 
] }, { "cell_type": "code", "execution_count": 4, "id": "8a7ae65d", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Success!\n" ] } ], "source": [ "# TASK: Create a lambda function based on DataFrames\n", "\n", "fcn_df = lambda df: gr.df_make(y=df.x + 1)\n", "# NOTE: Use the following to check your work\n", "df_res = fcn_df(gr.df_make(x=[0, 1]))\n", "df_out = gr.df_make(y=[1, 2])\n", "\n", "assert \\\n", " isinstance(df_res, pd.DataFrame), \\\n", " \"Output must be DataFrame\"\n", "\n", "assert \\\n", " gr.df_equal(df_res, df_out), \\\n", " \"Incorrect output\"\n", "\n", "print(\"Success!\")" ] }, { "cell_type": "markdown", "id": "000ea6e9", "metadata": {}, "source": [ "# Constructing Grama Models\n", "\n", "Remember that *composition* verbs take in a grama model and return a new model. We use compositions primarily to construct grama models. We can start a blank model with `gr.Model()`, but then we need to add functionality to that model!\n" ] }, { "cell_type": "markdown", "id": "5ff1c1ff", "metadata": { "tags": [] }, "source": [ "## Add a function\n", "\n", "A core part of any model is its set of *functions*; these map from inputs to outputs. For functions defined by simple mathematical expressions, the composition `gr.cp_vec_function()` is the appropriate tool for the job.\n" ] }, { "cell_type": "markdown", "id": "ff424b78", "metadata": {}, "source": [ "### __q3__ Add a function to a model\n", "\n", "Add a function to `md_basic` that provides the output `y = x + 1`.\n", "\n", "*Hint 1*: Consult the documentation for `gr.cp_vec_function()` to see what arguments it requires. 
Remember that you can use `help(gr.cp_vec_function)` or use `Shift + Tab` to bring up the documentation at your cursor.\n", "\n", "*Hint 2*: Use a DataFrame-based lambda function, as you did for q2 above.\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "fa076f73", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Success!\n" ] } ], "source": [ "# TASK: Finish implementing the model\n", "md_basic = (\n", " gr.Model(\"Basic model\")\n", " >> gr.cp_vec_function(\n", " fun=lambda df: gr.df_make(\n", " y=df.x + 1\n", " ),\n", " var=[\"x\"],\n", " out=[\"y\"],\n", " )\n", ")\n", "\n", "# NOTE: Use the following to check your work\n", "df_res = (\n", " md_basic\n", " >> gr.ev_df(gr.df_make(x=0))\n", ")\n", "df_out = gr.df_make(x=0, y=1)\n", "\n", "assert \\\n", " set(md_basic.var) == {\"x\"}, \\\n", " \"md_basic has wrong variables\"\n", " \n", "assert \\\n", " set(md_basic.out) == {\"y\"}, \\\n", " \"md_basic has wrong outputs\"\n", "\n", "assert \\\n", " gr.df_equal(df_res, df_out), \\\n", " \"md_basic function incorrect\"\n", " \n", "print(\"Success!\")" ] }, { "cell_type": "markdown", "id": "aba3eebb", "metadata": {}, "source": [ "## Add bounds\n", "\n", "Once your model has a function, it is useful to define bounds for the inputs. This does not *force* the model to reject values outside the bounds, but rather serves as useful *metadata* about the model. Bounds are used by other verbs like exploratory tools (e.g. 
for parameter sweeps) and optimization.\n", "\n", "The verb `gr.cp_bounds()` allows you to add bounds for inputs.\n" ] }, { "cell_type": "markdown", "id": "fa35ba7e", "metadata": {}, "source": [ "### __q4__ Add bounds to a model\n", "\n", "For the following model, add bounds $0 \\leq x_1 \\leq 1$ and $0 \\leq x_2 \\leq 1$.\n" ] }, { "cell_type": "code", "execution_count": 6, "id": "87dfea8e", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Success!\n" ] } ], "source": [ "# TASK: Finish implementing the model\n", "md_bounded = (\n", " gr.Model(\"Bounded input\")\n", " >> gr.cp_vec_function(\n", " fun=lambda df: gr.df_make(\n", " y=gr.sin(df.x1) + gr.sin(df.x2)**2\n", " ),\n", " var=[\"x1\", \"x2\"],\n", " out=[\"y\"],\n", " )\n", " >> gr.cp_bounds(\n", " x1=(0, 1),\n", " x2=(0, 1),\n", " )\n", ")\n", "\n", "# NOTE: Use the following to check your work\n", "assert \\\n", " (md_bounded.domain.bounds[\"x1\"][0] == 0) and \\\n", " (md_bounded.domain.bounds[\"x1\"][1] == 1) and \\\n", " (md_bounded.domain.bounds[\"x2\"][0] == 0) and \\\n", " (md_bounded.domain.bounds[\"x2\"][1] == 1), \\\n", " \"md_bounded bounds incorrect\"\n", " \n", "print(\"Success!\")" ] }, { "cell_type": "markdown", "id": "541e0f24", "metadata": {}, "source": [ "A model with functions and bounds already has a lot of useful information! A grama model can hold much more information, but that's enough model building for this exercise.\n" ] }, { "cell_type": "markdown", "id": "cd1834ab", "metadata": {}, "source": [ "## Composition: Quick Reference\n", "\n", "As a quick reference, here is a list of the most important grama composition verbs. 
Note that some of these are covered later in the exercise sequence; most notably, the verbs related to quantifying uncertainties (marginals and copulas).\n", "\n", "| Verb | Description |\n", "|---|---|\n", "| `gr.Model()` | Start a new grama model |\n", "| `gr.cp_vec_function()` | Add a *vectorized* (DataFrame-based) function |\n", "| `gr.cp_function()` | Add a *non-vectorized* (array-based) function |\n", "| `gr.cp_bounds()` | Add bounds for inputs |\n", "| `gr.cp_marginals()` | Add marginal distributions for inputs |\n", "| `gr.cp_copula_independence()` | Assume random inputs are independent |\n", "| `gr.cp_copula_gaussian()` | Assume random inputs are correlated |\n" ] }, { "cell_type": "markdown", "id": "7a52352b", "metadata": {}, "source": [ "# Checking Models\n", "\n", "Once you've built a grama model, you can use a variety of tools to work with the model. These are useful for making sense of model behavior.\n", "\n", "One of the most important studies you can do with a model is a parameter sweep. Grama provides several tools for running parameter sweeps.\n" ] }, { "cell_type": "markdown", "id": "2be8866d", "metadata": { "tags": [] }, "source": [ "### __q5__ Create a grid of values\n", "\n", "The verb `gr.df_grid()` is a helper function that creates a \"grid\" of points. Modify the code below to see how this function operates.\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "daf71558", "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# TASK: Modify the code, note the changes\n", "(\n", " # TODO: Try modifying the following code; see how\n", " # the results change\n", " gr.df_grid(\n", " x1=[0.0, 0.5, 1.0],\n", " x2=gr.linspace(0, 1, 25),\n", " )\n", " \n", " # NOTE: No need to edit the following\n", " >> gr.ggplot(gr.aes(\"x1\", \"x2\"))\n", " + gr.geom_point()\n", ")" ] }, { "cell_type": "markdown", "id": "67ac9b2d", "metadata": {}, "source": [ "Once you've created a grid of points, you can evaluate the model on that grid to perform a parameter sweep.\n" ] }, { "cell_type": "markdown", "id": "60bf9dbb", "metadata": {}, "source": [ "### __q6__ Evaluate a grid of values\n", "\n", "Create a grid of points in `x1` and `x2` to evaluate the model.\n" ] }, { "cell_type": "code", "execution_count": 8, "id": "c9fcffed", "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# TASK: Create a grid of points\n", "(\n", " md_bounded\n", " >> gr.ev_df(\n", "\n", " gr.df_grid(\n", " x1=gr.linspace(0, 1, 25),\n", " x2=gr.linspace(0, 1, 25),\n", " )\n", " )\n", " \n", " >> gr.ggplot(gr.aes(\"x1\", \"x2\", fill=\"y\"))\n", " + gr.geom_tile()\n", ")" ] }, { "cell_type": "markdown", "id": "6e62f9a2", "metadata": {}, "source": [ "## Model sanity checks\n", "\n", "Parameter sweeps are particularly useful when checking that you implemented a model *correctly*. For example, let's suppose someone implemented the following function in a model\n", "\n", "$$\\begin{aligned}f(x, y) &= x + y^2 \\,|\\, x < 1/2 \\\\ & = 1 - x + y^2 \\,|\\, x \\geq 1/2 \\end{aligned}$$\n", "\n", "The following code *attempts* to implement the function as a grama model.\n" ] }, { "cell_type": "code", "execution_count": 9, "id": "5c4ffc94", "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "model: Error example\n", "\n", " inputs:\n", " var_det:\n", " y: [0, 1]\n", " x: [0, 1]\n", "\n", " var_rand:\n", "\n", " copula:\n", " None\n", "\n", " functions:\n", " f0: ['x', 'y'] -> ['f']" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NOTE: No need to edit; you'll explore this model in the next task\n", "md_error = (\n", " gr.Model(\"Error example\")\n", " >> gr.cp_vec_function(\n", " fun=lambda df: gr.df_make(\n", " f=(df.x + df.y**2) * (df.x < 0.5)\n", " +(1 + df.x + df.y**2) * (df.x >= 0.5)\n", " ),\n", " var=[\"x\", \"y\"],\n", " out=[\"f\"],\n", " )\n", " >> gr.cp_bounds(\n", " x=(0, 1),\n", " y=(0, 1),\n", " )\n", ")\n", "md_error" ] }, { "cell_type": "markdown", "id": "9a27e88f", "metadata": {}, "source": [ "However, the model implementation above is in error. 
Up next, you'll use a parameter sweep to help find the error.\n" ] }, { "cell_type": "markdown", "id": "750c6972", "metadata": { "tags": [] }, "source": [ "### __q7__ Find the error\n", "\n", "Construct a sinew plot to inspect the model behavior and find the implementation error. Answer the questions under *observations* below.\n", "\n", "Make sure to sweep over all the deterministic variables in the model. Remember that the effect of `x` should switch from positive to negative at the midpoint of its domain.\n", "\n", "*Hint*: We learned about sinew plots in the previous grama exercise `e-grama01-basics`.\n" ] }, { "cell_type": "code", "execution_count": 10, "id": "28ebc7e4", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Calling plot_sinew_outputs....\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# TASK: Explore the model's behavior to find the implementation error\n", "(\n", " md_error\n", "\n", " >> gr.ev_sinews(df_det=\"swp\")\n", " # NOTE: No need to edit; use this to visualize your results\n", " >> gr.pt_auto()\n", ")" ] }, { "cell_type": "markdown", "id": "abda48a5", "metadata": {}, "source": [ "*Observations*\n", "\n", "- Do there appear to be any \"jumps\" in the output? Which variable seems to cause the jump?\n", " - Yes, there's a jump in `x` around the middle of its domain.\n", "- What is the error in the implementation?\n", " - The sign of `x` in the second part of the piecewise function is wrong; it should be negative.\n", "" ] }, { "cell_type": "markdown", "id": "472527f0", "metadata": {}, "source": [ "# Payoff: Rapid model exploration\n", "\n", "One of the big payoffs from these model building tools is the ability to rapidly explore models. Since a grama model includes a lot of information (functions and bounds), the tools for evaluating and visualizing models are extremely simple.\n", "\n", "For instance, we can re-build the model to fix the sign error on `x`, and quickly construct a sinew plot to verify that we've fixed the issue:\n" ] }, { "cell_type": "code", "execution_count": 11, "id": "7ac6d63e", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Calling plot_sinew_outputs....\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NOTE: No need to edit; this fixes the model\n", "(\n", " # Build the model\n", " gr.Model(\"Fixed model\")\n", " >> gr.cp_vec_function(\n", " fun=lambda df: gr.df_make(\n", " f=(df.x + df.y**2) * (df.x < 0.5)\n", " +(1 - df.x + df.y**2) * (df.x >= 0.5)\n", " ),\n", " var=[\"x\", \"y\"],\n", " out=[\"f\"],\n", " )\n", " >> gr.cp_bounds(x=(0, 1), y=(0, 1))\n", " \n", " # Evaluate\n", " >> gr.ev_sinews(df_det=\"swp\")\n", " \n", " # Plot\n", " >> gr.pt_auto()\n", ")" ] }, { "cell_type": "markdown", "id": "50ce52a3", "metadata": {}, "source": [ "We'll see in the next grama exercise how to generate and visualize contour data. Once we've implemented a model, the syntax for generating a contour plot is quite simple:\n" ] }, { "cell_type": "code", "execution_count": 12, "id": "bba7b0ae", "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Calling plot_contour....\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NOTE: No need to edit; this generates a contour plot\n", "(\n", " # Select the model\n", " md_error\n", " # Evaluate\n", " >> gr.ev_contour(\n", " var=[\"x\", \"y\"],\n", " out=[\"f\"],\n", " )\n", " # Plot\n", " >> gr.pt_auto()\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 5 }