{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Marketing is about connecting the best products or services to the right customers. \n",
    "\n",
    "#### In today’s digital world, personalization is essential for meeting customers’ needs more effectively, thereby increasing customer satisfaction and the likelihood of repeat purchases. \n",
    "\n",
    "#### Recommendation systems are sets of algorithms that recommend the most relevant items to users based on their predicted preferences. They act on behavioral data, such as a customer’s previous purchases, ratings, or reviews, to predict the likelihood of that customer buying a new product or service."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### For our model building today we will use groceries.csv. This dataset contains the transactions of a grocery store and has been made available by the College of Science - Cal State East Bay."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### ASSOCIATION RULES (ASSOCIATION RULE MINING)\n",
    "\n",
    "- Association rule mining finds combinations of items that frequently occur together in orders or baskets (in a retail context). \n",
    "\n",
    "- The items that frequently occur together are called itemsets. Itemsets help to discover relationships between items that people buy together, and these relationships can be used as a basis for creating strategies like \n",
    "    - bundling products as a combo offer, or \n",
    "    - placing products next to each other on retail shelves to attract customer attention.\n",
    "    \n",
    "    \n",
    "- One application of association rule mining is Market Basket Analysis, a technique used mostly by retailers to find associations between items purchased by customers."
   ]
  },
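  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The idea of items that frequently occur together can be illustrated with a minimal sketch that counts item pairs across baskets. The baskets below are hypothetical toy data (not taken from groceries.csv), and this brute-force count is only an illustration, not the algorithm used later in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "from itertools import combinations\n",
    "\n",
    "# Hypothetical toy baskets, for illustration only\n",
    "toy_baskets = [\n",
    "    ['bread', 'butter', 'milk'],\n",
    "    ['bread', 'butter'],\n",
    "    ['milk', 'yogurt'],\n",
    "    ['bread', 'butter', 'yogurt'],\n",
    "]\n",
    "\n",
    "pair_counts = Counter()\n",
    "for basket in toy_baskets:\n",
    "    # Sort the unique items so that each pair is always counted in the same order\n",
    "    pair_counts.update(combinations(sorted(set(basket)), 2))\n",
    "\n",
    "# The pair that occurs together in the most baskets\n",
    "print(pair_counts.most_common(1))"
   ]
  },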
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data structure Libraries\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "# Graphing Libraries\n",
    "import seaborn as sn \n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Python’s built-in open() function can be used to open the file, and the file object’s readlines() method to read each line. The following code block loads and reads the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "all_txns = []\n",
    "\n",
    "#open the file\n",
    "with open('groceries.csv') as f:\n",
    "    \n",
    "    #read each line\n",
    "    content = f.readlines()\n",
    "    \n",
    "    #Remove white space from the beginning and end of the line\n",
    "    txns = [x.strip() for x in content]\n",
    "    \n",
    "    # Iterate through each line and create a list of transactions\n",
    "    for each_txn in txns:\n",
    "        #Each transaction will contain a list of item in the transaction\n",
    "        all_txns.append( each_txn.split(',') )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Print the first five transactions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['citrus fruit', 'semi-finished bread', 'margarine', 'ready soups'],\n",
       " ['tropical fruit', 'yogurt', 'coffee'],\n",
       " ['whole milk'],\n",
       " ['pip fruit', 'yogurt', 'cream cheese ', 'meat spreads'],\n",
       " ['other vegetables',\n",
       "  'whole milk',\n",
       "  'condensed milk',\n",
       "  'long life bakery product']]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_txns[0:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The Python library mlxtend provides methods to generate association rules from a list of transactions, but these methods require the data to be fed in a specific format. \n",
    "\n",
    "### The transactions and items need to be converted into a tabular or matrix format. Each row represents a transaction and each column represents an item. \n",
    "\n",
    "### So, the matrix will be of size M × N, where M represents the total number of transactions and N represents the number of unique items available across all transactions (or the number of items sold by the seller).\n",
    "\n",
    "### The items available in each transaction will be represented in one-hot-encoded format, that is, the item is encoded True if it exists in the transaction or False otherwise."
   ]
  },
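  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### As a minimal sketch of what this one-hot-encoded matrix looks like, the cell below builds it by hand in plain Python. The transactions are hypothetical toy data, and the variable names are chosen only for this illustration; the notebook itself uses mlxtend for this step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical toy transactions, for illustration only\n",
    "toy_txns = [['milk', 'bread'], ['bread', 'butter'], ['milk']]\n",
    "\n",
    "# Columns: every unique item across all transactions, in alphabetical order\n",
    "toy_items = sorted({item for txn in toy_txns for item in txn})\n",
    "\n",
    "# Rows: one boolean per item, True if the item exists in the transaction\n",
    "toy_matrix = [[item in txn for item in toy_items] for txn in toy_txns]\n",
    "\n",
    "print(toy_items)\n",
    "for row in toy_matrix:\n",
    "    print(row)"
   ]
  },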
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mlxtend.preprocessing import TransactionEncoder\n",
    "from mlxtend.frequent_patterns import apriori, association_rules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize the TransactionEncoder\n",
    "one_hot_encoding = TransactionEncoder()\n",
    "\n",
    "# Transform the data into one-hot-encoding format\n",
    "one_hot_txns = one_hot_encoding.fit(all_txns).transform(all_txns)\n",
    "\n",
    "# Convert the matrix into the dataframe.\n",
    "one_hot_txns_df = pd.DataFrame( one_hot_txns, columns=one_hot_encoding.columns_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>berries</th>\n",
       "      <th>beverages</th>\n",
       "      <th>bottled beer</th>\n",
       "      <th>bottled water</th>\n",
       "      <th>brandy</th>\n",
       "      <th>brown bread</th>\n",
       "      <th>butter</th>\n",
       "      <th>butter milk</th>\n",
       "      <th>cake bar</th>\n",
       "      <th>candles</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   berries  beverages  bottled beer  bottled water  brandy  brown bread  \\\n",
       "5    False      False         False          False   False        False   \n",
       "6    False      False         False          False   False        False   \n",
       "7    False      False          True          False   False        False   \n",
       "8    False      False         False          False   False        False   \n",
       "9    False      False         False          False   False        False   \n",
       "\n",
       "   butter  butter milk  cake bar  candles  \n",
       "5    True        False     False    False  \n",
       "6   False        False     False    False  \n",
       "7   False        False     False    False  \n",
       "8   False        False     False    False  \n",
       "9   False        False     False    False  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "one_hot_txns_df.iloc[5:10, 10:20]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### As we can see, the transaction with index 5 contains the item butter, and the transaction with index 7 contains the item bottled beer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Find the size (shape or dimension) of the matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(9835, 171)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "one_hot_txns_df.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The sparse matrix has dimensions 9835 × 171, so there are 9835 transactions and 171 unique items in total. This matrix can now be used to generate rules."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### We will use the apriori algorithm to generate itemsets. The total number of possible itemsets depends on the number of items that exist across all transactions. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "171"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#check the number of items in the data\n",
    "len(one_hot_txns_df.columns)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### For itemsets containing 2 items each, the total number of possible itemsets is C(171, 2) = 171 × 170 / 2 = 14535."
   ]
  },
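  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The count above can be checked with a short calculation. This is a minimal sketch; n_choose_k is a helper written only for this illustration (it uses factorials so that it also runs on older Python versions), and the count for itemsets of 3 items is added just to show how quickly the numbers grow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from math import factorial\n",
    "\n",
    "# Illustrative helper: number of ways to choose k items out of n\n",
    "def n_choose_k(n, k):\n",
    "    return factorial(n) // (factorial(k) * factorial(n - k))\n",
    "\n",
    "n_items = 171\n",
    "n_pairs = n_choose_k(n_items, 2)\n",
    "n_triples = n_choose_k(n_items, 3)\n",
    "\n",
    "print('itemsets with 2 items:', n_pairs)\n",
    "print('itemsets with 3 items:', n_triples)"
   ]
  },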
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Evaluating every possible itemset is computationally intensive because the number of candidates is very large. To limit the number of generated rules, we will apply a minimum support value. All items that do not have the minimum support will be removed from the possible itemset combinations."
   ]
  },
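  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### To make the idea of minimum support concrete, the sketch below computes support by hand and drops the items that fall below a threshold. The transactions, the threshold of 0.5, and the support helper are all assumptions made for this illustration; they are not part of mlxtend or of the groceries data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical toy transactions, for illustration only\n",
    "toy_sets = [\n",
    "    {'milk', 'bread'},\n",
    "    {'milk', 'butter'},\n",
    "    {'milk', 'bread', 'butter'},\n",
    "    {'bread'},\n",
    "    {'milk'},\n",
    "]\n",
    "\n",
    "# Illustrative helper: fraction of transactions that contain every item in itemset\n",
    "def support(itemset, txns):\n",
    "    itemset = set(itemset)\n",
    "    return sum(itemset <= txn for txn in txns) / len(txns)\n",
    "\n",
    "toy_min_support = 0.5  # illustrative threshold\n",
    "candidate_items = sorted(set().union(*toy_sets))\n",
    "\n",
    "# Keep only the items that meet the minimum support\n",
    "frequent_items = [i for i in candidate_items if support([i], toy_sets) >= toy_min_support]\n",
    "\n",
    "print(frequent_items)\n",
    "print(support(['milk', 'bread'], toy_sets))"
   ]
  },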
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The apriori function takes the following parameters:\n",
    "1. df: a pandas DataFrame in one-hot-encoded format.\n",
    "\n",
    "2. min_support: A float between 0 and 1 for the minimum support of the itemsets returned. Default is 0.5.\n",
    "\n",
    "3. use_colnames: boolean. If True, uses the DataFrame’s column names in the returned DataFrame instead of column indices."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Let us run the algorithm with a minimum support of 0.02, that is, an itemset must appear in at least 2% of all transactions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "frequent_itemsets = apriori( one_hot_txns_df, min_support=0.02, use_colnames=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>support</th>\n",
       "      <th>itemsets</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>60</th>\n",
       "      <td>0.020437</td>\n",
       "      <td>[bottled beer, whole milk]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>52</th>\n",
       "      <td>0.033859</td>\n",
       "      <td>[sugar]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>0.035892</td>\n",
       "      <td>[other vegetables, tropical fruit]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>105</th>\n",
       "      <td>0.021047</td>\n",
       "      <td>[root vegetables, tropical fruit]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>88</th>\n",
       "      <td>0.032740</td>\n",
       "      <td>[other vegetables, soda]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>0.058058</td>\n",
       "      <td>[coffee]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>111</th>\n",
       "      <td>0.024504</td>\n",
       "      <td>[shopping bags, whole milk]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>0.079817</td>\n",
       "      <td>[newspapers]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119</th>\n",
       "      <td>0.056024</td>\n",
       "      <td>[whole milk, yogurt]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55</th>\n",
       "      <td>0.071683</td>\n",
       "      <td>[whipped/sour cream]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      support                            itemsets\n",
       "60   0.020437          [bottled beer, whole milk]\n",
       "52   0.033859                             [sugar]\n",
       "89   0.035892  [other vegetables, tropical fruit]\n",
       "105  0.021047   [root vegetables, tropical fruit]\n",
       "88   0.032740            [other vegetables, soda]\n",
       "16   0.058058                            [coffee]\n",
       "111  0.024504         [shopping bags, whole milk]\n",
       "36   0.079817                        [newspapers]\n",
       "119  0.056024                [whole milk, yogurt]\n",
       "55   0.071683                [whipped/sour cream]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# printing 10 randomly sampled itemsets and their corresponding support\n",
    "frequent_itemsets.sample(10, random_state = 90)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The apriori algorithm keeps only the frequent itemsets, that is, those with a support of at least 2%. For example, we can see that whole milk and yogurt appear together in about 5.6% of the baskets."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### These itemsets can now be passed to association_rules for generating rules and corresponding metrics."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The association_rules function takes the following parameters: \n",
    "1. df: a pandas DataFrame of frequent itemsets with columns [‘support’, ‘itemsets’].\n",
    "\n",
    "2. metric: the metric used to evaluate whether a rule is of interest, for example ‘confidence’ or ‘lift’. Default is ‘confidence’.\n",
    "\n",
    "3. min_threshold: the minimal threshold for the evaluation metric, used to decide whether a candidate rule is of interest."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "rules = association_rules( frequent_itemsets, # itemsets \n",
    "                          metric=\"lift\", # lift \n",
    "                          min_threshold=1 )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>antecedants</th>\n",
       "      <th>consequents</th>\n",
       "      <th>antecedent support</th>\n",
       "      <th>consequent support</th>\n",
       "      <th>support</th>\n",
       "      <th>confidence</th>\n",
       "      <th>lift</th>\n",
       "      <th>leverage</th>\n",
       "      <th>conviction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>44</th>\n",
       "      <td>(pip fruit)</td>\n",
       "      <td>(other vegetables)</td>\n",
       "      <td>0.075648</td>\n",
       "      <td>0.193493</td>\n",
       "      <td>0.026131</td>\n",
       "      <td>0.345430</td>\n",
       "      <td>1.785237</td>\n",
       "      <td>0.011494</td>\n",
       "      <td>1.232118</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>87</th>\n",
       "      <td>(yogurt)</td>\n",
       "      <td>(rolls/buns)</td>\n",
       "      <td>0.139502</td>\n",
       "      <td>0.183935</td>\n",
       "      <td>0.034367</td>\n",
       "      <td>0.246356</td>\n",
       "      <td>1.339363</td>\n",
       "      <td>0.008708</td>\n",
       "      <td>1.082825</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>41</th>\n",
       "      <td>(newspapers)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.079817</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.027351</td>\n",
       "      <td>0.342675</td>\n",
       "      <td>1.341110</td>\n",
       "      <td>0.006957</td>\n",
       "      <td>1.132597</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>(fruit/vegetable juice)</td>\n",
       "      <td>(other vegetables)</td>\n",
       "      <td>0.072293</td>\n",
       "      <td>0.193493</td>\n",
       "      <td>0.021047</td>\n",
       "      <td>0.291139</td>\n",
       "      <td>1.504653</td>\n",
       "      <td>0.007059</td>\n",
       "      <td>1.137751</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>(other vegetables)</td>\n",
       "      <td>(tropical fruit)</td>\n",
       "      <td>0.193493</td>\n",
       "      <td>0.104931</td>\n",
       "      <td>0.035892</td>\n",
       "      <td>0.185497</td>\n",
       "      <td>1.767790</td>\n",
       "      <td>0.015589</td>\n",
       "      <td>1.098913</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                antecedants         consequents  antecedent support  \\\n",
       "44              (pip fruit)  (other vegetables)            0.075648   \n",
       "87                 (yogurt)        (rolls/buns)            0.139502   \n",
       "41             (newspapers)        (whole milk)            0.079817   \n",
       "35  (fruit/vegetable juice)  (other vegetables)            0.072293   \n",
       "57       (other vegetables)    (tropical fruit)            0.193493   \n",
       "\n",
       "    consequent support   support  confidence      lift  leverage  conviction  \n",
       "44            0.193493  0.026131    0.345430  1.785237  0.011494    1.232118  \n",
       "87            0.183935  0.034367    0.246356  1.339363  0.008708    1.082825  \n",
       "41            0.255516  0.027351    0.342675  1.341110  0.006957    1.132597  \n",
       "35            0.193493  0.021047    0.291139  1.504653  0.007059    1.137751  \n",
       "57            0.104931  0.035892    0.185497  1.767790  0.015589    1.098913  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rules.sample(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Now let us look at the top 10 association rules sorted by confidence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>antecedants</th>\n",
       "      <th>consequents</th>\n",
       "      <th>antecedent support</th>\n",
       "      <th>consequent support</th>\n",
       "      <th>support</th>\n",
       "      <th>confidence</th>\n",
       "      <th>lift</th>\n",
       "      <th>leverage</th>\n",
       "      <th>conviction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>122</th>\n",
       "      <td>(other vegetables, yogurt)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.043416</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.022267</td>\n",
       "      <td>0.512881</td>\n",
       "      <td>2.007235</td>\n",
       "      <td>0.011174</td>\n",
       "      <td>1.528340</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>(butter)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.055414</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.027555</td>\n",
       "      <td>0.497248</td>\n",
       "      <td>1.946053</td>\n",
       "      <td>0.013395</td>\n",
       "      <td>1.480817</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>(curd)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.053279</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.026131</td>\n",
       "      <td>0.490458</td>\n",
       "      <td>1.919481</td>\n",
       "      <td>0.012517</td>\n",
       "      <td>1.461085</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>116</th>\n",
       "      <td>(other vegetables, root vegetables)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.047382</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.023183</td>\n",
       "      <td>0.489270</td>\n",
       "      <td>1.914833</td>\n",
       "      <td>0.011076</td>\n",
       "      <td>1.457687</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>115</th>\n",
       "      <td>(whole milk, root vegetables)</td>\n",
       "      <td>(other vegetables)</td>\n",
       "      <td>0.048907</td>\n",
       "      <td>0.193493</td>\n",
       "      <td>0.023183</td>\n",
       "      <td>0.474012</td>\n",
       "      <td>2.449770</td>\n",
       "      <td>0.013719</td>\n",
       "      <td>1.533320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>(domestic eggs)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.063447</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.029995</td>\n",
       "      <td>0.472756</td>\n",
       "      <td>1.850203</td>\n",
       "      <td>0.013783</td>\n",
       "      <td>1.412030</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>109</th>\n",
       "      <td>(whipped/sour cream)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.071683</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.032232</td>\n",
       "      <td>0.449645</td>\n",
       "      <td>1.759754</td>\n",
       "      <td>0.013916</td>\n",
       "      <td>1.352735</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>91</th>\n",
       "      <td>(root vegetables)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.108998</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.048907</td>\n",
       "      <td>0.448694</td>\n",
       "      <td>1.756031</td>\n",
       "      <td>0.021056</td>\n",
       "      <td>1.350401</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>51</th>\n",
       "      <td>(root vegetables)</td>\n",
       "      <td>(other vegetables)</td>\n",
       "      <td>0.108998</td>\n",
       "      <td>0.193493</td>\n",
       "      <td>0.047382</td>\n",
       "      <td>0.434701</td>\n",
       "      <td>2.246605</td>\n",
       "      <td>0.026291</td>\n",
       "      <td>1.426693</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>(frozen vegetables)</td>\n",
       "      <td>(whole milk)</td>\n",
       "      <td>0.048094</td>\n",
       "      <td>0.255516</td>\n",
       "      <td>0.020437</td>\n",
       "      <td>0.424947</td>\n",
       "      <td>1.663094</td>\n",
       "      <td>0.008149</td>\n",
       "      <td>1.294636</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                             antecedants         consequents  \\\n",
       "122           (other vegetables, yogurt)        (whole milk)   \n",
       "17                              (butter)        (whole milk)   \n",
       "25                                (curd)        (whole milk)   \n",
       "116  (other vegetables, root vegetables)        (whole milk)   \n",
       "115        (whole milk, root vegetables)  (other vegetables)   \n",
       "29                       (domestic eggs)        (whole milk)   \n",
       "109                 (whipped/sour cream)        (whole milk)   \n",
       "91                     (root vegetables)        (whole milk)   \n",
       "51                     (root vegetables)  (other vegetables)   \n",
       "33                   (frozen vegetables)        (whole milk)   \n",
       "\n",
       "     antecedent support  consequent support   support  confidence      lift  \\\n",
       "122            0.043416            0.255516  0.022267    0.512881  2.007235   \n",
       "17             0.055414            0.255516  0.027555    0.497248  1.946053   \n",
       "25             0.053279            0.255516  0.026131    0.490458  1.919481   \n",
       "116            0.047382            0.255516  0.023183    0.489270  1.914833   \n",
       "115            0.048907            0.193493  0.023183    0.474012  2.449770   \n",
       "29             0.063447            0.255516  0.029995    0.472756  1.850203   \n",
       "109            0.071683            0.255516  0.032232    0.449645  1.759754   \n",
       "91             0.108998            0.255516  0.048907    0.448694  1.756031   \n",
       "51             0.108998            0.193493  0.047382    0.434701  2.246605   \n",
       "33             0.048094            0.255516  0.020437    0.424947  1.663094   \n",
       "\n",
       "     leverage  conviction  \n",
       "122  0.011174    1.528340  \n",
       "17   0.013395    1.480817  \n",
       "25   0.012517    1.461085  \n",
       "116  0.011076    1.457687  \n",
       "115  0.013719    1.533320  \n",
       "29   0.013783    1.412030  \n",
       "109  0.013916    1.352735  \n",
       "91   0.021056    1.350401  \n",
       "51   0.026291    1.426693  \n",
       "33   0.008149    1.294636  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rules.sort_values( 'confidence', ascending = False)[0:10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### From the first rule, we can infer that the probability that a customer buys (whole milk), given that he/she has bought (yogurt, other vegetables), is about 0.51. These rules can now be used to create strategies such as keeping the items together on store shelves or cross-selling."
   ]
  },
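  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The metrics in the rules table can be reproduced by hand from the support values. The sketch below uses the standard definitions of confidence, lift, leverage, and conviction, with the three support values copied from the row for (other vegetables, yogurt) and (whole milk) in the table above. The variable names are chosen only for this illustration, and small differences from the table are due to rounding of the copied values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Values copied from the rules table above\n",
    "antecedent_support = 0.043416  # support of (other vegetables, yogurt)\n",
    "consequent_support = 0.255516  # support of (whole milk)\n",
    "rule_support = 0.022267        # support of all three items together\n",
    "\n",
    "# Probability of the consequent given the antecedent\n",
    "confidence = rule_support / antecedent_support\n",
    "\n",
    "# How much more often the items occur together than if they were independent\n",
    "lift = confidence / consequent_support\n",
    "\n",
    "# Difference between the observed support and the support expected under independence\n",
    "leverage = rule_support - antecedent_support * consequent_support\n",
    "\n",
    "# Ratio of expected to observed frequency of the antecedent occurring without the consequent\n",
    "conviction = (1 - consequent_support) / (1 - confidence)\n",
    "\n",
    "print(round(confidence, 3), round(lift, 3), round(leverage, 4), round(conviction, 3))"
   ]
  },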
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}