Creating a scatterplot

Let's see what factors might influence Drug, the target variable. As a researcher, you know that the concentrations of sodium and potassium in the blood are important factors. Since these concentrations are both numeric values, you can create a scatterplot of sodium versus potassium that uses the drug categories as a color overlay.

Figure 1. Plot node
  1. Place a Plot node on the canvas and connect it to the drug1n.csv Data Asset node. Then double-click the Plot node to edit its properties.
  2. Select Na as the X field, K as the Y field, and Drug as the Color (overlay) field. Click Save.
  3. Hover over the Plot node, then click the overflow menu and select Run. A plot chart is added to the Outputs pane.

    The plot clearly shows a threshold: above it, drug Y is always the correct drug, and below it, drug Y is never the correct drug. The threshold is not a fixed value of either field. It is a ratio, the ratio of sodium (Na) to potassium (K).

    Figure 2. Scatterplot of drug distribution
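
The ratio threshold that the scatterplot suggests can also be checked outside the flow. The following Python sketch is only an illustration, not part of the product: the sample records are hypothetical values invented for this example (they are not rows from drug1n.csv), the `drugY`-style labels are an assumed encoding of the Drug field, and `na_to_k` and `separating_threshold` are hypothetical helper names. It computes Na/K for each record and reports a ratio that separates drug Y from the other drugs, if one exists.

```python
# Hypothetical sample records, invented for illustration only.
# The field names mirror the Na, K, and Drug fields used in the Plot node.
records = [
    {"Na": 0.80, "K": 0.040, "Drug": "drugY"},
    {"Na": 0.75, "K": 0.030, "Drug": "drugY"},
    {"Na": 0.60, "K": 0.060, "Drug": "drugX"},
    {"Na": 0.55, "K": 0.070, "Drug": "drugA"},
    {"Na": 0.70, "K": 0.050, "Drug": "drugC"},
]


def na_to_k(record):
    """Return the sodium-to-potassium ratio for one record."""
    return record["Na"] / record["K"]


def separating_threshold(rows, target="drugY"):
    """Return a ratio that lies between the target drug and all other drugs.

    Returns None if either group is empty or if the ratios of the two
    groups overlap, in which case no single threshold separates them.
    """
    target_ratios = [na_to_k(r) for r in rows if r["Drug"] == target]
    other_ratios = [na_to_k(r) for r in rows if r["Drug"] != target]
    if not target_ratios or not other_ratios:
        return None
    lowest_target = min(target_ratios)
    highest_other = max(other_ratios)
    if lowest_target <= highest_other:
        return None
    # Any value in the gap works; the midpoint is an arbitrary choice.
    return (lowest_target + highest_other) / 2


if __name__ == "__main__":
    print(separating_threshold(records))
```

If the function returns a value for your own data, every record with a ratio above that value is drug Y and every record below it is not, which matches what the plot shows. A return value of None means that the two groups overlap and a single ratio does not separate them.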