Results and Interpretation

Descriptive Statistics
The selected variables exhibit diverse distributions:
Figure 1. Descriptive statistics

These metrics indicate substantial variability in income and employment duration among loan applicants, which suggests that relationships among the variables may be nonlinear.

Distance Correlation Analysis
Distance correlation identifies both linear and nonlinear associations between variable pairs.
Figure 2. Distance correlation coefficients

Key findings:

  • Strongest Relationship: employ vs. income shows the highest distance correlation (dCor = 0.444), suggesting a moderate, possibly nonlinear relationship: more years with an employer tend to be associated with higher income.
  • Moderate Relationships: age vs. employ and age vs. income (both around dCor = 0.28) reflect mild dependencies, possibly due to accumulated experience or career progression with age.
  • Negligible or No Relationships: Debt-to-Income Ratio shows minimal association with any of the other variables, with dCor values close to zero and wide confidence intervals encompassing 0.
Distance Correlation Estimates
Figure 3. Distance correlation estimates
  • Distance Covariance (dCov) reflects the absolute magnitude of joint variability between variables in the distance metric space. Larger values indicate stronger joint dependence before normalization.
  • Distance Correlation (dCor) is a normalized version of dCov, ranging from 0 (no dependence) to 1 (perfect dependence). It enables comparison across variable pairs of different scales.
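
The two definitions above can be illustrated with a minimal NumPy sketch. It assumes the common biased (V-statistic) sample estimator for one-dimensional variables; the report does not state which software or estimator produced its figures, so the function names and the synthetic data below are illustrative assumptions, not the original analysis.

```python
import numpy as np

def centered_distances(x):
    """Double-centered matrix of pairwise absolute differences of a 1-D sample."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_covariance(x, y):
    """Sample dCov: square root of the mean product of centered distances."""
    a, b = centered_distances(x), centered_distances(y)
    return float(np.sqrt(max((a * b).mean(), 0.0)))

def distance_correlation(x, y):
    """Sample dCor in [0, 1]; returns 0.0 if either variable is constant."""
    denom = np.sqrt(distance_covariance(x, x) * distance_covariance(y, y))
    return 0.0 if denom == 0 else distance_covariance(x, y) / denom

# Synthetic illustration only (not the loan data): a symmetric, purely nonlinear link.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2
pearson = np.corrcoef(x, y)[0, 1]                # close to 0: no linear trend
dcor_xy = distance_correlation(x, y)             # clearly above 0: dependence is detected
dcor_scaled = distance_correlation(1000 * x, y)  # unchanged when x is rescaled
```

In this sketch dCov changes when a variable is rescaled, while dCor does not, which is why dCor is the quantity to compare across variable pairs.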
High dCov and dCor
  • The pair employ vs. income not only has the highest correlation (dCor = 0.444) but also a notable dCov (0.0026), showing substantial shared variability.
  • age vs. employ and age vs. income follow next, with lower dCov but still meaningful dCor values (~0.28), revealing moderate relationships despite differing scales.

Low dCov and dCor

Variable pairs involving debtinc have dCov and dCor values close to zero, which is consistent with debt-to-income ratio being independent of the other variables in this dataset.

  • While dCor shows how strongly variables are linked relative to their own scale, dCov can help reveal if the raw joint variability is trivial even if normalized correlation appears nonzero.
  • In the dataset, low dCov values (for example, income vs. debtinc: 0.0000) support the conclusion that these variables are essentially unrelated.
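
A near-zero dCov or dCor is usually paired with a significance check. One common option is a permutation test, sketched below; this is a swapped-in illustration, not the procedure behind the confidence intervals mentioned in this report, and the helper names and synthetic data are assumptions.

```python
import numpy as np

def dcor(x, y):
    """Biased sample distance correlation for two 1-D samples."""
    def centered(v):
        v = np.asarray(v, dtype=float)
        d = np.abs(v[:, None] - v[None, :])
        return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()
    a, b = centered(x), centered(y)
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max((a * b).mean(), 0.0) / denom))

def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Share of shuffled-y dCor values at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = dcor(x, y)
    # Shuffling y breaks any link to x while keeping both marginal distributions.
    hits = sum(dcor(x, rng.permutation(y)) >= observed for _ in range(n_perm))
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic, strongly dependent pair (not the loan data).
x = np.linspace(-1.0, 1.0, 80)
observed, p_value = permutation_pvalue(x, x ** 2)
```

A small p-value indicates dependence; a large one means the observed dCor is no larger than what shuffled, unrelated data would typically produce.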
Distance Variance
Figure 4. Distance variance
These values quantify the variability in pairwise distances within each variable. Notably:
  • age and employment exhibit the highest internal distance variability.
  • income has relatively low distance variance despite a wide numerical range, possibly due to normalization and a skewed distribution.
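
The effect of skew on distance variance can be illustrated with a small sketch. It assumes each variable is Min-Max scaled to [0, 1] before distances are taken, which the report implies but does not state; the data are synthetic and the function names are hypothetical.

```python
import numpy as np

def min_max(x):
    """Rescale a 1-D sample to the [0, 1] interval."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def distance_variance(x):
    """Sample distance variance: dCov of a variable with itself."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])
    a = d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()
    return float(np.sqrt((a * a).mean()))

# Evenly spread values versus a heavily right-skewed variable (synthetic).
even = np.linspace(0.0, 1.0, 200)
skewed = np.exp(np.linspace(0.0, 6.0, 200))  # raw range is roughly 1 to 403

dvar_even = distance_variance(min_max(even))
dvar_skewed = distance_variance(min_max(skewed))  # smaller: most points bunch near 0
```

After scaling, most of the skewed values sit close together near zero, so their pairwise distances are small and the distance variance drops even though the raw range is wide.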
Pairwise Distance Scatter Plot
The selected scatter plot (age vs. debtinc) provides a visual perspective on their pairwise distances:
Figure 5. Scatter plot of pairwise distances
  • The plot displays the Min-Max normalized distances for Age (X-axis) and Debt-to-Income Ratio (Y-axis).
  • There is no discernible pattern or clustering, which aligns with the very low distance correlation (dCor = 0.0021).
  • This supports the conclusion that Age and Debt-to-Income Ratio are effectively unrelated in this dataset.
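
The coordinates for a plot of this kind can be sketched as follows. The sketch assumes that Min-Max scaling is applied to the pairwise distances themselves (the report does not say whether the scaling was applied to the distances or to the raw variables), and it uses synthetic stand-ins named age and debtinc rather than the loan data.

```python
import numpy as np

def normalized_pairwise_distances(x):
    """Min-Max scaled absolute differences for every unordered pair of observations."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)  # each pair (i < j) exactly once
    d = np.abs(x[i] - x[j])
    span = d.max() - d.min()
    return (d - d.min()) / span if span > 0 else np.zeros_like(d)

# Synthetic stand-ins, generated independently of each other.
rng = np.random.default_rng(0)
age = rng.integers(20, 60, size=50)
debtinc = rng.uniform(0.0, 40.0, size=50)

x_coords = normalized_pairwise_distances(age)      # X-axis: distances in age
y_coords = normalized_pairwise_distances(debtinc)  # Y-axis: distances in debtinc
```

Each point of the scatter plot is one pair of applicants. With n observations there are n(n-1)/2 points, and the two coordinate arrays can be passed directly to any scatter plotting function.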