Research Activities

Areas of Research Interest

I am interested in solving practical problems in the computational sciences (e.g., data science, actuarial science, computer science, applied probability, mathematics, and engineering). My publications include five books, eight book chapters, and more than 100 articles (see recent publications or download complete CV). The methods for outlier detection, the influence measure, and the potential-residual plot have been implemented by others in several statistics packages. See, for example:

  • Stata,
  • SYSTAT,
  • Data Desk, and
  • the following functions in R: olsrr::ols_hadi, olsrr::ols_plot_hadi, olsrr::ols_plot_resid_pot, robustX::BACON, robustX::mvBACON, wbacon::wbacon-package, wbacon::wbacon, wbacon::wBACON_reg, wbacon::plot.wbaconmv, wbacon::plot.wbaconlm, etc.
Areas of my research interests include:

    Probability, Statistics, and Data Science:

    Computer Science:

    Mathematics:

    Finance and Actuarial Science:

    Engineering:

    Interdisciplinary:

    Regression Analysis

    Regression Diagnostics

    Categorical and Mixed Data

    Robust Statistics and Outlier Detection

    Although it is customary to assume that data are homogeneous, in fact they often contain outliers or subgroups. Scientists and philosophers have recognized for at least 380 years that real data are not homogeneous and that the identification of outliers is an important step in the progress of scientific understanding. Methods that deal with robust estimation and outlier detection are presented in the following articles:

    Robust Regression Methods:

    Detection of Outliers in Large Data Sets:

    Detection of Outliers in Multivariate Data:

    Cluster Analysis:

    Detection of Outliers in Regression Data:

    Graphical Methods for the Detection of Outliers:

    Parameter and Quantile Estimation

    Fatigue and Lifetime Data Analysis

    Reliability of Engineering Structures

    Extreme Value Distributions

    Matrix Algebra

    Perturbed Eigenvalue Problem

    Generalized Inverses

    Constructing Composite Indices

    Statistical Analysis of Employment Discrimination Data

    Probability

    Expert Systems and Probabilistic Reasoning

    Feature Selection

    Neural and Functional Networks

    Bayesian and Markov Networks

    Data Mining and Visualization

    Statistics and Computer Science Literature

    Software Available

    S-PLUS Code:

    function(X) {
    # -----------------------------------------------------------------
    #  Hadi, Ali S. (1994), "A Modification of a Method for the
    #  Detection of Outliers in Multivariate Samples," Journal of the
    #  Royal Statistical Society (B), 56(2), 393-396.
    # -----------------------------------------------------------------
      n <- dim(X)[1]
      p <- dim(X)[2]
      h <- trunc((n + p + 1)/2)
      id <- 1:n
      r <- p
      out <- 0
      cf <- (1 + ((p + 1)/(n - p)) + (2/(n - 1 - (3*p))) )^2
    # cf <- (1 + ((p + 1)/(n - p)) + (1/(n - p - h)) )^2
      alpha <- 0.05
      tol <- max(10^-(p+5), 10^-12)
    # -----------------------------------------------------------------
    # **  Compute Mahalanobis distance
    # -----------------------------------------------------------------
      C <- apply(X, 2, mean)
      S <- var(X)
      if (det(S) < tol) stop("covariance matrix of X is singular (det(S) < tol)")
      D <- mahalanobis(X, C, S)
      cv <- qchisq(1-(alpha/n), p)
    # Indices whose squared distance reaches the Bonferroni-adjusted cutoff
      mah.out <- as.vector(which(D >= cv))
      mah <- sqrt(D)
      Xbar <- C
      Covariance <- S
    # ----------------------------------------------------------------
    #  **  Step 0
    # ----------------------------------------------------------------
    #  **  Compute Di(Cm, Sm)
      C <- apply(X, 2, median)
    # Center each column at its median (sweep subtracts C[j] from column j)
      Y <- sweep(as.matrix(X), 2, C)
      S <- ((n - 1)^-1)*(t(Y) %*% Y)
      D <- mahalanobis(X, C, S)
      Z <- sort.list(D)
    # ----------------------------------------------------------------
    #  **  Compute Di(Cv, Sv)
      repeat {
        Y <- X[Z[1:h], ]
        C <- apply(Y, 2, mean)
        S <- var(Y)
        if (det(S) > tol) {
           D <- mahalanobis(X, C, S)
           Z <- sort.list(D); break }
        else h <- h + 1
        }
    # ----------------------------------------------------------------
    #  **  Step 1
    # ----------------------------------------------------------------
      repeat {
        r <- r + 1
        if ( h < r) break
        Y <- X[Z[1:r],]
        C <- apply(Y, 2, mean)
        S <- var(Y)
        if (det(S) > tol) {
           D <- mahalanobis(X, C, S)
           Z <- sort.list(D) }
        }
    # ----------------------------------------------------------------
    #  **  Step 3
    # ----------------------------------------------------------------
    #  **  Compute Di(Cb, Sb)
      repeat {
        Y <- X[Z[1:h],]
        C <- apply(Y, 2, mean)
        S <- var(Y)
        if (det(S) > tol) {
           D <- mahalanobis(X, C, S)
           Z <- sort.list(D)
           if (D[Z[h + 1]] >= (cf*qchisq(1-(alpha/n), p))) {
                out <- Z[(h + 1) : n]
                break }
           else { h <- h + 1
                  if (n <= h) break }
           }
        else { h <- h + 1
              if (n <= h) break }
        }
      D <- sqrt(D/cf)
      dst <- cbind(id, mah, D)
      Outliers <- out
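      Cb <- C
      Sb <- S
      Distances <- dst
    # Return the results as a named list (a multi-argument return() is
    # not valid in R)
      return(list(Xbar = Xbar, Covariance = Covariance, mah.out = mah.out,
                  Outliers = Outliers, Cb = Cb, Sb = Sb, Distances = Distances))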
    }
    # ----------------------------------------------------------------
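
The first stage of the listing above (classical Mahalanobis distances compared with a Bonferroni-adjusted chi-square cutoff) can be tried on its own. The following is a minimal sketch in base R, not the full forward-search procedure: the function name flag_classical_outliers, the simulated data, and the planted point are all hypothetical and serve only as illustration.

```r
# Minimal sketch of the classical screening step only (base R).
# flag_classical_outliers is a hypothetical name, not part of the listing above.
flag_classical_outliers <- function(X, alpha = 0.05) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  center <- colMeans(X)                    # classical location estimate
  S <- var(X)                              # classical covariance estimate
  d2 <- mahalanobis(X, center, S)          # squared Mahalanobis distances
  cutoff2 <- qchisq(1 - alpha / n, df = p) # Bonferroni-adjusted chi-square quantile
  list(distance = sqrt(d2),
       cutoff = sqrt(cutoff2),
       flagged = as.vector(which(d2 >= cutoff2)))
}

# Made-up data: 50 bivariate standard normal points plus one planted far point
set.seed(1)
X <- rbind(matrix(rnorm(100), ncol = 2), c(10, 10))
res <- flag_classical_outliers(X)
print(res$flagged)
```

Because the mean and covariance here are computed from all observations, including any outliers, this step corresponds to the mah.out output of the listing; the Outliers output comes from the later steps, which re-estimate location and scatter from a growing subset of the closest observations.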
    			
