# Calculation formulas

Welcome to the KensoBI datasource math formula documentation! This documentation contains comprehensive information about the available mathematical formulas and their SQL implementation.

## Table of Contents

- Sample Mean Formula
- Average Range Formula
- Standard Deviation Formulas
- Mean of Means Formula
- Capability Indices Formulas
- Control Limits Calculation

## Sample Mean Formula

The sample mean ($\bar{X}$) is a measure of central tendency, representing the average value of a set of sample data.

### Formula

The formula for the sample mean is given by:

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

Where:

- $\bar{X}$ is the sample mean,
- $n$ is the number of data points in the sample,
- $X_i$ represents each individual data point.

In this formula, $\sum$ represents the sum, and $\bar{X}$ denotes the mean.

#### SQL query:

```
'avg(measurementValue) as sampleAgg'
```

## Average Range Formula

The average range ($\bar{R}$) in statistical process control is calculated as the average of the individual ranges. The range is the difference between the maximum and minimum values in a sample.

### Formula

The formula for average range is given by:

$\bar{R} = \frac{R_1 + R_2 + \ldots + R_k}{k}$

Where:

- $\bar{R}$ is the average range,
- $R_1, R_2, \ldots, R_k$ are the individual ranges in each of the $k$ samples, and
- $k$ is the number of samples.

This formula represents the central tendency of the range values and is commonly used in control chart calculations, such as in the context of an R-chart in statistical process control.

#### SQL query:

```
'max(sampleAgg)-min(sampleAgg) as range'
```

## Standard Deviation Formulas

The standard deviation ($\sigma$) is a measure of the amount of variation or dispersion in a set of values.

### Population Standard Deviation Formula

The formula for the population standard deviation is given by:

$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(X_i - \mu)^2}{N}}$

Where:

- $\sigma$ is the population standard deviation,
- $N$ is the number of data points in the population,
- $X_i$ represents each individual data point,
- $\mu$ is the mean of the population.

### Sample Standard Deviation Formula

The formula for the sample standard deviation is given by:

$s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}}$

Where:

- $s$ is the sample standard deviation,
- $n$ is the number of data points in the sample,
- $X_i$ represents each individual data point,
- $\bar{X}$ is the mean of the sample.

In both formulas, $\sqrt{}$ denotes the square root, $\sum$ represents the sum, and $\bar{X}$ denotes the mean.

#### SQL query:

```
'stddev_pop(sampleAgg) as stdDev'
```

## Sample Range Formula

The sample range ($R$) is a measure of the spread or dispersion of a set of sample data.

### Formula

The formula for the sample range is given by:

$R = X_{\text{max}} - X_{\text{min}}$

Where:

- $R$ is the sample range,
- $X_{\text{max}}$ is the maximum value in the sample,
- $X_{\text{min}}$ is the minimum value in the sample.

In this formula, $X_{\text{max}}$ represents the maximum value, and $X_{\text{min}}$ represents the minimum value in the sample.

#### SQL query:

```
'max(sampleAgg)-min(sampleAgg) as range'
```

## Mean of Means Formula

The mean of means ($\bar{\bar{X}}$) is a measure that represents the average of sample means across multiple groups or samples.

### Formula

The formula for the mean of means is given by:

$\bar{\bar{X}} = \frac{\sum_{i=1}^{k} \bar{X}_i}{k}$

Where:

- $\bar{\bar{X}}$ is the mean of means,
- $k$ is the number of groups or samples,
- $\bar{X}_i$ represents the mean of the $i^{th}$ group or sample.

In this formula, $\sum$ represents the sum, and $\bar{\bar{X}}$ denotes the mean of means.

#### SQL formula:

```
'avg(sampleAgg) as mean'
```

## Capability Indices Formulas

Capability indices are statistical measures used to assess the ability of a process to meet specifications.

### Cp Formula

The Cp (Process Capability) index is calculated as:

$Cp = \frac{{\text{{USL}} - \text{{LSL}}}}{{6 \times \sigma}}$

#### SQL query:

```
'(c.usl - c.lsl)/(6*stddev_pop(sampleAgg)) as cp'
```

### Pp Formula

The Pp (Potential Process Capability) index is calculated as:

$Pp = \frac{{\text{(nominal + USL)} - \text{(nominal + LSL)}}}{{6 \times \sigma_x}}$

#### SQL query:

```
'((c.nominal + c.usl) - (c.nominal + c.lsl)) / (6 * stddev_pop(sampleAgg)) AS pp'
```

### Cpk Formula

The Cpk (Process Capability Index) index is calculated as:

$Cpk = \min\left(\frac{\text{USL} - \bar{\bar{x}}}{{3 \times \sigma}}, \frac{\bar{\bar{x}} - \text{LSL}}{{3 \times \sigma}}\right)$

#### SQL query:

```
'LEAST (((c.nominal + c.usl) - avg(sampleAgg))/(3*stddev_pop(sampleAgg)), (avg(sampleAgg) - (c.nominal + c.lsl))/(3*stddev_pop(sampleAgg))) as cpk'
```

### Ppk Formula

The Ppk (Potential Process Capability Index) index is calculated as:

$Ppk = \min\left(\frac{\text{(nominal + USL)} - \bar{\bar{x}}}{{3 \times \sigma_x}}, \frac{\bar{\bar{x}} - \text{(nominal + LSL)}}{{3 \times \sigma_x}}\right)$

#### SQL query:

```
'LEAST (((c.nominal + c.usl) - avg(sampleAgg))/(3 * stddev_pop(sampleAgg)), (avg(sampleAgg) - (c.nominal + c.lsl))/(3*stddev_pop(sampleAgg))) as ppk'
```

## Control Limits Calculation

It's important to note that X-bar charts can have up to four control limits, depending on the type of chart and the desired analysis.

### X-bar charts

The control limits for **X-bar charts** are calculated as follows:

$UCL_{Rbar} = \bar{x} + A_2\bar{R}$

$UCL_{Sbar} = \bar{x} + A_3\bar{S}$

#### SQL query:

```
'avg(sampleAgg) + ($a2_xbar_limit_range * (max(sampleAgg)-min(sampleAgg))) as ucl_rbar'
'avg(sampleAgg) + ($a3_xbar_limit_sigma * stddev_pop(sampleAgg)) as ucl_sbar'
```

$LCL_{Rbar} = \bar{x} - A_2\bar{R}$

$LCL_{Sbar} = \bar{x} - A_3\bar{S}$

#### SQL query:

```
'avg(sampleAgg) - ($a2_xbar_limit_range * (max(sampleAgg)-min(sampleAgg))) as lcl_rbar'
'avg(sampleAgg) - ($a3_xbar_limit_sigma * stddev_pop(sampleAgg)) as lcl_sbar'
```

### R-bar charts

For **R-bar charts**, the control limits are calculated as follows:

$UCL_R = D_4\bar{R}$

$LCL_R = D_3\bar{R}$

#### SQL query:

```
'$d4_range_ucl * (max(sampleAgg)-min(sampleAgg)) as ucl'
'$d3_range_lcl * (max(sampleAgg)-min(sampleAgg)) as lcl'
```

### S-bar charts

For **S-bar charts**, the control limits are calculated as follows:

$UCL_S = B_4\bar{S}$

$LCL_S = B_3\bar{S}$

#### SQL query:

```
'$b4_sigma_ucl * stddev_pop(sampleAgg) as ucl'
'$b3_sigma_lcl * stddev_pop(sampleAgg) as lcl'
```

### Constants

The values $A2, A3, D3, D4, B3, B4$ are constants configured for calculations in the SPC panel. The values are also available as an SQL script.