averaged return

averaged current return

averaged trailing return

color scheme

discounted returns

distribution

enhanced return

inflation model

maximum loss

median loss

metric

mid-mark

Monte Carlo simulation

rank

regression line (r)

resistance to index drop (r)

score

standard deviation

survival

trailing return

weighting

weighted score

**Averaged return** is defined for this site as the average of all of the possible returns of 3 months or more within the time period under inspection. So, the averaged return for the last year would be the average of all returns of 3, 4, ..., 12 months (which is roughly the average of 10 three-month returns + 9 four-month returns + ... + 1 twelve-month return = 55 returns). This measure attempts to give the return investors were more likely to have obtained over the time period under inspection, since investors are constantly entering and exiting the fund. This metric also has less "whip-sawing" than the traditional *trailing return* (see graph). D*ata*M*ining*G*raphics Complete* allows the user to change the minimum holding period.
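
The count of 55 returns can be checked with a minimal Python sketch. It assumes that `values` is a list of month-end fund values (13 values span 12 months) and that each holding-period return is a simple, non-annualized change; the site states neither detail, and the function name `averaged_return` is hypothetical.

```python
def averaged_return(values, min_months=3):
    """Return (mean, count) over every holding period of at least min_months.

    Assumption: values are month-end fund values and each return is a simple
    cumulative change, not annualized.
    """
    months = len(values) - 1  # number of months spanned by the series
    returns = [
        values[start + held] / values[start] - 1.0
        for held in range(min_months, months + 1)  # 3, 4, ..., 12 months
        for start in range(months - held + 1)      # every possible entry month
    ]
    return sum(returns) / len(returns), len(returns)


# Thirteen month-end values cover the last year.
mean, count = averaged_return([100.0 * 1.01 ** m for m in range(13)])
```

With 13 month-end values the list comprehension produces 10 + 9 + ... + 1 = 55 holding-period returns, matching the count given above.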

**Averaged trailing return** is defined for this site as the average of all of the returns of a set length of time over a selected time period. So, a rolling one-year return for the last five years would be the average of all one-year returns for the last five years (which is roughly the average of 12 x (5-1) = 48 one-year returns). This measure reduces the effect of extremes, both high and low, in returns.
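
As an illustration only, the rolling average might be computed as below. The sketch assumes month-end fund values and simple (non-annualized) window returns; `averaged_trailing_return` is a hypothetical name. Counting windows from month-end to month-end gives 49 one-year returns for five years, close to the rough figure of 48 quoted above.

```python
def averaged_trailing_return(values, window_months=12):
    """Return (mean, count) over every window of exactly window_months.

    Assumption: values are month-end fund values; the exact window count
    depends on how the end points of the period are treated.
    """
    returns = [
        values[i + window_months] / values[i] - 1.0
        for i in range(len(values) - window_months)
    ]
    return sum(returns) / len(returns), len(returns)


# Sixty-one month-end values span five years of steady 0.5% monthly growth.
five_years = [100.0 * 1.005 ** m for m in range(61)]
mean, count = averaged_trailing_return(five_years)
```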

The **color scheme** for each of the metrics in our products consists of 9 colors, ranging from blue (best) to red (worst) (example). Blues highlight the *score* as being above the *mid-mark* of the metric, green as enclosing the mid-mark, and yellow, orange, and red as being below the mid-mark. For a metric score to warrant either the highest or lowest colors, the score must be significantly different from its neighbors. D*ata*M*ining*G*raphics* uses a proprietary algorithm for this determination. In practice, all metrics have red-colored scores, but not all have dark-blue-colored scores.

The **discounted return** calculates the averaged return over the time period of interest, after removing (discounting) the top 2.5% of the month to month returns. The purpose of this measure is to reduce the actual yield by what can arguably be termed fortuitous circumstances, which are unlikely to recur. The resultant return is also one more likely to be had by a new investor in the fund; 2.5% of 10 years is but 3 months, so the likelihood that a short-term investor benefited from those high returns is small. The D*ata*M*ining*G*raphics Complete* product allows the user to change the discount percent and to add additional discounting percentages. See also the *enhanced return* for the analogous treatment of dropping the bottom month to month returns.

The **distribution** of data is the frequency with which any value in a set of data occurs. One can also speak of the probability density or frequency distribution of the data. Distributions have characteristic properties, so the particular distribution that a data set conforms to is valuable information. The distribution of monthly returns for mutual funds is similar to that of a *normal distribution*. However, the distributions of the metrics are typically bimodal.

The **enhanced return** calculates the averaged return over the time period of interest, after removing (discounting) the bottom 2.5% of the month to month returns. The purpose of this measure is to increase the actual return by removing those month to month drops that occurred during "market crashes" or "bad luck". See also the *discounted return* for the analogous treatment of dropping the top month to month returns.
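
The trimming step shared by the discounted and enhanced returns can be sketched as follows. This is a simplified illustration, not the site's calculation: it assumes a list of monthly returns as decimals, rounds the number of months to drop, and simply compounds what is left instead of recomputing the full averaged return. The names `drop_extreme_months` and `compound` are hypothetical.

```python
import math


def drop_extreme_months(monthly_returns, fraction=0.025, side="top"):
    """Remove the highest ("top") or lowest ("bottom") fraction of months."""
    n_drop = round(fraction * len(monthly_returns))  # 2.5% of 120 months = 3
    if n_drop == 0:
        return list(monthly_returns)
    # Month positions ordered from the worst return to the best return.
    order = sorted(range(len(monthly_returns)), key=lambda i: monthly_returns[i])
    dropped = set(order[-n_drop:] if side == "top" else order[:n_drop])
    return [r for i, r in enumerate(monthly_returns) if i not in dropped]


def compound(monthly_returns):
    """Cumulative return from compounding the given monthly returns."""
    return math.prod(1.0 + r for r in monthly_returns) - 1.0


# Ten years of returns: mostly +1%, with three very good and three very bad months.
history = [0.01] * 114 + [0.20, 0.15, 0.10] + [-0.20, -0.15, -0.10]
discounted = compound(drop_extreme_months(history, side="top"))
enhanced = compound(drop_extreme_months(history, side="bottom"))
actual = compound(history)
```

In this example the discounted figure falls below the actual compounded return and the enhanced figure rises above it, which is the intended direction of each adjustment.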

A simple **inflationary** model was used for the Monte Carlo simulations for survivability estimates. Its characteristics were: random cycles of 2 to 6 years, a mean inflation range of 0.6% to 14% per year, and a 30-year mean of 3.4% per year. Several 30-year inflation simulations generated by the model can be viewed in the graph.
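
One way such a cyclic model could look is sketched below. Only the cycle lengths, the 0.6% to 14% range, and the 3.4% long-run mean come from the description above; holding the rate constant within a cycle and drawing it from a scaled beta distribution are assumptions made for this illustration, and `inflation_path` is a hypothetical name.

```python
import random


def inflation_path(rng, years=30, low=0.006, high=0.14, mean=0.034, alpha=1.0):
    """Return one simulated list of yearly inflation rates.

    Assumptions: each cycle lasts 2 to 6 years with a constant rate, and the
    rate is low + (high - low) * Beta(alpha, beta), with beta chosen so that
    the expected rate equals the stated long-run mean.
    """
    beta = alpha * ((high - low) / (mean - low) - 1.0)
    path = []
    while len(path) < years:
        cycle_years = rng.randint(2, 6)  # inclusive on both ends
        rate = low + (high - low) * rng.betavariate(alpha, beta)
        path.extend([rate] * cycle_years)
    return path[:years]


rng = random.Random(1)  # seeded so that the example is repeatable
example_path = inflation_path(rng)
```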

The **maximum** loss is the largest loss that an investor could have experienced through ownership of that Fund over the course of the last 9+ years (from 1997). This is equivalent to the ultimate "buy high, sell low".

The **median** loss is the median of all possible losses that investors could have experienced through ownership of that Fund over the course of the last 9+ years (from 1997).
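
Both loss metrics can be illustrated with one small sketch. It assumes that a "possible loss" is any pair of a purchase date and a later sale date at a lower value, taken from a list of periodic fund values; the site does not spell out this detail, and `possible_losses` is a hypothetical name.

```python
from statistics import median


def possible_losses(values):
    """Fractional loss for every buy date followed by a lower-valued sell date."""
    return [
        1.0 - values[sell] / values[buy]
        for buy in range(len(values))
        for sell in range(buy + 1, len(values))
        if values[sell] < values[buy]
    ]


values = [100.0, 120.0, 90.0, 110.0, 80.0]
losses = possible_losses(values)
maximum_loss = max(losses)    # buying at 120 and selling at 80
median_loss = median(losses)  # middle of the seven possible losses
```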

The **mid-mark** of a metric is the average of the *range of the metric* after the top and bottom (blue and red banded) scores have been removed. The averaged return graph illustrates this term. The mid-mark typically is not representative of either the median or average of the metric. The distributions of our metrics for mutual funds are typically skewed and bimodal, so medians and averages have less meaning.

A **metric** is any calculation performed on the data, such as determining a fund's averaged return, or maximum loss. When all of the data from a metric is analysed, a *score* can be calculated for each fund for that metric.

**Monte Carlo** is a computer simulation technique used to estimate likely outcomes given probabilistic inputs. In our case of estimating the probability of *survival* of an initial investment in a particular mutual fund, the inputs were the actual distribution of the fund's month to month variations and a simple *inflation model*. For each estimation of the survival metric, 3,000 30-year simulations were used.

D*ata*M*ining*G*raphics* does not **rank** funds. Instead, we prefer to use *scores* as the means of gauging relative performance. Ranking tends to imply a linear scale of performance (as do percentiles), when this is rarely the case. Typically, the extremes of the measures are not linear with respect to the remaining data. See *color scheme* and the example given there for further information.

The **regression line (r)** is a measure of ...

The **resistance** to the index drop indicates what percentage of the time the fund did not decline in value when the index did. This metric is offered as an alternative to the correlation between a fund and its index.
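
A minimal sketch of this percentage, assuming paired lists of periodic returns for the fund and its index, and treating a return of exactly zero as "did not decline". The function name `resistance_to_index_drop` is hypothetical.

```python
def resistance_to_index_drop(fund_returns, index_returns):
    """Percentage of index-down periods in which the fund did not decline."""
    down_periods = [
        fund for fund, index in zip(fund_returns, index_returns) if index < 0
    ]
    if not down_periods:
        return None  # the index never dropped, so the metric is undefined
    held_up = sum(1 for fund in down_periods if fund >= 0)
    return 100.0 * held_up / len(down_periods)


fund = [0.01, -0.02, 0.00, 0.03, -0.01]
index = [-0.01, -0.03, -0.02, 0.02, -0.01]
resistance = resistance_to_index_drop(fund, index)  # 2 of 4 down periods
```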

A fund is given a **score** for each of the metrics. Each metric is calculated in such a manner that the fund having the most desirable outcome (for example, the greatest averaged return or the smallest maximum loss) is given a score of 10, and the fund having the worst outcome a score of 0. Enhanced meaning is given to the score through our *color scheme*. Unlike percentiles, an equal difference in the score represents an equal difference in the actual metric (for example, the averaged return).
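
The properties described here (best fund at 10, worst at 0, equal score differences for equal metric differences) are consistent with a linear rescaling, sketched below as an assumption; the site's actual calculation is not published. `scores` and `higher_is_better` are hypothetical names.

```python
def scores(metric_values, higher_is_better=True):
    """Linearly rescale metric values so the best is 10 and the worst is 0."""
    low, high = min(metric_values), max(metric_values)
    if high == low:
        raise ValueError("scores are undefined when all funds are equal")
    if higher_is_better:
        return [10.0 * (value - low) / (high - low) for value in metric_values]
    return [10.0 * (high - value) / (high - low) for value in metric_values]


return_scores = scores([2.0, 4.0, 6.0])                       # averaged returns
loss_scores = scores([10.0, 30.0, 20.0], higher_is_better=False)  # maximum losses
```

For a metric such as maximum loss, where a smaller value is more desirable, the scale is simply reversed.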

**Standard deviation** is a measure of the variability of data, in this case the month to month returns, from the mean of the data. This information is most useful when the *distribution* of the data is also known.

**Survival** is the estimated probability that an individual would **not** outlive an initial investment in that particular mutual fund over the course of thirty years. *Monte Carlo* simulation was used for these estimations. For the 600 or so funds that have been around for the last 20 years, the parameters for each simulation were as follows: an *inflationary model*, biannual withdrawals from the investment account at the inflation-adjusted rate of 4% of the initial amount per year, and a declaration that the investment account was consumed at any time when the amount remaining in the account after a withdrawal was less than 10% of the initial amount. For the remaining funds, an estimate is given based on a "best fit" of the month to month percentage change distribution of the fund to one of those 600+ funds having 20 years of historical data. These estimates are less certain than those for which the 20-year historical data was used.
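
The simulation loop can be sketched roughly as follows. Several choices here are assumptions, not the site's method: "biannual" is read as twice a year, monthly returns are resampled with replacement from the fund's history, and inflation defaults to a constant 3.4% per year unless a list of yearly rates is supplied. The names `survives` and `survival_probability` are hypothetical.

```python
import random


def survives(monthly_returns, rng, years=30, inflation_by_year=None):
    """Simulate one 30-year path; True if the account is never consumed."""
    if inflation_by_year is None:
        inflation_by_year = [0.034] * years  # assumption: constant inflation
    balance = 1.0      # initial investment, normalized to 1
    price_level = 1.0  # cumulative inflation since the start
    for month in range(years * 12):
        balance *= 1.0 + rng.choice(monthly_returns)
        price_level *= (1.0 + inflation_by_year[month // 12]) ** (1.0 / 12.0)
        if (month + 1) % 6 == 0:
            # Half of the yearly 4% withdrawal, adjusted for inflation.
            balance -= 0.02 * price_level
            if balance < 0.10:  # under 10% of the initial amount
                return False
    return True


def survival_probability(monthly_returns, n_sims=3000, seed=0):
    """Fraction of simulated paths on which the investment survives."""
    rng = random.Random(seed)
    survived = sum(survives(monthly_returns, rng) for _ in range(n_sims))
    return survived / n_sims
```

A call such as `survival_probability(history)` would take a list of the fund's historical monthly returns as decimals; the default of 3,000 paths mirrors the number of simulations mentioned in the Monte Carlo entry.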

Each metric can be emphasized, or not, through **weightings**. The user assigns each metric a number, from 0 on up, to reflect his or her judgement of the importance of the metric to the screening task under consideration. The relative importance of any metric would be its number (weighting) divided by the total of all the weightings of all of the metrics. In practice, weighting of many metrics by nearly equal amounts is not as effective as limiting the number of metrics to weigh or increasing the range of weighting used for the metrics. D*ata*M*ining*G*raphics* provides suggested weightings as starting points for fund evaluations based on more or less risk tolerance.

The **weighted score** is derived from the score and its associated weighting (including none) of the metrics available for scoring. Since the score is dependent on whatever weightings were used, it should not be interpreted as a recommendation.
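
As a sketch, the weighted score can be read as a weighted average of the metric scores, with each metric's relative importance equal to its weighting divided by the total of all weightings. Treating an unweighted metric as having a weighting of 0 is an assumption, and the function names are hypothetical.

```python
def relative_importance(weights):
    """Each weighting divided by the total of all weightings."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def weighted_score(metric_scores, weights):
    """Weighted average of a fund's scores; unweighted metrics count as 0."""
    total = sum(weights.get(name, 0.0) for name in metric_scores)
    if total == 0:
        raise ValueError("at least one metric needs a non-zero weighting")
    weighted_sum = sum(
        score * weights.get(name, 0.0) for name, score in metric_scores.items()
    )
    return weighted_sum / total


fund_scores = {"averaged return": 8.0, "maximum loss": 4.0, "survival": 10.0}
user_weights = {"averaged return": 3.0, "maximum loss": 1.0}
result = weighted_score(fund_scores, user_weights)  # (8*3 + 4*1) / 4
```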

The **averaged current return** is defined for this site as the average of all of the returns over a selected time period that include the last month in the time interval. So, an averaged current return of 1 year would be the average of the 12 returns of 1 year or less that include the last month (that is, the average of the 12 month, 11 month, 10 month, ... 1 month returns). This measure gives an indication of what the return would be on an investment made sometime within the last year.
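
A minimal sketch of this average, assuming month-end fund values and simple (non-annualized) returns that all end at the most recent value. `averaged_current_return` is a hypothetical name.

```python
def averaged_current_return(values, max_months=12):
    """Mean of the 1, 2, ..., max_months returns that end at the last value."""
    latest = values[-1]
    returns = [
        latest / values[-1 - months] - 1.0 for months in range(1, max_months + 1)
    ]
    return sum(returns) / len(returns)


# A fund that was flat for a year and then gained 10% in the final month:
# every investor who entered during the year has the same 10% return.
flat_then_up = [100.0] * 12 + [110.0]
current = averaged_current_return(flat_then_up)
```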

The **trailing return** is defined in the usual manner. It is the annualized return of the fund, including any distributions. This is calculated at a specific moment in time, typically 'ending yesterday', for a specified period of time, typically 1, 3, or 5 years.
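
For reference, the standard annualization formula is sketched below. It assumes that the start and end values already reflect reinvested distributions; `trailing_return` is a hypothetical name.

```python
def trailing_return(start_value, end_value, years):
    """Annualized return: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# A fund that grows from 100 to 121 over two years returns about 10% per year.
two_year = trailing_return(100.0, 121.0, 2)
```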