Difference between revisions of "Whitening and standardization"
(Created blank page) |
(Replaced content with "Data Standardization Standardizing a dataset involves rescaling the distribution of values so that the mean of observed values is 0 and the standard deviation is 1. It is...") Tags: Replaced Visual edit: Switched |
||
| (One intermediate revision by the same user not shown) | |||
| Line 1: | Line 1: | ||
Data Standardization | |||
Standardizing a dataset involves rescaling the distribution of values so that the mean of observed values is 0 and the standard deviation is 1. It is sometimes referred to as “whitening.” | |||
This can be thought of as subtracting the mean value or centering the data. | |||
Like normalization, standardization can be useful, and even required in some machine learning algorithms when your data has input values with differing scales. | |||
Standardization assumes that your observations fit a Gaussian distribution (bell curve) with a well behaved mean and standard deviation. You can still standardize your data if this expectation is not met, but you may not get reliable results. | |||
Standardization requires that you know or are able to accurately estimate the mean and standard deviation of observable values. You may be able to estimate these values from your training data. | |||
Latest revision as of 12:51, 13 September 2021
Data Standardization
Standardizing a dataset involves rescaling the distribution of values so that the mean of observed values is 0 and the standard deviation is 1. It is sometimes referred to as “whitening.”
This can be thought of as subtracting the mean value or centering the data.
Like normalization, standardization can be useful, and even required in some machine learning algorithms when your data has input values with differing scales.
Standardization assumes that your observations fit a Gaussian distribution (bell curve) with a well behaved mean and standard deviation. You can still standardize your data if this expectation is not met, but you may not get reliable results.
Standardization requires that you know or are able to accurately estimate the mean and standard deviation of observable values. You may be able to estimate these values from your training data.