# I. Introduction
The Gaussian distribution is also commonly known as the normal distribution, and it is well known that the height, weight, and even IQ of a group of people are reasonably consistent with it. However, data such as the fatigue life of structures are often far from Gaussian and more in line with the Weibull distribution. In [1] it was pointed out that the Weibull distribution is a full state distribution, i.e., it can describe not only left-skewed and right-skewed data but, to some extent, also symmetric data and data satisfying a power law. In this sense it is more versatile than the Gaussian distribution [2], [3] and plays a very important role, especially in fitting the fatigue life of structures. However, because of the difficulties encountered in determining the three parameters of the Weibull distribution, the problem has usually been sidestepped by taking the logarithm of the data so that they appear more consistent with the Gaussian distribution. This approach is problematic. This paper points out that, from a mathematical point of view, taking the logarithm of the original data is only a transformation of variables, but from a physical point of view it changes the structure of the data and the physical meaning of the fitted parameters, so it is not appropriate to replace the original data by their logarithm and fit a Gaussian distribution. To determine the three parameters of the Weibull distribution, graphical and analytical methods [4] were previously adopted. The former is inconvenient to use and has relatively large errors; the latter involves solving a system of three coupled transcendental equations which, even with the help of a computer, can still give inconsistent results. This problem can now be solved relatively well by the Z.T. Gao method proposed in [1].
# II. The Characteristics of the Gaussian Distribution
It is well known [4] that the Gaussian distribution is the distribution of a random variable $X$ whose PDF has the form

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \tag{1}$$

where $\mu$ and $\sigma^2$ are the mean and variance of the Gaussian distribution, respectively. When the mean $\mu = 0$ and the standard deviation $\sigma = 1$, it is called the standard Gaussian distribution:

$$f(x) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right) \tag{2}$$
From the definition of the Gaussian distribution it is easy to see that it has the following characteristics [5]:

1. Unimodality and symmetry: the distribution has a single peak, and its mode, median and mean coincide.
2. Universality: a significant proportion of the random variables encountered in real life follow, or approximately follow, the Gaussian distribution. Even for an arbitrary distribution (with finite variance), the distribution of the sample mean approaches the Gaussian distribution when the sample is large.
3. Simplicity: only two parameters $(\mu, \sigma^2)$ are needed to determine the shape of the entire distribution.
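The large-sample statement in item 2 is the central limit theorem. A minimal numerical sketch follows; the exponential parent distribution, the sample size of 50, the number of repetitions and the random seed are all illustrative assumptions, not choices taken from the paper. It compares the skewness of the raw draws with the skewness of the sample means.

```python
import numpy as np

def sample_skewness(values):
    """Moment-based skewness coefficient m3 / m2**1.5."""
    v = np.asarray(values, dtype=float)
    dev = v - v.mean()
    return np.mean(dev**3) / np.mean(dev**2) ** 1.5

rng = np.random.default_rng(0)                    # fixed seed, illustrative only
raw = rng.exponential(scale=1.0, size=(20000, 50))
means = raw.mean(axis=1)                          # 20000 sample means, each of n = 50

skew_raw = sample_skewness(raw.ravel())
skew_means = sample_skewness(means)
print(f"skewness of raw draws:    {skew_raw:.3f}")
print(f"skewness of sample means: {skew_means:.3f}")
```

The raw exponential draws are strongly asymmetric, while the sample means are much closer to symmetric, as the theorem suggests.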
Because the Gaussian distribution has so many good properties, it has become the most studied and most widely applied distribution. However, not all data conform to the Gaussian distribution, and in most cases the Gaussian distribution is only a good approximation. In fact, fatigue life data often do not fit the Gaussian distribution but fit the Weibull distribution better [4]; sometimes the fatigue life is treated as log-normally distributed, but this too is only an approximation. For this reason, the Weibull distribution needs to be introduced and studied in more depth.
# III. Brief Introduction of Weibull Distribution
There are various expressions for the Weibull distribution; a more general form is taken here [1], with the probability density function

$$f(x) = \frac{b}{\lambda}\left(\frac{x-x_0}{\lambda}\right)^{b-1}\exp\left[-\left(\frac{x-x_0}{\lambda}\right)^{b}\right] \tag{3}$$
where $b$ is the shape parameter, $\lambda$ is the scale parameter, and $x_0$ is the location parameter. In the field of fatigue it is customary to write the fatigue life $N$ instead of $x$ and $N_0$ instead of $x_0$, and to call $N_0$ the safe life. In a non-strict sense [1], "when $0 < b < 1$ resembles a power-law function, while $1 < b < 3$ is a left-skewed distribution, $3 < b < 4$ approximates a Gaussian distribution, and $b > 4$ is a right-skewed distribution". This is why the Weibull distribution is called a "full state distribution", as shown in Fig. 1 [5]:
Fig. 1: PDF of various three-parameter Weibull distributions when $x_0 = 0.5$
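The shape regimes quoted above can be checked numerically. The following is a minimal sketch, not taken from the paper, that evaluates the standard moment-based skewness coefficient of the Weibull distribution from Gamma functions; the function name `weibull_skewness` is hypothetical. The coefficient depends only on the shape parameter. A positive value means the long tail lies to the right and the peak to the left, which appears to be what the quoted passage calls "left-skewed"; this reading of the terminology is an assumption.

```python
from math import gamma

def weibull_skewness(b):
    """Skewness coefficient of a Weibull distribution with shape b.

    Uses the raw moments g_k = Gamma(1 + k/b); the location and the
    scale cancel out of the coefficient.
    """
    g1, g2, g3 = (gamma(1.0 + k / b) for k in (1, 2, 3))
    variance = g2 - g1**2
    return (g3 - 3.0 * g1 * g2 + 2.0 * g1**3) / variance**1.5

for b in (0.8, 1.0, 2.0, 3.6, 5.0):
    print(f"b = {b:3.1f}  skewness = {weibull_skewness(b):+.3f}")
```

The sign of the coefficient changes near $b = 3.6$, inside the interval $3 < b < 4$ that the quoted passage describes as approximately Gaussian.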
It is easy to show [1] that the reliability corresponding to a life $x_i$ is

$$p_i = \exp\left[-\left(\frac{x_i-x_0}{\lambda}\right)^{b}\right] \tag{4}$$

It can be seen that when $x = x_0$, $p_0 = 100\%$. This is the origin of the safe life at 100% reliability. If $p_{50} = 50\%$, the corresponding value of $X$ is called the median $x_m$, that is,

$$50\% = \exp\left[-\left(\frac{x_m-x_0}{\lambda}\right)^{b}\right] \tag{5}$$
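From the definitions, it is not difficult to obtain the expectation and variance of the three-parameter Weibull distribution [4]:

$$E(X) = x_0 + \lambda\,\Gamma\left(1+\frac{1}{b}\right) \tag{6}$$

$$\mathrm{Var}(X) = \lambda^2\left[\Gamma\left(1+\frac{2}{b}\right) - \Gamma^2\left(1+\frac{1}{b}\right)\right] \tag{7}$$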
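Equations (4) to (7) translate directly into code. A minimal sketch using only the Python standard library follows; the function names are hypothetical, and the parameter values in the usage lines are illustrative, not taken from the paper.

```python
from math import exp, gamma, log

def weibull_reliability(x, x0, b, lam):
    """Reliability from (4); equals 1 for any x at or below x0."""
    if x <= x0:
        return 1.0
    return exp(-(((x - x0) / lam) ** b))

def weibull_median(x0, b, lam):
    """Median obtained by solving (5): x_m = x0 + lam * (ln 2)**(1/b)."""
    return x0 + lam * log(2.0) ** (1.0 / b)

def weibull_mean(x0, b, lam):
    """Expectation from (6)."""
    return x0 + lam * gamma(1.0 + 1.0 / b)

def weibull_variance(b, lam):
    """Variance from (7); it does not depend on x0."""
    return lam**2 * (gamma(1.0 + 2.0 / b) - gamma(1.0 + 1.0 / b) ** 2)

x0, b, lam = 0.5, 2.0, 1.0            # illustrative values only
x_m = weibull_median(x0, b, lam)
print("reliability at x0:    ", weibull_reliability(x0, x0, b, lam))
print("reliability at median:", weibull_reliability(x_m, x0, b, lam))
print("mean, variance:       ", weibull_mean(x0, b, lam), weibull_variance(b, lam))
```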
In this way, given the fatigue life data, the three parameters of the Weibull distribution can be derived from (5), (6) and (7); this is the analytical method [4]. In addition to the analytical method, the maximum likelihood method and some methods derived from it [6], [7] have been used more recently, but they suffer from cumbersome derivations and inconvenient calculations, so they are not discussed in depth here.
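A minimal sketch of the analytical method follows, under stated assumptions. Solving (5) for the median gives $x_m = x_0 + \lambda(\ln 2)^{1/b}$; combining this with (6) and (7) eliminates $x_0$ and $\lambda$ and leaves one equation in $b$: the ratio (mean − median)/(standard deviation) must equal $[\Gamma(1+1/b) - (\ln 2)^{1/b}] / [\Gamma(1+2/b) - \Gamma^2(1+1/b)]^{1/2}$. The sketch solves this equation by plain bisection, chosen here for simplicity rather than because the source prescribes it, and assumes the root lies in the bracket $1 \le b \le 50$. The function names and the round-trip parameter values are hypothetical.

```python
from math import gamma, log, sqrt

LN2 = log(2.0)

def _ratio(b):
    """(mean - median) / std of a Weibull distribution with shape b."""
    g1 = gamma(1.0 + 1.0 / b)
    g2 = gamma(1.0 + 2.0 / b)
    return (g1 - LN2 ** (1.0 / b)) / sqrt(g2 - g1**2)

def analytical_fit(mean, median, std, b_lo=1.0, b_hi=50.0, iters=200):
    """Estimate (b, x0, lam) from the sample mean, median and std."""
    target = (mean - median) / std
    f_lo = _ratio(b_lo) - target
    f_hi = _ratio(b_hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in the assumed bracket for b")
    for _ in range(iters):                    # plain bisection on b
        b_mid = 0.5 * (b_lo + b_hi)
        f_mid = _ratio(b_mid) - target
        if f_lo * f_mid <= 0:
            b_hi = b_mid
        else:
            b_lo, f_lo = b_mid, f_mid
    b = 0.5 * (b_lo + b_hi)
    g1 = gamma(1.0 + 1.0 / b)
    g2 = gamma(1.0 + 2.0 / b)
    lam = std / sqrt(g2 - g1**2)              # scale from (7)
    x0 = mean - lam * g1                      # location from (6)
    return b, x0, lam

# Round trip: build exact moments from illustrative parameters, then recover them.
b_true, x0_true, lam_true = 2.0, 100.0, 30.0
mean = x0_true + lam_true * gamma(1.0 + 1.0 / b_true)
median = x0_true + lam_true * LN2 ** (1.0 / b_true)
std = lam_true * sqrt(gamma(1.0 + 2.0 / b_true) - gamma(1.0 + 1.0 / b_true) ** 2)
b_fit, x0_fit, lam_fit = analytical_fit(mean, median, std)
print(b_fit, x0_fit, lam_fit)
```

Nothing in this procedure constrains the estimated location to lie below the smallest observation, which is how an estimate larger than the minimum life can arise with real samples.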
# IV. Origin of Z.T. GAO Method and Fitting Standard
In theory, if a set of fatigue life data $N$ is given, the median $N_m$, mean $N_{av}$ and standard deviation $s$ of the data can be inserted into the three equations (5), (6) and (7) to solve for estimates of the three parameters of the Weibull distribution. For convenience, (5), (6) and (7) can be reduced to a single transcendental equation in $b$ [1]:

$$(N_{av}-N_m)\left[\Gamma\left(1+\frac{2}{b}\right)-\Gamma^2\left(1+\frac{1}{b}\right)\right]^{1/2} + s\left[D^{1/b}-\Gamma\left(1+\frac{1}{b}\right)\right] = 0 \tag{8}$$

where $D = \ln 2$. This equation can be solved by Newton's method, and once $b$ has been obtained, $\lambda$ and $N_0$ follow from (7) and (6).

Example 1: The data in Table 8-2 of [4] (Table 1, a set of fatigue life data in $10^3$ cycles) are used to find the three parameters of the Weibull distribution by the analytical method. Python code gives the parameter estimates $b = 1.221$, $N_0 = 127$, $\lambda = 22.46$.

The value $N_0 = 127$ derived from the analytical method is greater than the minimum value, 124, of this group of fatigue lives, which contradicts the definition of the safe life $N_0$. That is, the problem of inconsistency occurs. A second question is what happens if this set of data is fitted with a Gaussian distribution, that is, which distribution is the more appropriate fit?

The second problem can be judged by the magnitude of the coefficient of determination $R^2$ [8] obtained when the fitted reliability is compared with the ideal reliability based on the so-called "average rank" [4]. The ideal reliability is given by the following formula, which is independent of the specific distribution:

$$p_i = 1 - \frac{i}{n+1} \tag{9}$$

where $i$ is the rank of the data sorted from smallest to largest and $n$ is the number of data.

The first problem is solved by the Z.T. Gao method [1]. The basic idea of the method is briefly described below. Taking the logarithm of both sides of (4) twice yields

$$\ln\ln\frac{1}{p_i} = b\ln(N_i - N_0) - b\ln\lambda \tag{10}$$
If we set

$$Y_i = \ln\ln\frac{1}{p_i}, \qquad X_i = \ln(N_i - N_0) \tag{11}$$

$$d = -b\ln\lambda, \qquad \lambda = \exp\left(-\frac{d}{b}\right) \tag{12}$$

then (10) becomes

$$Y_i = bX_i + d \tag{13}$$
This is a linear regression equation whose coefficients $b$ and $d$ can be obtained by the least squares method. It is important to note, however, that $X_i$ depends not only on the given data $N$ but also on the unknown safe life $N_0$ of the Weibull distribution. This problem can be solved by finding the extremum of the absolute value of the correlation coefficient $r$ of the regression line and taking the corresponding $N_0$, but the mathematical derivation of this approach is complex and error-prone [9]. A simpler idea is to use Python to compute $r$ directly as a function of $N_0$ over the interval $0 \le N_0 < N_{min}$ (where $N_{min}$ is the minimum value of the given data), select the $N_0$ that maximizes $|r|$, and determine $b$ and $\lambda$ at the same time. This is known as the Z.T. Gao algorithm, abbreviated as the Z.T. Gao method [1], [5] or GZT method.
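The search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the grid resolution `n_grid`, the function name `gzt_fit`, and the use of `numpy.polyfit` for the least-squares line are choices made here. The ideal reliability (9) supplies $p_i$, and the demonstration data are the 20 small-sample fatigue lives listed for Example 4.

```python
import numpy as np

def gzt_fit(data, n_grid=2000):
    """Grid search for N0 in [0, N_min) maximizing |r| of the line (13)."""
    n_sorted = np.sort(np.asarray(data, dtype=float))
    n = n_sorted.size
    ranks = np.arange(1, n + 1)
    p = 1.0 - ranks / (n + 1.0)              # ideal reliability, eq. (9)
    y = np.log(np.log(1.0 / p))              # Y_i in eq. (11)

    best_r, best_n0 = 0.0, 0.0
    # endpoint=False keeps N0 strictly below the smallest observed life
    for n0 in np.linspace(0.0, n_sorted[0], n_grid, endpoint=False):
        x = np.log(n_sorted - n0)            # X_i in eq. (11)
        r = np.corrcoef(x, y)[0, 1]
        if abs(r) > abs(best_r):
            best_r, best_n0 = r, n0

    x = np.log(n_sorted - best_n0)
    b, d = np.polyfit(x, y, 1)               # slope b and intercept d of (13)
    lam = np.exp(-d / b)                     # eq. (12)
    return best_n0, b, lam, best_r

lives = [3.5, 3.8, 4.0, 4.3, 4.5, 4.7, 4.8, 5.0, 5.2, 5.4,
         5.5, 5.7, 6.0, 6.1, 6.3, 6.5, 6.7, 7.3, 7.7, 8.4]   # 10^5 cycles
n0, b, lam, r = gzt_fit(lives)
print(f"N0 = {n0:.3f}, b = {b:.3f}, lambda = {lam:.3f}, r = {r:.5f}")
```

Because the grid never reaches the smallest observation, the estimated safe life cannot exceed the minimum life in the data. A finer grid or a one-dimensional optimizer could replace the plain scan.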
Example 2: Using the data of Example 1, the three parameters of the Weibull distribution are now determined by the GZT method, and the results are compared with the Gaussian distribution. Fig. 2 shows graphically how the GZT method finds the safe life that maximizes the correlation coefficient. Since the search is restricted from the outset to values of $N_0$ below the minimum life in the data, the inconsistency cannot arise. Furthermore, fitting the ideal reliability (9) with the Gaussian distribution and with the Weibull distribution estimated by the GZT method gives the following coefficients of determination:
Coefficient of determination obtained by fitting the Weibull distribution = 0.97999

Coefficient of determination obtained by fitting the Gaussian distribution = 0.95044
It can be seen that the coefficient of determination of the Weibull distribution obtained by the GZT method is greater than that of the Gaussian distribution. In this sense, the data are described more realistically by the Weibull distribution.
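A comparison of this kind can be sketched as follows. The formula for $R^2$ is not spelled out in the text, so the usual definition $1 - SS_{res}/SS_{tot}$ is assumed here; the function names are hypothetical, and the Weibull parameters passed to `weibull_reliability` would come from a prior fit. Only the Gaussian side is evaluated in the demonstration, using the 20 small-sample fatigue lives listed for Example 4 with the sample mean and sample standard deviation as parameter estimates.

```python
import math
import numpy as np

def ideal_reliability(n):
    """Average-rank reliability p_i = 1 - i/(n + 1), eq. (9)."""
    return 1.0 - np.arange(1, n + 1) / (n + 1.0)

def gaussian_reliability(x, mu, sigma):
    """P(X > x) for a Gaussian distribution, via the complementary error function."""
    z = (np.asarray(x, dtype=float) - mu) / (sigma * math.sqrt(2.0))
    return np.array([0.5 * math.erfc(v) for v in z])

def weibull_reliability(x, n0, b, lam):
    """Eq. (4); lives at or below n0 map to reliability 1."""
    z = np.clip((np.asarray(x, dtype=float) - n0) / lam, 0.0, None)
    return np.exp(-(z**b))

def r_squared(p_ideal, p_model):
    """Assumed definition: 1 - SS_res / SS_tot."""
    ss_res = np.sum((p_ideal - p_model) ** 2)
    ss_tot = np.sum((p_ideal - np.mean(p_ideal)) ** 2)
    return 1.0 - ss_res / ss_tot

lives = np.sort([3.5, 3.8, 4.0, 4.3, 4.5, 4.7, 4.8, 5.0, 5.2, 5.4,
                 5.5, 5.7, 6.0, 6.1, 6.3, 6.5, 6.7, 7.3, 7.7, 8.4])
p_ideal = ideal_reliability(lives.size)
p_gauss = gaussian_reliability(lives, lives.mean(), lives.std(ddof=1))
r2_gauss = r_squared(p_ideal, p_gauss)
print(f"R^2 (Gaussian) = {r2_gauss:.4f}")
```

Calling `r_squared(p_ideal, weibull_reliability(lives, n0, b, lam))` with fitted values of `n0`, `b` and `lam` gives the Weibull counterpart for the same data.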
The advantages of the GZT method are that its physical meaning is very intuitive and that the problem of "inconsistency" does not arise. The method is convenient not only for estimating the three parameters of the Weibull distribution but also for determining whether the original data are fitted better by the Weibull distribution or by the Gaussian distribution. It is also easy to extend to similar problems, such as fitting fatigue performance curves with three parameters [1]; the confidence intervals of these three parameters are discussed in separate papers [10], [11].
# V. Problems of Logarithmization of Original Data
Because of the complexity of the Weibull distribution, when the original data are not very consistent with the Gaussian distribution, their logarithm is often taken. From a mathematical point of view this is a transformation of variables; because the data are "compressed", they may come closer to the Gaussian distribution [4]. The advantage is that the PDF of the log-transformed data is fitted quite well by the Gaussian distribution, which is more convenient to study and apply. However, the physical meaning of the safe life is lost, and the density distribution of the original data is "distorted". This is illustrated in the following two examples.

Example 3: Using the 100 fatigue life data (large sample) of a structure from Table 12-3 of [1], p. 253, the Python code gives the parameter tables (Tables 2 and 3) and histograms (Figs. 3 and 4). As seen in Fig. 3, the histogram of the original data is asymmetric and left-skewed, and fitting it with a Gaussian distribution is less appropriate, as is in fact demonstrated by the chi-square test [4]. Here it is more appropriate to use the Weibull distribution. Looking at the logarithm of the data, Fig. 4 shows that the data do appear symmetric, and the Gaussian distribution is indeed a good fit. The problem is that the left-skewed character of the fatigue life PDF is lost, and with it the physical meaning of the safe life. Even if the results obtained in the logarithmic case are transformed "back" to the original scale, only the median is "recovered" (see Table 2, row 3); the mean is shifted to the left, although the correlation coefficient and the coefficient of determination are improved. It is still not possible to obtain a safe life at 100% reliability. In contrast, the Weibull distribution fits fairly well, as seen in Table 3. Even after taking the logarithm, its fit is almost the same as that of the Gaussian distribution.

From the Weibull distribution parameters in row 2 of Table 3, together with (6) and (7), we can calculate $\hat{\mu} = 5.7137$ and $\hat{\sigma} = 0.1005$. This result is almost the same as the data in row 2 of Table 2. In this sense the Weibull distribution is indeed more general than the Gaussian distribution, which can be seen as a first-order approximation to it. Fitting this set of fatigue life data with the Weibull distribution does not require taking the logarithm of the data at all, and the physical meaning of each parameter is very clear.

Example 4: Consider now a small sample, the 20 life data of a structure from Table 8-4 on p. 136 of [2]. The Python code again gives a parameter table and histograms (Figs. 5 and 6). As seen in Figs. 5 and 6, as in the large-sample case, the original data are left-skewed and appear symmetric after taking the logarithm. However, if the Weibull distribution is fitted, there is no need to take the logarithm of the original data. Even when the logarithm is taken and the data look more symmetric, the Weibull distribution fits no worse than the Gaussian distribution. In this sense, fitting with the Weibull distribution is possible even for symmetric data.
The main difficulty in fitting the Weibull distribution has been the estimation of its three parameters, but the GZT method now removes this obstacle.
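The symmetrizing effect of the logarithm described in this section can be checked on the 20 small-sample lives of Example 4. The sketch below is illustrative only: it uses the moment-based sample skewness coefficient, which is not a statistic reported in the paper, and the function name is hypothetical.

```python
import numpy as np

def sample_skewness(values):
    """Moment-based skewness coefficient m3 / m2**1.5."""
    v = np.asarray(values, dtype=float)
    dev = v - v.mean()
    return np.mean(dev**3) / np.mean(dev**2) ** 1.5

lives = np.array([3.5, 3.8, 4.0, 4.3, 4.5, 4.7, 4.8, 5.0, 5.2, 5.4,
                  5.5, 5.7, 6.0, 6.1, 6.3, 6.5, 6.7, 7.3, 7.7, 8.4])  # 10^5 cycles
skew_raw = sample_skewness(lives)
skew_log = sample_skewness(np.log10(lives))
print(f"skewness of original data:    {skew_raw:+.3f}")
print(f"skewness of log10 of the data: {skew_log:+.3f}")
```

A near-zero skewness after the transform is consistent with the good Gaussian fit of the logarithm. A Gaussian model for the logarithm of the life, however, assigns a nonzero failure probability to every positive life, so no finite safe life with 100% reliability exists under it.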
# VI. Conclusion
1. The three-parameter Weibull distribution is a more general, full state distribution than the Gaussian distribution. In the field of reliability, the physical meaning of its location parameter is particularly important: it is the safe life at 100% reliability.
2. Because of the complexity of the three-parameter Weibull distribution, previous methods for determining its three parameters from test data are complicated. The graphical method is error-prone and inconvenient to use, and the analytical method may give inconsistent results; the GZT method makes full use of the advantages of Python and solves this problem better.
3. In the past, fatigue life data that were not fitted well by the Gaussian distribution were log-transformed so that they might be more consistent with it, but as a result the safe life at 100% reliability no longer exists. In fact the data themselves are more consistent with the Weibull distribution. Since the Weibull distribution is a full state distribution, it is generally unnecessary to take the logarithm of the fatigue life; fitting the fatigue life data directly with the three-parameter Weibull distribution gives a better fit.
4. The two parameters of the Gaussian distribution (mean and variance) are not very meaningful for asymmetric data, whereas for asymmetric data such as structural fatigue life the three parameters of the Weibull distribution (safe life, shape and scale parameters) are much more meaningful, and in a sense these three parameters "contain" the two parameters of the Gaussian distribution. This is probably why the Weibull distribution can "contain" the Gaussian distribution.
5. Finally, it can be concluded that for asymmetric fatigue life data it is not necessary to take logarithms and fit a Gaussian distribution; the data can be fitted directly with the three-parameter Weibull distribution. Furthermore, even for more symmetric data, it is better to fit directly with the three-parameter Weibull distribution.
![Global Journal of Researches in Engineering, Volume XXIII, Issue I, Version I, Year 2023. © 2023 Global Journals](image-2.png "Global")
![From Gaussian Distribution to Weibull Distribution](image-3.png "I")
![Example 1: The data in Table 8-2 of [4] are used to find the three parameters of the Weibull distribution by the analytical method.](image-4.png "Example 1")
![Table 1: A set of fatigue life data (10^3 cycles). Parameter estimates obtained with Python code: b = 1.221, N0 = 127, λ = 22.46. Taking the logarithm of both sides of (4) twice yields ln(ln(1/p_i)) = b ln(N_i - N0) - b ln(λ) (10).](image-5.png "Table 1")
![Fig. 2: Schematic graph of Z.T. Gao method](image-6.png "Fig. 2")
![Example 3: Using the 100 fatigue life data (large sample) of a structure from Table 12-3 of [1], p. 253, the Python code gives: Fatigue life (original data) N = [3.08, 3.26, 3.32, 3.48, 3.49, 3.56, 3.69, 3.7, 3.78, 3.79, 3.8, 3.87, 3.95, 4.07, 4.08, 4.1, 4.12, 4.2, 4.24, 4.25, 4.28, 4.31, 4.31, 4.36, 4.54, 4.58, 4.6, 4.62, 4.63, 4.65, 4.67, 4.67, 4.72, 4.73, 4.75, 4.77, 4.8, 4.82, 4.84, 4.9, 4.92, 4.93, 4.95, 4.96, 4.98, 4.99, 5.02, 5.03, 5.06, 5.08, 5.06, 5.1, 5.12, 5.15, 5.18, 5.2, 5.22, 5.38, 5.41, 5.46, 5.47, 5.53, 5.56, 5.6, 5.61, 5.63, 5.64, 5.65, 5.68, 5.69, 5.73, 5.82, 5.86, 5.91, 5.94, 5.95, 5.99, 6.04, 6.08, 6.13, 6.16, 6.19, 6.21, 6.26, 6.32, 6.33, 6.36, 6.41, 6.46, 6.81, 7.0, 7.35, 7.82, 7.88, 7.96, 8.31, 8.45, 8.47, 8.79, 9.87] (10^5 cycles). Also the following parameter table and histogram can be obtained.](image-7.png "Example 3")
![Fig. 3: Histogram of original data (large sample) and fitting diagram of Gaussian and Weibull distribution](image-8.png "Fig. 3, Fig. 4")
![Example 4: 20 life data of a structure from Table 8-4 on p. 136 of [2], obtained by Python code as follows. Fatigue life (raw data) N = [3.5, 3.8, 4.0, 4.3, 4.5, 4.7, 4.8, 5.0, 5.2, 5.4, 5.5, 5.7, 6.0, 6.1, 6.3, 6.5, 6.7, 7.3, 7.7, 8.4] (10^5 cycles)](image-9.png "Example 4")
![Fig. 5: Histogram of original data (small sample) and fitting diagram of Gaussian and Weibull distribution](image-10.png "Fig. 5")
![Fig. 6: Histogram after logarithm of the original data (small sample) and fitting diagram of Gaussian and Weibull distribution](image-11.png "Fig. 6")
Table 2 (where s is the sample standard deviation)

Table 3

Table 4 (where s is the sample standard deviation)

Table 5
## Acknowledgments
We thank Mr. Wan Weihao for his support to this paper and related research work.
## References

[1] J. J. Xu, "Gao Zhentong Method in the Fatigue Statistics Intelligence," Journal of Beijing University of Aeronautics and Astronautics, vol. 47, no. 10, 2021, doi: 10.13700/j.bh.1001-5965.2020.0373 (in Chinese).

[2] W. Weibull, "A statistical distribution function of wide applicability," Journal of Applied Mechanics, vol. 28, no. 4, 1951.

[3] A. J. Hallinan, "A review of the Weibull distribution," vol. 25, 1993.

[4] Z. Gao, Fatigue Applied Statistics. National Defense Industry Press, 1986 (in Chinese).

[5] Z. T. Gao and J. J. Xu, Intelligent Fatigue Statistics. Beijing: Beihang Publishing House, 2022 (in Chinese).

[6] F. Yang, H. Ren, and Z. Hu, "Maximum Likelihood Estimation for Three-Parameter Weibull Distribution Using Evolutionary Strategy," Mathematical Problems in Engineering, Article ID 6281781, 8 pages, 2019, doi: 10.1155/2019/6281781.

[7] M. Teimouri and A. K. Gupta, "On the Three-Parameter Weibull Distribution Shape Parameter Estimation," Journal of Data Science, vol. 11, 2013.

[8] K. Trivedi, Probability and Statistics with Reliability, Queuing, and Computer Science Applications. Beijing: Electronic Industry Press, 2015 (in Chinese).

[9] H. M. Fu and Z. T. Gao, "An optimization method of correlation coefficient for determining a three-parameters Weibull distribution," Acta Aeronautica et Astronautica Sinica, vol. 11, no. 7, 1990 (in Chinese).

[10] J. J. Xu and Z. T. Gao, "Further research on fatigue statistics intelligence," Acta Aeronautica et Astronautica Sinica, vol. 43, no. 8, 225138, 2022, doi: 10.7527/S1000-6829 (in Chinese).

[11] J. Xu, "Digital Experiment for Estimating Three Parameters and Their Confidence Intervals of Weibull Distribution," International Journal of Science, Technology and Society, vol. 10, no. 2, 2022, doi: 10.11648/j.ijsts.20221002.16.