Statistical Mechanics in Bayesian Representation

Statistical Mechanics in Bayesian Representation: How It Might Work and What Ought to be the Probability Distribution Behind It

E. B. Starikov*),**)

*) Institute for Theoretical Solid-State Physics, KIT, Wolfgang-Gaede-Str. 1, D-76131 Karlsruhe, Germany; E-mail: [email protected]

**) Department of Physical Chemistry, Chalmers University of Technology, SE-41296 Gothenburg, Sweden.

La statistique n'est qu'une addition correcte de chiffres faux. (Statistics is but the correct addition of false figures.) Charles-Maurice de Talleyrand-Périgord

Abstract

Statistical mechanics is clearly restricted by its fuzzy foundation, namely its assumption of a 'large number of microscopic particles'. This communication tries to show and discuss a handy and useful way of circumventing this restriction.

Introduction

In our most recent review paper devoted to the ultimately true interpretation of the ideas of N. L. S. Carnot [1], we posed the following question regarding the statistical interpretation of thermodynamics: was it a plain success or a sheer despair? To our mind, the correct answer would be: both. Why? Indeed, one of the two basic laws of thermodynamics (the 1st one) cannot be considered a kind of statistical regularity, whereas the other one (the 2nd one) can nonetheless be made to obey statistical regularities. How, then, could this fact be combined with the clear dialectical interrelationship between the two laws in question (cf. the work [2] for a more detailed discussion of this important topic)? The first (to the best of our knowledge) interesting suggestion of a possible way out of the above-mentioned logical blind alley came from Johannes Diderik van der Waals, as early as 1911 [3]. In his short note, Prof. Dr. van der Waals discussed the interrelationship between the probability and entropy notions, as it appeared from the considerations of L. Boltzmann and J. W. Gibbs.

Specifically, what Prof. Dr. van der Waals put forward was by no means a mere speculation, but a clearly formulated, substantiated, justified and fully competent suggestion to employ the Bayesian approach in deriving the relationship between the entropy and probability notions. The only serious and effectual obstacle on the way to the practical embodiment of that ingenious suggestion by van der Waals was the fact that the Bayesian approach to the probability notion was not really 'trendy', not only in his time but roughly until after the Second World War: it had to succumb to the 'frequentist' train of thought... Still, fortunately, van der Waals' valuable suggestion did not simply dissolve tracelessly in the whirlpool of the 'trendy' medium: it was employed in the work of George Augustus Linhart [4-7] and, moreover, is being further developed at present (cf. the work by Preety Aneja and Ramandeep S. Johal [8], as well as all the recent work from this very active and successful group [9-13], and the references therein). Meanwhile, the 'frequentist' statistical-physical approach is based upon the long-known and universally accepted atomistic representation of Matter, which forces us to operate with an "extremely large number of atoms/molecules", so that we seemingly have no reasonable way other than to apply the conventional statistical treatment... With this in mind, it is extremely important to note that the very possibility of avoiding the conventional explicit microscopic consideration could be rather helpful, for notions like "extremely large number of atoms/molecules" are in fact FUZZY, in accordance with the well-known SORITES paradox.
Indeed, it is practically difficult to achieve a universally strict definition of what exactly a "large number" is; hence, operating with such notions and their derivatives, like the "thermodynamic limit", is not really a productive approach. Thus, the aim of the present communication is to demonstrate how the Bayesian approach might be handy and useful in circumventing the SORITES paradox, as well as to show the way of establishing a mathematical/logical interconnection between its results and the well-known, tried and true products of the 'frequentist' train of thought.

1. Entropy considerations

First of all, we start with the expression for entropy formally derived by G. A. Linhart [4-7], which is in fact nothing more than a temperature-dependent expression for the conventional Boltzmann entropy:

S = \frac{C_\infty}{K}\,\ln\!\left(1 + x^K\right); \qquad x \equiv \frac{T}{T_{ref}}. \qquad (1)
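As a quick numerical illustration of Eq. 1, the following sketch (with illustrative, assumed values of C∞, K and T_ref, not taken from the paper) checks the thermodynamic identity dS/dT = C_V(T)/T, with the heat capacity C_V = C∞ x^K/(1 + x^K) implied by Linhart's treatment [4-7]:

```python
from math import log, isclose

# illustrative, assumed constants (not from the paper)
C_inf, K, T_ref = 1.0, 2.0, 1.0

def S(T):
    """Eq. 1: Linhart's temperature-dependent entropy."""
    x = T / T_ref
    return (C_inf / K) * log(1.0 + x**K)

def C_V(T):
    """Linhart-type heat capacity, C_V = C_inf * x^K / (1 + x^K)."""
    x = T / T_ref
    return C_inf * x**K / (1.0 + x**K)

# thermodynamic consistency check: dS/dT should equal C_V(T)/T
T, h = 1.3, 1e-6
dS_dT = (S(T + h) - S(T - h)) / (2.0 * h)
print(isclose(dS_dT, C_V(T) / T, rel_tol=1e-6))  # True
```

The check holds for any positive K, since dS/dT = C∞ x^{K-1}/(T_ref (1 + x^K)) by direct differentiation of Eq. 1.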

Now, we employ the reasoning used for Eq. 2 by Aneja-Johal [8], namely that the difference between the 'final' and 'initial' entropies is equal to zero. With this in mind, and making use of Eq. 1 here, we get the following interdependence between the temperatures T1 and T2 of the two interacting systems (T1 and T2 here are the relative temperatures, divided by T_ref, like x in Eq. 1 above, whereas T_P ≡ T_+ and T_M ≡ T_-, both likewise divided by T_ref):

T_2 = \left(\frac{-T_1^K + T_P^K + T_M^K\left(1 + T_P^K\right)}{1 + T_1^K}\right)^{\frac{1}{K}}. \qquad (2)

Thus, Eq. 2 here casts T2 as a function of T1, and vice versa. Hence, we differentiate T2 with respect to T1 and simplify the resulting derivative, to get the following formula:

\frac{dT_2}{dT_1} = -\,\frac{T_1^{K-1}\left(1 + T_M^K\right)\left(1 + T_P^K\right)}{\left(1 + T_1^K\right)^2}\left(\frac{-T_1^K + T_P^K + T_M^K\left(1 + T_P^K\right)}{1 + T_1^K}\right)^{\frac{1}{K}-1}. \qquad (3)
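Eqs. 2 and 3 can be spot-checked numerically. The sketch below (with assumed, illustrative reduced temperatures T_P and T_M) verifies the entropy balance behind Eq. 2, i.e. (1+T1^K)(1+T2^K) = (1+T_P^K)(1+T_M^K), and compares the closed-form derivative with a finite difference:

```python
from math import isclose

# illustrative, assumed reduced temperatures (divided by T_ref)
K, TP, TM = 2.0, 1.5, 0.5

def T2_of_T1(T1):
    """Eq. 2, i.e. the entropy balance (1+T1^K)(1+T2^K) = (1+TP^K)(1+TM^K)."""
    return ((-T1**K + TP**K + TM**K * (1.0 + TP**K)) / (1.0 + T1**K))**(1.0 / K)

def dT2_dT1(T1):
    """Eq. 3, in the equivalent form -T1^(K-1)(1+TM^K)(1+TP^K)/((1+T1^K)^2 T2^(K-1))."""
    T2 = T2_of_T1(T1)
    return -(T1**(K - 1) * (1.0 + TM**K) * (1.0 + TP**K)) / ((1.0 + T1**K)**2 * T2**(K - 1))

T1, h = 0.8, 1e-6
balance_lhs = (1.0 + T1**K) * (1.0 + T2_of_T1(T1)**K)
balance_rhs = (1.0 + TP**K) * (1.0 + TM**K)
numeric = (T2_of_T1(T1 + h) - T2_of_T1(T1 - h)) / (2.0 * h)
print(isclose(numeric, dT2_dT1(T1), rel_tol=1e-4))  # True
```

The derivative is negative, as it must be: raising T1 lowers T2 when the product (1+T1^K)(1+T2^K) is held fixed.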

Now, we can rearrange Eq. 3 a little bit in such a way, that we get a product of two functions, F(T1) and G(T2), to apply formula (1) by Aneja-Johal [8].

\left(1 + T_M^K\right)\left(1 + T_P^K\right) \equiv 1 + T_P^K + T_M^K\left(1 + T_P^K\right) + T_1^K - T_1^K =

= \left(1 + T_1^K\right)\left(1 + \frac{-T_1^K + T_P^K + T_M^K\left(1 + T_P^K\right)}{1 + T_1^K}\right) = \left(1 + T_1^K\right)\left(1 + T_2^K\right). \qquad (4)

That is, our Eq. 3 can be recast as follows:

\frac{T_2^{K-1}\,dT_2}{1 + T_2^K} = -\,\frac{T_1^{K-1}\,dT_1}{1 + T_1^K}. \qquad (5)
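Eq. 5 identifies a conserved differential form along the constraint of Eq. 2. A hypothetical spot check (assumed illustrative values of K, T_P, T_M, not from the paper) that f(T2) dT2 = -f(T1) dT1 with f(T) = T^(K-1)/(1+T^K):

```python
from math import isclose

# illustrative, assumed reduced temperatures (divided by T_ref)
K, TP, TM = 2.0, 1.5, 0.5

def f(T):
    """The density appearing on both sides of Eq. 5: T^(K-1)/(1+T^K)."""
    return T**(K - 1) / (1.0 + T**K)

def T2_of_T1(T1):
    """Eq. 2."""
    return ((-T1**K + TP**K + TM**K * (1.0 + TP**K)) / (1.0 + T1**K))**(1.0 / K)

# finite-difference slope dT2/dT1, then check f(T2) * dT2 = -f(T1) * dT1
T1, h = 0.8, 1e-6
dT2_over_dT1 = (T2_of_T1(T1 + h) - T2_of_T1(T1 - h)) / (2.0 * h)
print(isclose(f(T2_of_T1(T1)) * dT2_over_dT1, -f(T1), rel_tol=1e-4))  # True
```

This same density f(T) is what is identified as the (unnormalized) prior in the next step.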

And, if we assume that dT_2 = -dT_1, then the desired prior, as a function of temperature, Pr(T), is unique for all the subsystems under study, and (in G. A. Linhart's representation!) reads, in effect, as the specific heat capacity at constant volume divided by the temperature:

Pr(T) = \frac{T^{K-1}}{1 + T^K} = \frac{C_V(T)}{T}. \qquad (6)

Interestingly, when the expectation (average) temperature is calculated using the latter form of the prior, we get exactly the same form of the Gaussian hypergeometric function 2F1 as we obtain when calculating the internal energy within G. A. Linhart's approach (see Eq. 8 in the work [14])... Thus, we readily see that the Aneja-Johal approach fits well into the general framework of G. A. Linhart's 'Bayesian' statistical mechanics... But there is also a clear technical (mathematical) complication: to work out closed formulas for the work, the efficiencies etc., we have to approximate the special transcendental functions involved... But, to our mind, this should not constitute any principal complication...

2. Work, expectation temperature, etc. Internal energy considerations

To find the expression for the work done, we start with the formulation by Aneja-Johal [8]. Thus, we can extract the work (W) in some process involving both systems in question, which is equal to the decrease in internal energy (U) of the total system, W = -\Delta U, where \Delta U = U_{fin} - U_{ini}.

Then, using Linhart's expression for the internal energy (see the work [14], Eq. 8), we get the following formula for the work:

W = C_\infty T_{ref}\left[x_P\left(1 - {}_2F_1\!\left(1, \tfrac{1}{K}, 1 + \tfrac{1}{K}; -x_P^K\right)\right) + x_M\left(1 - {}_2F_1\!\left(1, \tfrac{1}{K}, 1 + \tfrac{1}{K}; -x_M^K\right)\right) - x_1\left(1 - {}_2F_1\!\left(1, \tfrac{1}{K}, 1 + \tfrac{1}{K}; -x_1^K\right)\right) - x_2\left(1 - {}_2F_1\!\left(1, \tfrac{1}{K}, 1 + \tfrac{1}{K}; -x_2^K\right)\right)\right], \qquad (7)

where x_P \equiv \frac{T_P}{T_{ref}}; \quad x_M \equiv \frac{T_M}{T_{ref}}; \quad x_1 \equiv \frac{T_1}{T_{ref}}; \quad x_2 \equiv \frac{T_2}{T_{ref}}.
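Each bracketed term of Eq. 7 is a reduced internal energy u(x) = x(1 - 2F1(1, 1/K, 1+1/K; -x^K)), consistent with u(x) = ∫₀ˣ tᴷ/(1+tᴷ) dt. A sketch checking this (assumed value of K; a homemade truncated Gauss series, valid only for |x| < 1, standing in for a library 2F1):

```python
from math import isclose

K = 2.0  # illustrative, assumed value

def hyp2f1(a, b, c, z, terms=200):
    """Truncated Gauss hypergeometric series; converges only for |z| < 1."""
    s, term = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        s += term
    return s

def u(x):
    """One bracketed term of Eq. 7, in units of C_inf * T_ref."""
    return x * (1.0 - hyp2f1(1.0, 1.0 / K, 1.0 + 1.0 / K, -x**K))

def u_quad(x, n=200000):
    """Midpoint quadrature of u = integral_0^x t^K/(1+t^K) dt."""
    h = x / n
    return h * sum(((i + 0.5) * h)**K / (1.0 + ((i + 0.5) * h)**K) for i in range(n))

print(isclose(u(0.5), u_quad(0.5), rel_tol=1e-6))  # True
```

For K = 2 this integral reduces to x - arctan(x), which offers an independent cross-check.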

We see that Eq. 7 is formulated in terms of a special function, namely the Gauss hypergeometric function 2F1, which considerably complicates its practical usage... Thus, we need to simplify Eq. 7. To this end, first, we use one of the Pfaff transformations:

{}_2F_1\!\left(1, \frac{1}{K}, 1 + \frac{1}{K}; -z^K\right) = \left(1 + z^K\right)^{-\frac{1}{K}} {}_2F_1\!\left(\frac{1}{K}, \frac{1}{K}, 1 + \frac{1}{K}; \frac{z^K}{1 + z^K}\right); \qquad (8)

and, on the other hand, we employ the well-known interconnection between the Gauss hypergeometric and incomplete Beta functions:

p\,y^{-p}\,B_y(p, q) = {}_2F_1(p, 1 - q, 1 + p; y) \;\Rightarrow\; \frac{1}{K}\left(\frac{z^K}{1 + z^K}\right)^{-\frac{1}{K}} B_{\frac{z^K}{1+z^K}}\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right) \equiv {}_2F_1\!\left(\frac{1}{K}, \frac{1}{K}, 1 + \frac{1}{K}; \frac{z^K}{1 + z^K}\right). \qquad (9)

And now, if we recall the definition of the incomplete Beta function, we can cast it as follows:

B_y(a, b) = \int_0^y t^{a-1} (1 - t)^{b-1}\,dt; \qquad a > 0; \; b > 0. \qquad (10)

Then, it is possible to recast the right-hand part of Eq. 8, using Eq. 10, in the following way:

{}_2F_1\!\left(1, \frac{1}{K}, 1 + \frac{1}{K}; -z^K\right) = \left(1 + z^K\right)^{-\frac{1}{K}} \frac{1}{K}\left(\frac{z^K}{1 + z^K}\right)^{-\frac{1}{K}} B_{\frac{z^K}{1+z^K}}\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right) \equiv \frac{z^{-1}}{K}\,B_{\frac{z^K}{1+z^K}}\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right). \qquad (11)

All these results can be properly checked using the software Mathematica … Therefore, it is in principle possible to express the Gauss hypergeometric functions in Eq. 7 via regularized incomplete Beta functions, as soon as we notice that:

I_y(a, b) = \frac{B_y(a, b)}{B_1(a, b)}, \qquad (12)

where I_y is the regularized incomplete Beta function and B_1 the conventional complete Beta function. Then, taking Eq. 12 into account, it is in general possible to rewrite the right-hand side of Eq. 11 as follows:

{}_2F_1\!\left(1, \frac{1}{K}, 1 + \frac{1}{K}; -z^K\right) \equiv \frac{z^{-1}}{K}\,B_{\frac{z^K}{1+z^K}}\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right) \equiv \frac{z^{-1}}{K}\,B_1\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right) I_{\frac{z^K}{1+z^K}}\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right) \equiv \frac{z^{-1}}{K}\,\pi\csc\!\left(\frac{\pi}{K}\right) I_{\frac{z^K}{1+z^K}}\!\left(\frac{1}{K}, 1 - \frac{1}{K}\right), \qquad (13)
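The chain of Eqs. 8-13 can be spot-checked numerically. The sketch below (assumed values of K and z; a homemade truncated series for 2F1 and a quadrature for the incomplete Beta, with the substitution t = s^K removing the integrable singularity at t = 0) verifies Eq. 11 together with the identity B_1(1/K, 1-1/K) = π csc(π/K):

```python
from math import pi, sin, gamma, isclose

K, z = 3.0, 0.8  # illustrative, assumed values (|z| < 1 needed for the series)

def hyp2f1(a, b, c, zz, terms=400):
    """Truncated Gauss hypergeometric series; converges only for |zz| < 1."""
    s, term = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * zz
        s += term
    return s

def B_inc(y, n=200000):
    """B_y(1/K, 1-1/K) of Eq. 10, computed via the substitution t = s^K."""
    up = y**(1.0 / K)
    h = up / n
    return h * sum(K * (1.0 - ((i + 0.5) * h)**K)**(-1.0 / K) for i in range(n))

lhs = hyp2f1(1.0, 1.0 / K, 1.0 + 1.0 / K, -z**K)  # left-hand side of Eq. 11
y = z**K / (1.0 + z**K)
rhs = (1.0 / (z * K)) * B_inc(y)                  # right-hand side of Eq. 11
B1 = gamma(1.0 / K) * gamma(1.0 - 1.0 / K)        # complete Beta B_1(1/K, 1-1/K)
```

The second identity is just the reflection formula Γ(1/K) Γ(1 - 1/K) = π/sin(π/K), which is what introduces the cosecant into Eqs. 13 and 14.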

where csc ≡ 1/sin stands for the conventional cosecant function (here B_1(1/K, 1 - 1/K) = π csc(π/K), by the reflection formula for the Gamma function). And, therefore, Eq. 7 might essentially be simplified in the following way:

W = C_\infty T_{ref}\left[x_P\left(1 - \frac{x_P^{-1}}{K}\,\pi\csc\!\left(\tfrac{\pi}{K}\right) I_{\frac{x_P^K}{1+x_P^K}}\!\left(\tfrac{1}{K}, 1 - \tfrac{1}{K}\right)\right) + x_M\left(1 - \frac{x_M^{-1}}{K}\,\pi\csc\!\left(\tfrac{\pi}{K}\right) I_{\frac{x_M^K}{1+x_M^K}}\!\left(\tfrac{1}{K}, 1 - \tfrac{1}{K}\right)\right) - x_1\left(1 - \frac{x_1^{-1}}{K}\,\pi\csc\!\left(\tfrac{\pi}{K}\right) I_{\frac{x_1^K}{1+x_1^K}}\!\left(\tfrac{1}{K}, 1 - \tfrac{1}{K}\right)\right) - x_2\left(1 - \frac{x_2^{-1}}{K}\,\pi\csc\!\left(\tfrac{\pi}{K}\right) I_{\frac{x_2^K}{1+x_2^K}}\!\left(\tfrac{1}{K}, 1 - \tfrac{1}{K}\right)\right)\right]. \qquad (14)

The important mathematical point here is that the function I_Y(a,b), the regularized incomplete Beta function, is nothing more than the cumulative probability function of some random number Y obeying the Beta distribution. Hence, interestingly, all this boils down to nothing more than the Bayesian approach to statistics, as it originally ought to be: as is well known, it was Reverend Thomas Bayes himself who started employing the continuous Beta distribution function as a prior for discrete binomial distributions... Physically, this means that we ought to consider the heat capacity at constant volume to be a random variable. Would the latter conclusion be a kind of heresy, and ought we to be immediately condemned by the whole research community? Not at all, God bless, for such a seemingly unexpected standpoint could still be thoroughly plausible (see, for example, the work [15], the references therein, as well as the corresponding discussion in the work [14]). Indeed, the (statistical-mechanical) physical sense of heat capacity consists in that it shows how the 'sum' of all the possible elementary excitations in the system describes the macroscopic state of the system. Linhart's formula (please cf. Eq. 5 in [14]) is just the mathematical expression of this property; moreover, it shows how the experimentally measurable heat capacity, that is, in our notation, Y ≡ C/C_∞, might well be expressed in terms of the standard probability theory...

To sum up, Eq. 14 here expresses the work to be done via the probabilities of achieving some definite value of the heat capacity (or, generally speaking, some definite macroscopic state of the system under study) at some particular temperature. Thus, from here on, the only way to handle Eq. 14 in terms of elementary functions is to introduce some proper, pertinent approximations for the function I_Y(a,b). In this case there are just two possibilities, namely: Y around 0, as well as Y around 1. Then, we get:

I_Y(a, b) \approx \frac{Y^a}{a\,B(a, b)}\left(1 + \frac{a(1-b)Y}{a+1} + O\!\left(Y^2\right)\right); \quad Y \to 0 \;\text{ and }\; a \notin \mathbb{N};

I_Y(a, b) \approx 1 - \frac{(1-Y)^b\,Y^a}{b\,B(a, b)}\left(1 - \frac{(a+b)(Y-1)}{b+1} + O\!\left((Y-1)^2\right)\right); \quad Y \to 1 \;\text{ and }\; b \notin \mathbb{N}. \qquad (15)
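The small-Y branch of Eq. 15 can be spot-checked against a direct quadrature of the regularized incomplete Beta function (assumed non-integer parameters a, b, not from the paper; the substitution t = s^(1/a) removes the integrable singularity at t = 0):

```python
from math import gamma, isclose

a, b, Y = 0.25, 0.75, 0.01  # illustrative, assumed non-integer parameters

def I_reg(y, n=200000):
    """Regularized incomplete Beta I_y(a, b), Eq. 10 with t = s^(1/a), then Eq. 12."""
    up = y**a
    h = up / n
    B_y = (h / a) * sum((1.0 - ((i + 0.5) * h)**(1.0 / a))**(b - 1.0) for i in range(n))
    return B_y / (gamma(a) * gamma(b) / gamma(a + b))

# leading small-Y behaviour, first line of Eq. 15
B1 = gamma(a) * gamma(b) / gamma(a + b)
approx = Y**a / (a * B1) * (1.0 + a * (1.0 - b) * Y / (a + 1.0))
print(isclose(approx, I_reg(Y), rel_tol=1e-4))  # True
```

The neglected terms are O(Y^2) relative to the leading one, so the agreement tightens rapidly as Y decreases.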

And, finally, last but not least...

Conclusions

The above considerations clearly show that statistical-mechanical inferences can be handled pretty well, and in a completely straightforward way, within the Bayesian approach. This turns out to be possible because we closely follow the approach of G. A. Linhart [4-7]. The immediate profit of this route is that our considerations are absolutely not restricted to the thoroughly fuzzy notion of some "large number" of atoms/molecules, as is normally the case in the conventional treatment (see, for example, [16] and the references therein). And this profit is definitely achievable within the framework of the Bayesian approach, which enables us to get rid of the fuzziness introduced by the straightforwardly and blindly adopted atomistic hypothesis. So, here we might get an opportunity to significantly widen the actual horizons of conventional statistical thermodynamics, to include not only strictly macroscopic, but also meso- and even nanoscopic levels of study... However, there is one serious and crucially important poser that still remains unanswered: what ought to be the formal logical links between the Bayesian representation outlined above and the conventional Gibbs ensembles described by the Normal (Gaussian) distribution function, the tried and true Boltzmann exponential distribution etc., which have long been very well known to be faithful and useful mathematical instruments of theoretical physics? Well, the immediate and clear answer is: surely, we do not have to overthrow or, that is to say, completely refurbish all of conventional statistical mechanics! But we just ought to widen the horizons of the latter and carefully look after the detailed conceptual links between its different chapters... Regretfully, the short format of the present communication does not really allow us to dwell on this important topic, but we would at least like to point out here that:

1. The Beta density is anyway of extreme importance in the formal mathematical derivation of the conventional Canonical Distribution [17].
2. The Beta probability distribution is a tried and true method of successful statistical evaluation in cases where fuzzy sets are at work [18].

Acknowledgements

I would like to express my sincere gratitude to Dr. Ramandeep Singh Johal, Associate Professor (Physics), Indian Institute of Science Education & Research, Mohali, for detailed and thorough discussions on the above theme of our common research interest.

References

1. E. B. Starikov, What Nicolas Léonard Sadi Carnot wanted to tell us in fact. Pensée Journal, v. 76, pp. 171-214, 2014.
2. E. B. Starikov, Valid Entropy-Enthalpy Compensation: Its True Physical Meaning. J. Appl. Solut. Chem. Mod., v. 2, pp. 240-245, 2013.
3. J. D. van der Waals, Über die Erklärung der Naturgesetze auf statistisch-mechanischer Grundlage. Phys. Z., v. XII, pp. 547-549, 1911.
4. G. A. Linhart, NOTE. The Relation Between Entropy and Probability. The Integration of the Entropy Equation. J. Am. Chem. Soc., v. 44, pp. 140-142, 1922.
5. G. A. Linhart, Correlation of Entropy and Probability. J. Am. Chem. Soc., v. 44, pp. 1881-1886, 1922.
6. G. A. Linhart, Additions and Corrections - Correlation of Entropy and Probability. J. Am. Chem. Soc., v. 44, pp. 2968-2969, 1922.
7. G. A. Linhart, Correlation of Heat Capacity, Absolute Temperature and Entropy. J. Chem. Phys., v. 1, pp. 795-797, 1933.
8. P. Aneja, R. S. Johal, Prior information and inference of optimality in thermodynamic processes. J. Phys. A, Math. Theor., v. 46, p. 365002, 2013.
9. R. S. Johal, Models of Finite Bath and Generalized Thermodynamics. In: Studies in Fuzziness and Soft Computing. A. Sengupta, Ed., v. 206, pp. 207-217, 2006.
10. P. Aneja, R. S. Johal, Prior Probabilities and Thermal Characteristics of Heat Engines. Cent. Eur. J. Phys., v. 10, pp. 708-714, 2012.
11. R. S. Johal, Efficiency at Optimal Work from Finite Reservoirs: A Probabilistic Perspective. J. Non-Eq. Thermodyn., v. 40, pp. 1-12, 2015.
12. P. Aneja, R. S. Johal, Form of Prior for Constrained Thermodynamic Processes. Eur. Phys. J. B, v. 88, pp. 129-138, 2015.

13. R. S. Johal, Renuka Rai, G. Mahler, Reversible Heat Engines: Bounds on Estimated Efficiency from Inference. Found. Phys., v. 45, pp. 158-170, 2015.
14. E. B. Starikov, Many Faces of Entropy or Bayesian Statistical Mechanics. ChemPhysChem, v. 11, pp. 3387-3394, 2010.
15. R. L. Scott, The Heat Capacity of Ideal Gases. J. Chem. Educ., v. 83, pp. 1071-1081, 2006.
16. A. R. Khokhlov, A. Yu. Grosberg, V. S. Pande, Statistical Physics of Macromolecules (Polymers and Complex Materials). AIP Press, New York, USA, 1997.
17. B. H. Lavenda, "Statistical Physics. A Probabilistic Approach", Wiley-Interscience Publication, New York, Chichester, Brisbane, Toronto, Singapore, 1991, pp. 224-230.
18. M. Smithson, J. Verkuilen, "Fuzzy Set Theory. Applications in the Social Sciences", Sage Publishing, Thousand Oaks, CA, USA, 2006, pp. 41-43.