Probability
- [1] arXiv:2405.14924 [pdf, ps, html, other]
Title: Upper tail large deviations of the directed landscape
Comments: 62 pages, 2 figures
Subjects: Probability (math.PR); Mathematical Physics (math-ph)
Starting from one-point tail bounds, we establish an upper tail large deviation principle for the directed landscape at the metric level. Metrics of finite rate are in one-to-one correspondence with measures supported on a set of countably many paths, and the rate function is given by a certain Kruzhkov entropy of these measures. As an application of our main result, we prove a large deviation principle for the directed geodesic.
- [2] arXiv:2405.15128 [pdf, ps, other]
Title: Fluctuations around the mean-field limit for attractive Riesz potentials in the moderate regime
Subjects: Probability (math.PR); Analysis of PDEs (math.AP)
A central limit theorem is shown for moderately interacting particles in the whole space. The interaction potential approximates singular attractive or repulsive potentials of sub-Coulomb type. It is proved that the fluctuations become asymptotically Gaussian in the limit of infinitely many particles. The methodology is inspired by the classical work of Oelschläger on fluctuations for the porous-medium equation. The novelty in this work is that we can allow for attractive potentials in the moderate regime and still obtain asymptotic Gaussian fluctuations. The key element of the proof is the mean-square convergence in expectation for smoothed empirical measures associated to moderately interacting $N$-particle systems with rate $N^{-1/2-\varepsilon}$ for some $\varepsilon>0$. To allow for attractive potentials, the proof uses a quantitative mean-field convergence in probability with any algebraic rate and a law-of-large-numbers estimate, as well as a systematic separation of the terms to be estimated into a mean-field part and a law-of-large-numbers part.
- [3] arXiv:2405.15142 [pdf, ps, other]
Title: Characterization of Gradient Condition for Asymmetric Partial Exclusion Processes and Their Scaling Limits
Comments: 25 pages
Subjects: Probability (math.PR)
We consider partial exclusion processes (PEPs) on the one-dimensional square lattice, that is, systems of interacting particles in which each particle performs a random walk with a jump rate satisfying an exclusion rule that allows up to a certain number of particles on each site. In particular, we assume that the jump rate is given as a product of two functions depending on the occupation variables at the original and target sites. Our interest is in the limiting behavior of the fluctuation fields associated with PEPs started from an invariant measure, and especially in deriving macroscopic PDEs by means of (fluctuating) hydrodynamics. Two conditions are technically crucial: the so-called gradient condition, meaning that the symmetric part of the instantaneous current is written in gradient form, and the condition that the invariant measures are product measures. Our first main result clarifies the relationship between these two conditions: we show that the gradient condition and the existence of product invariant measures are equivalent, provided the jump rate is given in the above simple form, as is imposed in most of the literature, and the dynamics is asymmetric. Moreover, when the width of the lattice tends to zero and the process is accelerated in the diffusive time scaling, we show that the family of fluctuation fields converges to the stationary energy solution of the stochastic Burgers equation (SBE), in the setting where the jump rate to the right neighboring site is slightly larger than the one to the left, with a discrepancy given by the square root of the width of the underlying lattice. This fills a gap at the level of universality of the SBE, since the convergence has been proved for the exclusion process (a special case of PEPs) and for the zero-range process.
New submissions for Monday, 27 May 2024 (showing 3 of 3 entries)
- [4] arXiv:2405.14913 (cross-list from stat.ME) [pdf, ps, other]
Title: High Rank Path Development: an approach of learning the filtration of stochastic processes
Subjects: Methodology (stat.ME); Machine Learning (cs.LG); Probability (math.PR); Machine Learning (stat.ML)
Since weak convergence of stochastic processes does not account for the growth of information over time, which is represented by the underlying filtration, a model that is only slightly erroneous in the weak topology may cause a huge loss in multi-period decision-making problems. To address such discontinuities, Aldous introduced extended weak convergence, which can fully characterise all essential properties of stochastic processes, including the filtration; however, it was considered hard to find efficient numerical implementations. In this paper, we introduce a novel metric called High Rank PCF Distance (HRPCFD) for extended weak convergence based on the high rank path development method from rough path theory, which also defines the characteristic function for measure-valued processes. We then show that HRPCFD admits many favourable analytic properties, which allow us to design an efficient algorithm for training HRPCFD from data and to construct the HRPCF-GAN by using HRPCFD as the discriminator for conditional time series generation. Our numerical experiments on both hypothesis testing and generative modelling show that our approach outperforms several state-of-the-art methods, highlighting its potential in broad applications of synthetic time series generation and in addressing classic financial and economic challenges, such as optimal stopping or utility maximisation problems.
- [5] arXiv:2405.15074 (cross-list from stat.ML) [pdf, ps, other]
Title: 4+3 Phases of Compute-Optimal Neural Scaling Laws
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC); Probability (math.PR); Statistics Theory (math.ST)
We consider the three parameter solvable neural scaling model introduced by Maloney, Roberts, and Sully. The model has three parameters: data complexity, target complexity, and model-parameter-count. We use this neural scaling model to derive new predictions about the compute-limited, infinite-data scaling law regime. To train the neural scaling model, we run one-pass stochastic gradient descent on a mean-squared loss. We derive a representation of the loss curves which holds over all iteration counts and improves in accuracy as the model parameter count grows. We then analyze the compute-optimal model-parameter-count, and identify 4 phases (+3 subphases) in the data-complexity/target-complexity phase-plane. The phase boundaries are determined by the relative importance of model capacity, optimizer noise, and embedding of the features. We furthermore derive, with mathematical proof and extensive numerical evidence, the scaling-law exponents in all of these phases, in particular computing the optimal model-parameter-count as a function of floating point operation budget.
- [6] arXiv:2405.15353 (cross-list from math.CO) [pdf, ps, html, other]
Title: Sharing tea on a graph
Authors: J. Pascal Gollin, Kevin Hendrey, Hao Huang, Tony Huynh, Bojan Mohar, Sang-il Oum, Ningyuan Yang, Wei-Hsuan Yu, Xuding Zhu
Comments: 19 pages, 2 figures
Subjects: Combinatorics (math.CO); Probability (math.PR)
Motivated by the analysis of consensus formation in the Deffuant model for social interaction, we consider the following procedure on a graph $G$. Initially, there is one unit of tea at a fixed vertex $r \in V(G)$, and all other vertices have no tea. At any time in the procedure, we can choose a connected subset of vertices $T$ and equalize the amount of tea among vertices in $T$. We prove that if $x \in V(G)$ is at distance $d$ from $r$, then $x$ will have at most $\frac{1}{d+1}$ units of tea during any step of the procedure. This bound is best possible and answers a question of Gantert.
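The procedure and the stated bound can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the paper: it assumes a path graph with the root at vertex 0 (so vertex $d$ is at distance $d$ from $r$, and connected subsets are intervals), chooses the intervals at random, and uses exact rational arithmetic. The function name `share_tea_on_path` and its parameters are hypothetical.

```python
from fractions import Fraction
import random


def share_tea_on_path(n, steps, seed=0):
    """Run random equalization steps on a path with vertices 0..n-1.

    Assumption for illustration: the root r is vertex 0, so vertex d is at
    distance d from r, and connected subsets of a path are intervals.
    Returns the largest amount of tea each vertex ever held.
    """
    rng = random.Random(seed)
    tea = [Fraction(0)] * n
    tea[0] = Fraction(1)           # one unit of tea at the root
    peak = list(tea)               # running maximum per vertex
    for _ in range(steps):
        i = rng.randrange(n)       # left end of a random interval
        j = rng.randrange(i, n)    # right end, with i <= j
        average = sum(tea[i:j + 1]) / (j - i + 1)
        for k in range(i, j + 1):  # equalize tea inside the interval
            tea[k] = average
        peak = [max(p, t) for p, t in zip(peak, tea)]
    return peak


if __name__ == "__main__":
    peak = share_tea_on_path(n=6, steps=500)
    for d, amount in enumerate(peak):
        print(d, amount, "bound:", Fraction(1, d + 1))
```

Random intervals only sample a small part of all possible strategies, so this run illustrates the bound $\frac{1}{d+1}$ rather than testing whether it is best possible.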
We also consider arbitrary initial weight distributions. For every finite graph $G$ and $w \in \mathbb{R}_{\geq 0}^{V(G)}$, we prove that the set of weight distributions reachable from $w$ is a compact subset of $\mathbb{R}_{\geq 0}^{V(G)}$.
- [7] arXiv:2405.15379 (cross-list from stat.ML) [pdf, ps, html, other]
Title: Log-Concave Sampling on Compact Supports: A Versatile Proximal Framework
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Probability (math.PR); Statistics Theory (math.ST)
In this paper, we explore sampling from strongly log-concave distributions defined on convex and compact supports. We propose a general proximal framework that involves projecting onto the constrained set, which is highly flexible and supports various projection options. Specifically, we consider the cases of Euclidean and Gauge projections, with the latter having the advantage of being performed efficiently using a membership oracle. This framework can be seamlessly integrated with multiple sampling methods. Our analysis focuses on Langevin-type sampling algorithms within the context of constrained sampling. We provide nonasymptotic upper bounds on the W1 and W2 errors, offering a detailed comparison of the performance of these methods in constrained sampling.
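To make the contrast between the two projection options concrete, here is a minimal sketch under stated assumptions; it is not the paper's algorithm. It takes the gauge projection to mean radial scaling of a point toward the origin until it enters the set (assuming the origin lies in the set), located by bisection using only a membership oracle, and compares it with the Euclidean projection onto a box. All function names are hypothetical.

```python
def in_box(x, radius=1.0):
    """Membership oracle for the box [-radius, radius]^d (illustrative set)."""
    return all(abs(xi) <= radius for xi in x)


def euclidean_projection_box(x, radius=1.0):
    """Closest point of the box in Euclidean distance: clip each coordinate."""
    return [min(max(xi, -radius), radius) for xi in x]


def gauge_projection(x, is_member, iterations=60):
    """Scale x radially toward the origin until it lies in the set.

    Assumptions: the set is convex and contains the origin, and is_member
    is the only access to the set. Bisection finds the largest s in [0, 1]
    with s * x in the set.
    """
    if is_member(x):
        return list(x)
    lo, hi = 0.0, 1.0  # lo * x is in the set, hi * x is not
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if is_member([mid * xi for xi in x]):
            lo = mid
        else:
            hi = mid
    return [lo * xi for xi in x]


if __name__ == "__main__":
    point = [2.0, 1.0]
    print(euclidean_projection_box(point))
    print(gauge_projection(point, in_box))
```

For the point (2, 1) and the unit box the two projections differ: the Euclidean projection clips each coordinate, while the radial version shrinks the whole vector and needs nothing beyond membership queries.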
- [8] arXiv:2405.15539 (cross-list from stat.ML) [pdf, ps, other]
Title: A generalized neural tangent kernel for surrogate gradient learning
Comments: 52 pages, 3 figures + 2 supplementary figures
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG); Probability (math.PR); Neurons and Cognition (q-bio.NC)
State-of-the-art neural network training methods depend on the gradient of the network function. Therefore, they cannot be applied to networks whose activation functions do not have useful derivatives, such as binary and discrete-time spiking neural networks. To overcome this problem, the activation function's derivative is commonly substituted with a surrogate derivative, giving rise to surrogate gradient learning (SGL). This method works well in practice but lacks theoretical foundation. The neural tangent kernel (NTK) has proven successful in the analysis of gradient descent. Here, we provide a generalization of the NTK, which we call the surrogate gradient NTK, that enables the analysis of SGL. First, we study a naive extension of the NTK to activation functions with jumps, demonstrating that gradient descent for such activation functions is also ill-posed in the infinite-width limit. To address this problem, we generalize the NTK to gradient descent with surrogate derivatives, i.e., SGL. We carefully define this generalization and expand the existing key theorems on the NTK with mathematical rigor. Further, we illustrate our findings with numerical experiments. Finally, we numerically compare SGL in networks with sign activation function and finite width to kernel regression with the surrogate gradient NTK; the results confirm that the surrogate gradient NTK provides a good characterization of SGL.
- [9] arXiv:2405.15643 (cross-list from stat.ML) [pdf, ps, html, other]
Title: Reducing the cost of posterior sampling in linear inverse problems via task-dependent score learning
Comments: 23 pages, 2 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Analysis of PDEs (math.AP); Numerical Analysis (math.NA); Probability (math.PR)
Score-based diffusion models (SDMs) offer a flexible approach to sample from the posterior distribution in a variety of Bayesian inverse problems. In the literature, the prior score is utilized to sample from the posterior by different methods that require multiple evaluations of the forward mapping in order to generate a single posterior sample. These methods are often designed with the objective of enabling the direct use of the unconditional prior score and, therefore, task-independent training. In this paper, we focus on linear inverse problems, when evaluation of the forward mapping is computationally expensive and frequent posterior sampling is required for new measurement data, such as in medical imaging. We demonstrate that the evaluation of the forward mapping can be entirely bypassed during posterior sample generation. Instead, without introducing any error, the computational effort can be shifted to an offline task of training the score of a specific diffusion-like random process. In particular, the training is task-dependent requiring information about the forward mapping but not about the measurement data. It is shown that the conditional score corresponding to the posterior can be obtained from the auxiliary score by suitable affine transformations. We prove that this observation generalizes to the framework of infinite-dimensional diffusion models introduced recently and provide numerical analysis of the method. Moreover, we validate our findings with numerical experiments.
- [10] arXiv:2405.15666 (cross-list from math.AP) [pdf, ps, other]
Title: Well-posedness and invariant measures for the stochastically perturbed Landau-Lifshitz-Baryakhtar equation
Comments: Comments welcome!
Subjects: Analysis of PDEs (math.AP); Probability (math.PR)
In this paper, we study the initial-boundary value problem for the stochastic Landau-Lifshitz-Baryakhtar (SLLBar) equation with Stratonovich-type noise in bounded domains $\mathcal{O}\subset\mathbb{R}^d$, $d=1,2,3$. Our main results can be briefly described as follows: (1) for $d=1,2,3$ and any $\mathbf{u}_0\in\mathbb{H}^1$, the SLLBar equation admits a unique local-in-time pathwise weak solution; (2) for $d=1$ and small-data $\mathbf{u}_0\in\mathbb{H}^1$, the SLLBar equation has a unique global-in-time pathwise weak solution and at least one invariant measure; (3) for $d=1,2$ and small-data $\mathbf{u}_0\in\mathbb{L}^2$, the SLLBar equation possesses a unique global-in-time pathwise very weak solution and at least one invariant measure, while for $d=3$ only the existence of martingale solution is obtained due to the loss of pathwise uniqueness.
Cross submissions for Monday, 27 May 2024 (showing 7 of 7 entries)
- [11] arXiv:2310.13400 (replaced) [pdf, ps, html, other]
Title: Malliavin differentiability of McKean-Vlasov SDEs with locally Lipschitz coefficients
Comments: 29 pages, no Figures; To appear in Special issue of the Portuguese Mathematical Society
Subjects: Probability (math.PR)
In this small note, we establish Malliavin differentiability of McKean-Vlasov Stochastic Differential Equations (MV-SDEs) with drifts satisfying both a locally Lipschitz and a one-sided Lipschitz assumption, and where the diffusion coefficient is assumed to be uniformly Lipschitz in its variables. As a secondary contribution, we investigate how Malliavin differentiability transfers across to the interacting particle system associated with the McKean-Vlasov equation to its limiting equation. This final result requires both spatial and measure differentiability of the coefficients and doubles as a standalone result of independent interest since the study of Malliavin derivatives of weakly interacting particle systems seems novel to the literature.
- [12] arXiv:2312.07096 (replaced) [pdf, ps, html, other]
Title: Probabilistic representation of the gradient of a killed diffusion semigroup: The half-space case
Subjects: Probability (math.PR); Dynamical Systems (math.DS)
We introduce a probabilistic representation of the derivative of the semigroup associated to a multidimensional killed diffusion process defined on the half-space. The semigroup derivative is expressed as a functional of a process that is normally reflected when it hits the hyperplane. The representation of the derivative also involves a matrix-valued process which replaces the Jacobian of the underlying process that appears in the traditional pathwise derivative of a classical diffusion. The components of this matrix-valued process become zero except for those on the first row every time the reflected process touches the boundary. The results in this paper extend those in recent work of the authors, where the one-dimensional case was studied.
- [13] arXiv:2403.19380 (replaced) [pdf, ps, html, other]
Title: The image of random analytic functions: coverage of the complex plane via branching processes
Comments: Updated to remove unused lemma and changed formatting
Subjects: Probability (math.PR); Complex Variables (math.CV)
We consider the range of random analytic functions with finite radius of convergence. We show that any unbounded random Taylor series with rotationally invariant coefficients has dense image in the plane. We moreover show that if in addition the coefficients are complex Gaussian with sufficiently regular variances, then the image is the whole complex plane. We do this by exploiting an approximate connection between the coverage problem and spatial branching processes. This answers a long-standing open question of J.-P. Kahane, with sufficient regularity.
- [14] arXiv:2405.04510 (replaced) [pdf, ps, html, other]
Title: Macroscopic flow out of a segment for Activated Random Walks in dimension 1
Comments: v2: A few typos corrected, figures added and minor changes to improve presentation. 17 pages
Subjects: Probability (math.PR)
Activated Random Walk is a system of interacting particles which presents a phase transition and a conjectured phenomenon of self-organized criticality. In this note, we prove that, in dimension 1, in the supercritical case, when a segment is stabilized with particles being killed when they jump out of the segment, a positive fraction of the particles leaves the segment with positive probability.
This was already known to be a sufficient condition for being in the active phase of the model, and the result of this paper shows that this condition is also necessary, except maybe precisely at the critical point. This result can also be seen as a partial answer to some of the many conjectures which connect the different points of view on the phase transition of the model.
- [15] arXiv:1904.01048 (replaced) [pdf, ps, html, other]
Title: Non-compact quantum spin chains as integrable stochastic particle processes
Comments: 35 pages, 2 figures, v2: typos fixed and references added, v3: typo fixed, v4: minor correction
Subjects: Mathematical Physics (math-ph); Statistical Mechanics (cond-mat.stat-mech); High Energy Physics - Theory (hep-th); Probability (math.PR)
In this paper we discuss a family of models of particle and energy diffusion on a one-dimensional lattice, related to those studied previously in [Sasamoto-Wadati], [Barraquand-Corwin] and [Povolotsky] in the context of KPZ universality class. We show that they may be mapped onto an integrable $\mathfrak{sl}(2)$ Heisenberg spin chain whose Hamiltonian density in the bulk has been already studied in the AdS/CFT and the integrable system literature. Using the quantum inverse scattering method, we study various new aspects, in particular we identify boundary terms, modeling reservoirs in non-equilibrium statistical mechanics models, for which the spin chain (and thus also the stochastic process) continues to be integrable. We also show how the construction of a "dual model" of probability theory is possible and useful. The fluctuating hydrodynamics of our stochastic model corresponds to the semiclassical evolution of a string that derives from correlation functions of local gauge invariant operators of $\mathcal{N}=4$ super Yang-Mills theory (SYM), in imaginary-time. As any stochastic system, it has a supersymmetric completion that encodes for the thermal equilibrium theorems: we show that in this case it is equivalent to the $\mathfrak{sl}(2|1)$ superstring that has been derived directly from $\mathcal{N}=4$ SYM.
- [16] arXiv:2107.01720 (replaced) [pdf, ps, html, other]
Title: Exact solution of an integrable non-equilibrium particle system
Comments: 45 pages, 2 figures, v2: minor improvements, v3: typo fixed
Subjects: Mathematical Physics (math-ph); Statistical Mechanics (cond-mat.stat-mech); High Energy Physics - Theory (hep-th); Probability (math.PR); Exactly Solvable and Integrable Systems (nlin.SI)
We consider the integrable family of symmetric boundary-driven interacting particle systems that arise from the non-compact XXX Heisenberg model in one dimension with open boundaries. In contrast to the well-known symmetric exclusion process, the number of particles at each site is unbounded. We show that a finite chain of $N$ sites connected at its ends to two reservoirs can be solved exactly, i.e. the factorial moments of the non-equilibrium steady-state can be written in closed form for each $N$. The solution relies on probabilistic arguments and techniques inspired by integrable systems. It is obtained in two steps: i) the introduction of a dual absorbing process reducing the problem to a finite number of particles; ii) the solution of the dual dynamics exploiting a symmetry obtained from the Quantum Inverse Scattering Method. Long-range correlations are computed in the finite-volume system. The exact solution allows to prove by a direct computation that, in the thermodynamic limit, the system approaches local equilibrium. A by-product of the solution is the algebraic construction of a direct mapping between the non-equilibrium steady state and the equilibrium reversible measure.
- [17] arXiv:2209.07554 (replaced) [pdf, ps, other]
Title: Detecting Planted Partition in Sparse Multi-Layer Networks
Comments: Updated simulations and clarified certain results
Subjects: Statistics Theory (math.ST); Probability (math.PR)
Multilayer networks are used to represent the interdependence between the relational data of individuals interacting with each other via different types of relationships. To study the information-theoretic phase transitions in detecting the presence of planted partition among the nodes of a multi-layer network with additional nodewise covariate information and diverging average degree, Ma and Nandy (2023) introduced Multi-Layer Contextual Stochastic Block Model. In this paper, we consider the problem of detecting planted partitions in the Multi-Layer Contextual Stochastic Block Model, when the average node degrees for each network is greater than $1$. We establish the sharp phase transition threshold for detecting the planted bi-partition. Above the phase-transition threshold testing the presence of a bi-partition is possible, whereas below the threshold no procedure to identify the planted bi-partition can perform better than random guessing. We further establish that the derived detection threshold coincides with the threshold for weak recovery of the partition and provide a quasi-polynomial time algorithm to estimate it.
- [18] arXiv:2305.16539 (replaced) [pdf, ps, other]
Title: On the existence of powerful p-values and e-values for composite hypotheses
Comments: 47 pages, 6 figures
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Probability (math.PR); Methodology (stat.ME)
Given a composite null $ \mathcal P$ and composite alternative $ \mathcal Q$, when and how can we construct a p-value whose distribution is exactly uniform under the null, and stochastically smaller than uniform under the alternative? Similarly, when and how can we construct an e-value whose expectation exactly equals one under the null, but its expected logarithm under the alternative is positive? We answer these basic questions, and other related ones, when $ \mathcal P$ and $ \mathcal Q$ are convex polytopes (in the space of probability measures). We prove that such constructions are possible if and only if $ \mathcal Q$ does not intersect the span of $ \mathcal P$. If the p-value is allowed to be stochastically larger than uniform under $P\in \mathcal P$, and the e-value can have expectation at most one under $P\in \mathcal P$, then it is achievable whenever $ \mathcal P$ and $ \mathcal Q$ are disjoint. More generally, even when $ \mathcal P$ and $ \mathcal Q$ are not polytopes, we characterize the existence of a bounded nontrivial e-variable whose expectation exactly equals one under any $P \in \mathcal P$. The proofs utilize recently developed techniques in simultaneous optimal transport. A key role is played by coarsening the filtration: sometimes, no such p-value or e-value exists in the richest data filtration, but it does exist in some reduced filtration, and our work provides the first general characterization of this phenomenon. We also provide an iterative construction that explicitly constructs such processes, and under certain conditions it finds the one that grows fastest under a specific alternative $Q$. We discuss implications for the construction of composite nonnegative (super)martingales, and end with some conjectures and open problems.
- [19] arXiv:2402.00423 (replaced) [pdf, ps, html, other]
Title: Hierarchical Integral Probability Metrics: A distance on random probability measures with low sample complexity
Subjects: Statistics Theory (math.ST); Probability (math.PR)
Random probabilities are a key component of many nonparametric methods in Statistics and Machine Learning. To quantify comparisons between different laws of random probabilities, several works are starting to use the elegant Wasserstein over Wasserstein distance. In this paper we prove that the infinite dimensionality of the space of probabilities drastically deteriorates its sample complexity, which is slower than any polynomial rate in the sample size. We propose a new distance that preserves many desirable properties of the former while achieving a parametric rate of convergence. In particular, our distance 1) metrizes weak convergence; 2) can be estimated numerically through samples with low complexity; 3) can be bounded analytically from above and below. The main ingredients are integral probability metrics, which lead to the name hierarchical IPM.
- [20] arXiv:2402.18839 (replaced) [pdf, ps, other]
Title: Extended Flow Matching: a Method of Conditional Generation with Generalized Continuity Equation
Comments: 27 pages, 10 figures
Subjects: Machine Learning (cs.LG); Analysis of PDEs (math.AP); Functional Analysis (math.FA); Optimization and Control (math.OC); Probability (math.PR)
The task of conditional generation is one of the most important applications of generative models, and numerous methods have been developed to date based on the celebrated flow-based models. However, many flow-based models in use today are not built to allow one to introduce an explicit inductive bias to how the conditional distribution to be generated changes with respect to conditions. This can result in unexpected behavior in the task of style transfer, for example. In this research, we introduce extended flow matching (EFM), a direct extension of flow matching that learns a ``matrix field'' corresponding to the continuous map from the space of conditions to the space of distributions. We show that we can introduce inductive bias to the conditional generation through the matrix field and demonstrate this fact with MMOT-EFM, a version of EFM that aims to minimize the Dirichlet energy or the sensitivity of the distribution with respect to conditions. We will present our theory along with experimental results that support the competitiveness of EFM in conditional generation.
- [21] arXiv:2403.18576 (replaced) [pdf, ps, html, other]
Title: Logarithmic correlation functions in 2D critical percolation
Comments: V2: Significantly revised version, several new results added
Subjects: Mathematical Physics (math-ph); Statistical Mechanics (cond-mat.stat-mech); High Energy Physics - Theory (hep-th); Probability (math.PR)
It is believed that the large-scale geometric properties of two-dimensional critical percolation are described by a logarithmic conformal field theory, but it has been challenging to exhibit concrete examples of logarithmic singularities and to find an explanation and a physical interpretation, in terms of lattice observables, for their appearance. We show that certain percolation correlation functions receive independent contributions from a large number of similar connectivity events happening at different scales. Combined with scale invariance, this leads to logarithmic divergences. We study several logarithmic correlation functions for critical percolation in the bulk and in the presence of a boundary, including the four-point function of the density (spin) field. Our analysis confirms previous findings, provides new explicit calculations and explains, in terms of lattice observables, the physical mechanism that leads to the logarithmic singularities we discover. Although we adopt conformal field theory (CFT) terminology to present our results, the core of our analysis relies on probabilistic arguments and recent rigorous results on the scaling limit of critical percolation and does not assume a priori the existence of a percolation CFT. As a consequence, our results provide strong support for the validity of a CFT description of critical percolation and a step in the direction of a mathematically rigorous formulation of a logarithmic CFT of two-dimensional critical percolation.
- [22] arXiv:2405.13914 (replaced) [pdf, ps, html, other]
Title: The chromatic number of very dense random graphs
Comments: 37 pages
Subjects: Combinatorics (math.CO); Probability (math.PR)
The chromatic number of a very dense random graph $G(n,p)$, with $p \ge 1 - n^{-c}$ for some constant $c > 0$, was first studied by Surya and Warnke, who conjectured that the typical deviation of $\chi(G(n,p))$ from its mean is of order $\sqrt{\mu_r}$, where $\mu_r$ is the expected number of independent sets of size $r$, and $r$ is maximal such that $\mu_r > 1$, except when $\mu_r = O(\log n)$. They moreover proved their conjecture in the case $n^{-2} \ll 1 - p = O(n^{-1})$.
In this paper, we study $\chi(G(n,p))$ in the range $n^{-1}\log n \ll 1 - p \ll n^{-2/3}$, that is, when the largest independent set of $G(n,p)$ is typically of size 3. We prove in this case that $\chi(G(n,p))$ is concentrated on some interval of length $O(\sqrt{\mu_3})$, and for sufficiently `smooth' functions $p = p(n)$, that there are infinitely many values of $n$ such that $\chi(G(n,p))$ is not concentrated on any interval of size $o(\sqrt{\mu_3})$. We also show that $\chi(G(n,p))$ satisfies a central limit theorem in the range $n^{-1} \log n \ll 1 - p \ll n^{-7/9}$.
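As a quick sanity check on the regime described above, the following sketch evaluates the standard first-moment formula $\mu_r = \binom{n}{r}(1-p)^{\binom{r}{2}}$ for the expected number of independent sets of size $r$ in $G(n,p)$ and finds the largest $r$ with $\mu_r > 1$. The helper names, the search cap `max_r`, and the sample values of $n$ and $1-p$ are illustrative assumptions, not taken from the paper.

```python
import math


def log_expected_independent_sets(n, q, r):
    """Return log(mu_r), where mu_r = C(n, r) * q**C(r, 2) and q = 1 - p.

    Computed in log space via lgamma to avoid overflow for large n.
    """
    log_binom = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    return log_binom + math.comb(r, 2) * math.log(q)


def maximal_r(n, q, max_r=50):
    """Largest r <= max_r with mu_r > 1 (max_r is an arbitrary search cap)."""
    best = 0
    for r in range(1, max_r + 1):
        if log_expected_independent_sets(n, q, r) > 0:
            best = r
    return best


if __name__ == "__main__":
    n = 10**6
    q = n ** -0.8  # sample value with 1 - p between n^{-1} log n and n^{-2/3}
    print(maximal_r(n, q))
    print(math.exp(log_expected_independent_sets(n, q, 3)))
```

With these sample values $\mu_3$ is in the hundreds while $\mu_4$ is far below 1, which matches the description of the largest independent set typically having size 3.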