Fitting a Beta-Binomial distribution in Python can be done in two main ways: using Maximum Likelihood Estimation (MLE) with scipy.optimize or using Bayesian inference with PyMC.

Since the Beta-Binomial distribution is defined by two parameters (α and β) and the number of trials (n), here is how to approach it.


1. The Maximum Likelihood Approach (Frequentist)

If you have a dataset of observed counts (e.g., successes in n trials), you can use scipy.stats to define the log-likelihood function and scipy.optimize to find the parameters that maximize it.

import numpy as np
from scipy.stats import betabinom
from scipy.optimize import minimize

1. Generate dummy data (n=10 trials, alpha=2, beta=5)

n = 10 data = betabinom.rvs(n, 2, 5, size=1000)

2. Define the negative log-likelihood function

def neg_loglik(params, data, n): alpha, beta = params if alpha <= 0 or beta <= 0: return 1e10 # Parameters must be positive return -np.sum(betabinom.logpmf(data, n, alpha, beta))

3. Minimize

initial_guess = [1, 1] result = minimize(neg_loglik, initial_guess, args=(data, n), method='Nelder-Mead')

alpha_fit, beta_fit = result.x print(f"Fitted Alpha: {alpha_fit:.4f}, Fitted Beta: {beta_fit:.4f}")

---

2. The Bayesian Approach (Recommended)

If you want to estimate the uncertainty of your parameters or have prior knowledge, PyMC is the industry standard. This is often more robust than MLE for Beta-Binomial models.

import pymc as pm
import numpy as np

Data

n = 10 data = betabinom.rvs(n, 2, 5, size=100)

with pm.Model() as model: # Priors for alpha and beta (using HalfNormal as they must be > 0) alpha = pm.HalfNormal('alpha', sigma=10) beta = pm.HalfNormal('beta', sigma=10)

# Likelihood
obs = pm.BetaBinomial('obs', alpha=alpha, beta=beta, n=n, observed=data)
# Inference
trace = pm.sample(1000, tune=1000)

Extract results

print(trace.posterior['alpha'].mean().values) print(trace.posterior['beta'].mean().values)

---
### Which one should you choose?
  1. Use MLE (scipy) if:
    • You have a large amount of data.
    • You need a quick point estimate.
    • You don't want to install heavy dependencies like PyMC.
  1. Use Bayesian (PyMC) if:
    • You have a small sample size (Beta-Binomial is notoriously difficult to fit with small data using MLE).
    • You need to know the confidence intervals of your parameters.
    • You want to incorporate prior knowledge about the distribution.

Important Note on Parameterization

Sometimes the Beta-Binomial is parameterized by the mean (μ) and concentration (κ) instead of α and β.

  • α=μκ
  • β=(1μ)κ

If your optimization is failing, it is often easier to optimize for μ (bounded between 0 and 1) and κ (positive) and then convert them back to α and β.

Sign up for free to save this answer and access it later

Sign up →