| Title: | Interactive Fixed Effects Estimator for Panel Data |
|---|---|
| Description: | Implements the interactive fixed effects ('IFE') panel estimator of Bai (2009) <doi:10.3982/ECTA6135> with analytical standard errors ('homoskedastic', 'HC1' robust, and cluster-robust by unit). Supports asymptotic bias correction for large panels (Bai 2009) and a dynamic extension for predetermined regressors (Moon and Weidner 2017 <doi:10.1017/S0266466615000328>). Includes information-criterion-based factor number selection (Bai and Ng 2002 <doi:10.1111/1468-0262.00273>). Also implements an unbalanced panel extension using the expectation-maximisation algorithm of Bai (2009) with exact inferential statistics from Su, Wang and Wang (2025) <doi:10.2139/ssrn.5177283>, including nuclear-norm regularisation initialisation, singular value thresholding for factor number selection, and analytical bias correction for both strictly and weakly exogenous regressors. All computations use base R only with no external dependencies. |
| Authors: | Binzhi Chen [aut, cre] |
| Maintainer: | Binzhi Chen <[email protected]> |
| License: | GPL-2 \| GPL-3 |
| Version: | 0.1.4 |
| Built: | 2026-06-08 06:53:24 UTC |
| Source: | https://github.com/rickchen0910/xtife |
Balanced panel of cigarette sales and prices across 46 US states for 30 years (1963–1992). Originally used in Baltagi (1995) and widely used as a benchmark dataset for panel estimators.
cigar
A data frame with 1,380 rows and 9 variables:
US state identifier (integer, 1–46)
year (integer, 1963–1992)
cigarette price index
state population
population aged 16 and over
consumer price index
per-capita disposable income
per-capita cigarette sales (packs per person per year)
minimum cigarette price in adjoining states
Baltagi, B.H. (1995) Econometric Analysis of Panel Data. Wiley. Distributed with the plm R package (Croissant and Millo 2008).
Baltagi, B.H. (1995). Econometric Analysis of Panel Data. Wiley.
Croissant, Y. and Millo, G. (2008). Panel data econometrics in R: the plm package. Journal of Statistical Software, 27(2), 1–43. doi:10.18637/jss.v027.i02
Fits the panel model
$y_{it} = \alpha_i + \xi_t + X_{it}'\beta + \lambda_i' F_t + u_{it}$
for balanced panel data with analytical standard errors.
ife(
  formula, data, index,
  r = 1L, force = "two-way", se = "standard",
  bias_corr = FALSE, method = "static", M1 = 1L,
  tol = 1e-09, max_iter = 10000L
)
formula |
R formula: |
data |
data.frame in long format (one row per unit-time observation) |
index |
character(2): |
r |
integer >= 0, number of interactive factors (default 1) |
force |
additive FE specification: |
se |
SE type: |
bias_corr |
logical; if |
method |
|
M1 |
integer; lag bandwidth for the B1 dynamic bias term
(default 1L) |
tol |
convergence tolerance (default 1e-09) |
max_iter |
maximum iterations (default 10000L) |
An S3 object of class "ife" with the following components:
coef – named p-vector of estimated coefficients
vcov – p x p variance-covariance matrix
se – named p-vector of standard errors
tstat – named p-vector of t-statistics
pval – named p-vector of two-sided p-values
ci – p x 2 matrix of 95% confidence intervals (CI.lower, CI.upper)
table – data.frame coefficient table (Estimate, Std.Error, t.value, Pr.t, CI.lower, CI.upper)
F_hat – T x r estimated factor matrix
Lambda_hat – N x r estimated loading matrix
residuals – T x N residual matrix (full model)
sigma2 – estimated error variance
df – residual degrees of freedom
n_iter – iterations to convergence
converged – logical
N, T, r, force, se_type – model dimensions and options
call – matched call
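The inferential components listed above are linked by standard formulas. The following is a minimal sketch, assuming t-based inference on the residual degrees of freedom; the text does not confirm whether the package uses t or normal critical values. The coefficient values, the `income` regressor, and `df_resid` are hypothetical numbers chosen only for illustration.

```r
# Hypothetical inputs: two coefficients, their standard errors, residual df.
coef_hat <- c(price = -0.50, income = 0.10)
se_hat   <- c(price = 0.10, income = 0.08)
df_resid <- 1000

tstat <- coef_hat / se_hat                      # t-statistics
pval  <- 2 * pt(-abs(tstat), df = df_resid)     # two-sided p-values
crit  <- qt(0.975, df = df_resid)               # 95% critical value
ci    <- cbind(CI.lower = coef_hat - crit * se_hat,
               CI.upper = coef_hat + crit * se_hat)

# Coefficient table with the column names documented for the "table" component
tab <- data.frame(Estimate = coef_hat, Std.Error = se_hat,
                  t.value = tstat, Pr.t = pval, ci)
tab
```

With a fitted object, the same quantities would be read from the coef, se, tstat, pval, and ci components rather than recomputed.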
Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica, 77(4), 1229–1279. doi:10.3982/ECTA6135
Moon, H.R. and Weidner, M. (2017). Dynamic linear panel regression models with interactive fixed effects. Econometric Theory, 33, 158–195. doi:10.1017/S0266466615000328
Bai, J. and Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221. doi:10.1111/1468-0262.00273
data(cigar, package = "xtife")
fit <- ife(sales ~ price, data = cigar, index = c("state", "year"),
           r = 2, force = "two-way", se = "standard")
print(fit)
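To give a feel for what an interactive fixed effects fit involves, the sketch below alternates between a least-squares update of the slope and a principal-components update of the factors, in the spirit of the iteration described by Bai (2009). It is a base-R illustration on simulated data with one regressor and no additive effects, not the package's implementation; `ife_sketch` and all simulated objects are hypothetical names.

```r
# Simulated balanced panel: T x N matrices, one regressor, r = 1 factor.
set.seed(1)
N <- 40; TT <- 30; beta_true <- 2
F0 <- matrix(rnorm(TT), TT, 1)                 # T x 1 factor
L0 <- matrix(rnorm(N), N, 1)                   # N x 1 loading
common <- F0 %*% t(L0)                         # T x N common component
X <- common + matrix(rnorm(TT * N), TT, N)     # regressor correlated with factors
Y <- beta_true * X + common + matrix(rnorm(TT * N, sd = 0.5), TT, N)

# Hypothetical helper: alternate between beta and (F, Lambda) until beta settles.
ife_sketch <- function(Y, X, r, tol = 1e-9, max_iter = 10000L) {
  beta <- sum(X * Y) / sum(X * X)              # pooled OLS starting value
  for (iter in seq_len(max_iter)) {
    W <- Y - beta * X                          # remove the regressor part
    ev <- eigen(tcrossprod(W), symmetric = TRUE)
    F_hat <- sqrt(nrow(W)) * ev$vectors[, seq_len(r), drop = FALSE]  # F'F/T = I_r
    Lambda_hat <- crossprod(W, F_hat) / nrow(W)                      # N x r loadings
    beta_new <- sum(X * (Y - F_hat %*% t(Lambda_hat))) / sum(X * X)  # OLS on defactored Y
    done <- abs(beta_new - beta) < tol
    beta <- beta_new
    if (done) break
  }
  list(beta = beta, F_hat = F_hat, Lambda_hat = Lambda_hat, n_iter = iter)
}

fit_sketch <- ife_sketch(Y, X, r = 1)
fit_sketch$beta     # compare with beta_true
sum(X * Y) / sum(X * X)  # pooled OLS, which ignores the factor structure
```

Because the regressor is built to be correlated with the common component, pooled OLS is biased in this design, which is the situation the interactive fixed effects estimator is meant to handle.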
Fits the IFE model for r = 0, 1, ..., r_max and evaluates
five information criteria at each value of r. Returns IC1, IC2, and IC3
from Bai and Ng (2002) Proposition 1, applied to IFE residuals per Bai
(2009) Section 9.4, plus a BIC-style penalty (IC_bic) and a
small-sample-corrected prediction criterion (PC) from Bai (2009).
The criterion-minimising r for each IC is flagged with "*" in the
printed table, and a data-driven recommendation (favouring IC_bic when
the Bai-Ng criteria decrease monotonically) is displayed.
ife_select_r(
  formula, data, index,
  r_max = NULL, force = "two-way", verbose = TRUE,
  tol = 1e-09, max_iter = 10000L
)
formula |
R formula passed to |
data |
long-format data.frame |
index |
character(2): |
r_max |
maximum r to consider (default: |
force |
additive FE type (default "two-way") |
verbose |
logical; if |
tol |
convergence tolerance (default 1e-09) |
max_iter |
maximum iterations (default 10000L) |
(invisibly) a data.frame with columns r, V_r, IC1, IC2,
IC3, IC_bic, PC, converged, and attribute "suggested" (named
integer vector giving the IC-minimising r for each criterion).
Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica, 77(4), 1229–1279. doi:10.3982/ECTA6135
Bai, J. and Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221. doi:10.1111/1468-0262.00273
data(cigar, package = "xtife")
sel <- ife_select_r(sales ~ price, data = cigar,
                    index = c("state", "year"), r_max = 4)
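The three Bai and Ng (2002) criteria can be illustrated on a pure factor model, without regressors. The sketch below is a base-R illustration under that simplification: it computes the mean squared residual V(r) from the singular values of a simulated T x N matrix and adds the IC1, IC2, and IC3 penalties. It does not reproduce the package's IC_bic or PC criteria, and all object names are hypothetical.

```r
# Simulated T x N panel with two true factors plus unit-variance noise.
set.seed(2)
N <- 50; TT <- 40; r_true <- 2
F0 <- matrix(rnorm(TT * r_true), TT, r_true)
L0 <- matrix(rnorm(N * r_true), N, r_true)
W  <- F0 %*% t(L0) + matrix(rnorm(TT * N), TT, N)

sv2   <- svd(W)$d^2          # squared singular values
r_max <- 6
NT    <- N * TT
c2    <- min(N, TT)          # C_NT^2 in Bai and Ng (2002)

ic_tab <- t(sapply(0:r_max, function(r) {
  # Mean squared residual after removing the first r principal components
  V_r <- (sum(sv2) - sum(sv2[seq_len(r)])) / NT
  c(r = r, V_r = V_r,
    IC1 = log(V_r) + r * (N + TT) / NT * log(NT / (N + TT)),
    IC2 = log(V_r) + r * (N + TT) / NT * log(c2),
    IC3 = log(V_r) + r * log(c2) / c2)
}))
ic_tab

# Criterion-minimising r for each IC
r_hat <- apply(ic_tab[, c("IC1", "IC2", "IC3")], 2,
               function(x) ic_tab[which.min(x), "r"])
r_hat
```

IC3 carries the lightest penalty of the three in panels of this size, so it can select more factors than IC1 and IC2; comparing the criteria side by side, as the printed table from ife_select_r() does, helps spot that.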
Estimates the number of interactive factors in an unbalanced panel using the singular value thresholding (SVT) rule of Su, Wang and Wang (2025, Section 3.3, eq. 3.7).
ife_select_r_unb(formula, data, index, c_f = 0.6, nu_NT = NULL, verbose = TRUE)
formula |
R formula: |
data |
Data frame in long format. |
index |
Character vector of length 2: |
c_f |
SVT threshold constant (default 0.6). |
nu_NT |
Optional scalar or vector of NNR penalty values. If
|
verbose |
Logical; print result table. Default TRUE. |
Invisibly returns a list with components r_hat, sv
(normalised singular values), threshold, c_f,
c_NT, and nu_used.
Su, L., Wang, F. and Wang, Y. (2025). Estimation and inference for unbalanced panel data models with interactive fixed effects. SSRN Working Paper 5177283.
data(cigar, package = "xtife")
set.seed(42)
cigar_unb <- cigar[sample(nrow(cigar), 1200L), ]
ife_select_r_unb(sales ~ price, data = cigar_unb, index = c("state", "year"))
Fits the pure interactive fixed effects model
$y_{it} = X_{it}'\beta + \lambda_i' F_t + u_{it}$
for unbalanced panels (units observed at different sets of time periods)
via an Alternating Maximisation (AM) outer loop that iterates between
updating $\beta$ and the factors $F$,
with the EM algorithm of Bai (2009) Appendix B used as the inner loop
to update $F$ given $\beta$.
Exact inferential statistics (standard errors and bias correction) follow
Su, Wang and Wang (2025).
ife_unbalanced(
  formula, data, index,
  r = 1L, se = "standard", init = "ols",
  bias_corr = FALSE, exog = "strict", L_T = NULL,
  c_f = 0.6, nu_NT = NULL,
  tol = 1e-09, max_iter = 10000L,
  tol_em = 1e-07, max_iter_em = 500L
)
formula |
R formula: |
data |
Data frame in long format (one row per observed unit-time pair). |
index |
Character vector of length 2: |
r |
Positive integer. Number of interactive factors (default 1).
To absorb additive fixed effects into the factor structure (the
recommended approach for unbalanced panels; see Description), set
|
se |
SE type: |
init |
Initialisation method: |
bias_corr |
Logical. Apply the SWW2025 Theorem 4.2 analytical bias
correction. Supports both strictly and weakly exogenous regressors
(controlled by |
exog |
Exogeneity assumption: |
L_T |
Bartlett kernel bandwidth for HAC standard errors ( |
c_f |
SVT threshold constant (default 0.6, SWW2025 eq. 3.7).
Used only when |
nu_NT |
NNR penalty grid. If |
tol |
Outer-loop convergence tolerance on
|
max_iter |
Maximum outer-loop iterations. Default 10000L. |
tol_em |
Inner EM convergence tolerance. Default 1e-07. |
max_iter_em |
Maximum inner EM iterations per outer step.
Default 500L. |
Additive fixed effects. The SWW2025 model does not include explicit
additive unit effects $\alpha_i$ or time effects $\xi_t$.
SWW2025 (p.13, Theorem 3.2 discussion) states that the convergence and
asymptotic theory extend "in spirit" to two-way fixed effects models, and
(p.17, eq. 4.1–4.2 discussion) that "linear/nonlinear panels with one
way/two way/interactive fixed effects are all covered by this framework."
However, SWW2025 does not formally derive the SE or bias-correction
formulas for the explicitly demeaned unbalanced case.
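The standard approach is to absorb the additive effects into the factor
structure by increasing $r$. It is supported by Bai (2009, p.1), who shows that
two-way additive effects satisfy $\alpha_i + \xi_t = \lambda_i' F_t$ for the
special choice $\lambda_i = (\alpha_i, 1)'$, $F_t = (1, \xi_t)'$. In practice: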
Unit FE only: set r = r_true + 1.
Two-way FE: set r = r_true + 2.
The SWW2025 inferential theory (SE and bias correction) then applies directly to the augmented factor model.
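The reason two extra factors suffice for two-way effects can be checked numerically. The sketch below builds a small matrix of additive effects alpha_i + xi_t and confirms that it equals the product of an N x 2 loading matrix with rows (alpha_i, 1) and a T x 2 factor matrix with rows (1, xi_t), so its rank is 2. All objects are simulated and their names are hypothetical.

```r
set.seed(3)
N <- 5; TT <- 4
alpha <- rnorm(N)                    # unit effects alpha_i
xi    <- rnorm(TT)                   # time effects xi_t

# N x T matrix with entries alpha_i + xi_t
additive <- outer(alpha, xi, "+")

Lambda <- cbind(alpha, 1)            # N x 2 loadings: lambda_i = (alpha_i, 1)'
Fmat   <- cbind(1, xi)               # T x 2 factors:  F_t = (1, xi_t)'
interactive <- Lambda %*% t(Fmat)    # N x T matrix with entries lambda_i' F_t

all.equal(additive, interactive, check.attributes = FALSE)
qr(additive)$rank                    # rank of the additive-effects matrix
```

Unit effects alone correspond to the single column pair (alpha_i) and (1), a rank-1 structure, which matches the r_true + 1 rule above.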
An S3 object of class "ife_unb" with components:
Named p-vector of estimated coefficients
(bias-corrected when bias_corr = TRUE).
Named p-vector of uncorrected coefficients (only when
bias_corr = TRUE).
p x p variance-covariance matrix.
Named p-vector of standard errors.
Named p-vector of t-statistics.
Named p-vector of two-sided p-values.
p x 2 matrix of 95 percent confidence intervals.
Data frame coefficient table.
TT x r estimated factor matrix (normalised F'F/TT = I_r).
N x r estimated loading matrix.
n_obs numeric vector of full-model residuals at observed cells.
Estimated error variance.
Residual degrees of freedom.
Number of observed unit-time cells.
Outer-loop iterations to convergence.
Logical.
Model dimensions and options.
Options used.
Bias components (only when
bias_corr = TRUE). b2 is a zero vector when
exog = "strict".
Variable names.
Unique unit and time identifiers.
Integer index vectors for residuals.
Matched call.
Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica, 77(4), 1229–1279. doi:10.3982/ECTA6135
Su, L., Wang, F. and Wang, Y. (2025). Estimation and inference for unbalanced panel data models with interactive fixed effects. SSRN Working Paper 5177283.
data(cigar, package = "xtife")
# Keep 1,200 of the 1,380 rows (drops about 13%) to create an unbalanced panel
set.seed(1)
cigar_unb <- cigar[sample(nrow(cigar), 1200L), ]
fit <- ife_unbalanced(sales ~ price, data = cigar_unb,
                      index = c("state", "year"), r = 2L)
print(fit)
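The inner EM step for unbalanced panels can be pictured as repeated low-rank imputation: fill the unobserved cells with the current rank-r fit, re-estimate the factors by a singular value decomposition, and repeat until the imputed values stop changing. The sketch below is a base-R illustration of that idea on a simulated matrix with no regressors. It is written in the spirit of the EM scheme attributed to Bai (2009) Appendix B, not copied from the package; `em_factors` and the simulated objects are hypothetical names.

```r
# Simulated T x N factor panel with roughly 15% of cells unobserved.
set.seed(4)
N <- 30; TT <- 25; r <- 1
F0 <- matrix(rnorm(TT * r), TT, r)
L0 <- matrix(rnorm(N * r), N, r)
W_full <- F0 %*% t(L0) + matrix(rnorm(TT * N, sd = 0.3), TT, N)
obs <- matrix(runif(TT * N) > 0.15, TT, N)     # TRUE where the cell is observed
W <- ifelse(obs, W_full, NA)

# Hypothetical helper: EM-style imputation with a rank-r SVD fit.
em_factors <- function(W, obs, r, tol_em = 1e-7, max_iter_em = 500L) {
  W_fill <- W
  W_fill[!obs] <- 0                            # start with zeros in missing cells
  for (iter in seq_len(max_iter_em)) {
    s <- svd(W_fill, nu = r, nv = r)
    C_hat <- s$u %*% (s$d[seq_len(r)] * t(s$v))  # rank-r fit (M-step)
    W_new <- W_fill
    W_new[!obs] <- C_hat[!obs]                 # impute missing cells (E-step)
    delta <- max(abs(W_new - W_fill))
    W_fill <- W_new
    if (delta < tol_em) break
  }
  list(common = C_hat, n_iter = iter)
}

em <- em_factors(W, obs, r = 1)
em$n_iter
# How well the imputed common component tracks the held-out true values
cor(em$common[!obs], W_full[!obs])
```

In the full estimator this step would be applied to the outcome net of the regressor part, inside the outer loop that updates the slope coefficients.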
Prints a formatted summary of an object of class "ife",
including panel dimensions, number of factors, additive fixed effect
specification, SE type, and a coefficient table with standard errors,
t-statistics, p-values, and 95% confidence intervals. If bias correction
was applied, bias terms are also reported. Information criteria are printed
when the object contains them (i.e., when called from ife_select_r()).
## S3 method for class 'ife'
print(x, digits = 4, ...)
x |
an object of class |
digits |
number of significant digits (default 4) |
... |
unused |
x invisibly.
data(cigar, package = "xtife")
fit <- ife(sales ~ price, data = cigar, index = c("state", "year"),
           r = 2, force = "two-way", se = "standard")
print(fit)