Research — The Base

Methodological Base

The three research pillars rest on a shared commitment to empirical rigour. This page describes the analytical infrastructure that makes the findings in each pillar trustworthy.

Empirical software engineering research faces a recurring practical problem: the phenomena worth studying — productivity, satisfaction, team cohesion, adoption behaviour — do not submit easily to controlled experiments. Field conditions are messy. Organisations are not laboratories. The answer is not to lower the analytical standard but to choose methods that meet the conditions of the field while still producing credible, reproducible findings.

Three methodological contributions address this problem directly.

01

PLS-SEM and Soft Theory

Partial Least Squares Structural Equation Modelling (PLS-SEM) is a technique for studying relationships between latent variables: constructs like productivity, psychological safety, and team cohesion that cannot be directly observed but can be reliably measured through multiple indicators.

When controlled experiments are not feasible, SEM provides a rigorous alternative. It models complex, multi-variable relationships simultaneously, handles measurement error explicitly, and is well-suited to the kinds of survey and longitudinal data that software engineering research routinely produces.
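
The latent-variable idea is easy to make concrete. The sketch below is illustrative rather than a real PLS-SEM implementation: it simulates two constructs observed only through noisy indicators, checks indicator reliability with Cronbach's alpha, and estimates one structural path using equal-weight composites. Actual PLS-SEM estimates the indicator weights iteratively; the constructs, indicator counts, and noise levels here are all invented for illustration.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal-consistency reliability for a block of indicators.

    items: (n_respondents, k_indicators) matrix of responses.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

def construct_scores(items: np.ndarray) -> np.ndarray:
    """Equal-weight composite of standardised indicators.

    PLS-SEM would estimate these weights iteratively; equal weights
    keep the illustration short.
    """
    z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
    return z.mean(axis=1)

rng = np.random.default_rng(0)
n = 200

# Simulated latent "psychological safety" driving latent "team cohesion",
# each observed only through three noisy survey indicators.
safety = rng.normal(size=n)
cohesion = 0.6 * safety + rng.normal(scale=0.8, size=n)
safety_items = safety[:, None] + rng.normal(scale=0.5, size=(n, 3))
cohesion_items = cohesion[:, None] + rng.normal(scale=0.5, size=(n, 3))

print(f"alpha(safety items)   = {cronbach_alpha(safety_items):.2f}")
print(f"alpha(cohesion items) = {cronbach_alpha(cohesion_items):.2f}")

# Structural path: regress the cohesion composite on the safety composite.
x = construct_scores(safety_items)
y = construct_scores(cohesion_items)
beta = np.polyfit(x, y, 1)[0]
print(f"path coefficient safety -> cohesion = {beta:.2f}")
```

The point of the toy model is the division of labour: the measurement step asks whether the indicators hang together well enough to stand in for the construct, and only then does the structural step estimate relationships between constructs.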

A paper in ACM Computing Surveys examines the use of PLS-SEM in software engineering research and provides updated methodological guidelines for both authors and reviewers. The guidelines address common misapplications and specify the conditions under which PLS-SEM produces valid inferences. Researchers in empirical SE who want to use the approach, or to review papers that use it, will find the decision criteria directly applicable.

02

Empirical Standards

Peer review in software engineering has a well-documented consistency problem. One frequently cited study found that 89% of reviewers would recommend rejecting a paper that had already been published in the same journal. That level of inconsistency is not a minor calibration issue — it means the field cannot reliably distinguish good research from weak research based on peer review alone.

Empirical Standards address this structurally. They are transparent decision trees that specify what a study of a given type must include before publication: what qualifies as essential, what is desirable, and what constitutes extraordinary rigour, all tailored to the specific methodology used. A researcher running a longitudinal study consults the longitudinal standard. A reviewer assessing a survey study consults the survey standard.

The longitudinal standard is a direct contribution to this effort. It specifies the essential, desirable, and extraordinary elements for studies that track phenomena over time — the study type used most frequently across the research pillars on this site.
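
The tier structure lends itself to a small data model. The sketch below is a loose illustration, not the standard itself: the criteria shown are paraphrased stand-ins for the real wording, and the pass/revise logic simply checks that every essential item is met.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    ESSENTIAL = "essential"
    DESIRABLE = "desirable"
    EXTRAORDINARY = "extraordinary"

@dataclass(frozen=True)
class Criterion:
    tier: Tier
    text: str

# Paraphrased, illustrative criteria only; consult the published
# longitudinal standard for the actual list.
LONGITUDINAL_STANDARD = [
    Criterion(Tier.ESSENTIAL, "justifies the observation window and sampling frequency"),
    Criterion(Tier.ESSENTIAL, "reports participant attrition and how it was handled"),
    Criterion(Tier.DESIRABLE, "triangulates repeated measures with qualitative data"),
    Criterion(Tier.EXTRAORDINARY, "publishes the full de-identified panel dataset"),
]

def review(standard: list[Criterion], met: set[str]) -> tuple[bool, list[str]]:
    """A study passes only if every essential criterion is met."""
    missing = [c.text for c in standard
               if c.tier is Tier.ESSENTIAL and c.text not in met]
    return (not missing, missing)

ok, missing = review(
    LONGITUDINAL_STANDARD,
    met={"justifies the observation window and sampling frequency"},
)
print("pass" if ok else f"revise: missing essential items -> {missing}")
```

The value of encoding the standard this way is that the review outcome stops being a matter of reviewer taste: two reviewers consulting the same standard are checking the same list.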

03

Guidelines for LLM-Assisted Research

LLMs are increasingly used inside empirical research workflows: as annotators that classify qualitative data, as judges that evaluate generated outputs, and as synthesis tools that process large corpora. Each of these roles introduces methodological risks that current reporting norms do not adequately address.

Community guidelines developed at llm-guidelines.org establish reporting requirements for SE studies that involve LLMs in the research process itself. The core requirements are: declare the model version and configuration used; report the prompts and interaction logs; validate LLM outputs against human judgment before treating them as data; use open-model baselines where available; and report limitations and mitigation strategies explicitly.
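
Of those requirements, validation against human judgment is the one most often skipped, so it is worth showing what the minimum looks like. The sketch below assumes a simple single-label annotation task and uses Cohen's kappa as the agreement statistic; the 0.7 threshold mentioned in the comment is a common rule of thumb, not a figure taken from the guidelines.

```python
from collections import Counter

def cohens_kappa(human: list[str], llm: list[str]) -> float:
    """Chance-corrected agreement between two annotators.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    n = len(human)
    observed = sum(h == m for h, m in zip(human, llm)) / n
    h_counts, m_counts = Counter(human), Counter(llm)
    labels = set(human) | set(llm)
    expected = sum((h_counts[l] / n) * (m_counts[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Human gold labels on a validation sample vs. the LLM annotator's output.
human = ["bug", "bug", "feature", "bug", "feature", "feature", "bug", "bug"]
llm   = ["bug", "bug", "feature", "feature", "feature", "feature", "bug", "bug"]

print(f"Cohen's kappa = {cohens_kappa(human, llm):.2f}")
# A common (though task-dependent) rule of thumb is to treat LLM labels
# as usable data only above substantial agreement, e.g. kappa >= 0.7,
# and to report the value either way, alongside the model version,
# configuration, and prompts used to produce the labels.
```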

These are not optional practices. A study that uses an LLM as an annotator but does not report its configuration or validate its outputs cannot be replicated and should not be trusted. The same standard of transparency that applies to any other instrument in the research process applies here.